Down to the Wire

Hacking from the Pool: A DEF CON 2021 Retrospective

Much like the rest of the world, DEF CON CTF returned this year in a hybrid online/in-person format. For those who wanted it, space was reserved on the game floor to hack amidst the other teams that came to Vegas. For the rest of us who were still a bit nervous about large crowds, the infrastructure would be hosted online and accessible from anywhere in the world. Torn between the two choices, we opted this year for a middle ground: all of us together, but in a house 300 miles away.

Introduction

Hello, Goodbye

DEF CON CTF is one of the most well known security competitions in the world. Often considered the World Finals (or “Olympics”) of Hacking, teams qualify for it either by winning other notable competitions, or by placing high enough in its dedicated qualifier round.

As a consequence of the competition’s longevity, and the onerous burden of running it, the organizing team ends up changing every couple of years. For the past four years, the CTF has been led by the Order of the Overflow, who have announced that this year’s competition has been their last. Out of consideration for their hard work, and all of the new ideas that they’ve brought to the table, after I finish my retrospective I would like to take a moment to discuss their legacy and what future organizers can learn from their tenure (as viewed by a competitor).

One Final Hill to Climb

With that said, it is helpful to understand the competition that they ran before discussing what happened during it. This game was not much different from previous years, but to save you the trouble of looking at another writeup, I’ll summarize the structure here.

DEF CON CTF takes place over three days and includes two different types of challenges. The best known of these are Attack/Defense (A/D) challenges. Competitors can earn points for these in two ways: by attacking other teams and by preventing other teams from attacking them. Attack points are earned for every team that a competitor scores against in a given 5-minute round, and defense points are earned when no team successfully attacks them in a round in which at least one successful attack was launched.

Attacks may be launched directly against a team, in which case the network data will show evidence of the attack, or in “stealth” for full anonymity, but only half the points. Additionally, these services may be modified in such a way that functionality is preserved, but attacks no longer land. This process is called patching, and is the main way that defense points are earned.
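To make the scoring rules concrete, here is a minimal sketch of how a single round of one A/D service might be scored. The point values (one point per victim, half for stealth, one defense point) and the names `Attack` and `score_round` are assumptions for illustration; only the relative structure comes from the description above, and the real scoreboard may well have weighted things differently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attack:
    attacker: str
    victim: str
    stealth: bool = False


def score_round(teams, attacks, attack_pts=1.0, defense_pts=1.0):
    """Score one round of one service (point values are assumed)."""
    attack = defaultdict(float)
    defense = defaultdict(float)

    # Count each (attacker, victim) pair once per round. If both a direct
    # and a stealth attack were recorded, keep the more valuable one.
    best = {}
    for a in attacks:
        value = attack_pts / 2 if a.stealth else attack_pts
        key = (a.attacker, a.victim)
        best[key] = max(best.get(key, 0.0), value)

    for (attacker, _victim), value in best.items():
        attack[attacker] += value

    # Defense points only exist in rounds where someone landed an attack.
    if best:
        victims = {victim for (_attacker, victim) in best}
        for team in teams:
            if team not in victims:
                defense[team] += defense_pts

    return dict(attack), dict(defense)


teams = ["A", "B", "C"]
attack_points, defense_points = score_round(
    teams, [Attack("A", "B"), Attack("A", "C", stealth=True)]
)
print(attack_points, defense_points)
```

In this toy round, team A is the only one that was not hit, so it is the only one that collects a defense point.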

As mentioned, there is another type of challenge called King of the Hill (KotH). These can be better thought of as “games within the game” and accumulate points differently.

In a King of the Hill problem, a separate scoreboard is maintained alongside the CTF’s scoreboard. Competitors can increase their rank on the scoreboard by meeting the criteria of the problem. For instance, a couple of years back there was a problem in which players had to write shellcode that ran under as many architectures as possible. The team in 1st place was the one who could run against the most arches, and the team in last was the one who could run against the fewest. This problem-specific scoreboard would then be used by the main scoreboard at the end of every round to award points to the top 5 teams based off of their position.
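As a rough sketch of that last step, the function below turns a problem-specific scoreboard into main-scoreboard points for one round. The function name and the 5/4/3/2/1 point values are hypothetical; the only thing taken from the rules is that the top 5 teams are rewarded according to their position.

```python
def koth_round_points(problem_scores, points_by_place=(5, 4, 3, 2, 1)):
    """Award main-scoreboard points to the top teams of a KotH problem.

    problem_scores maps team name -> problem-specific score (for example,
    the number of architectures a shellcode ran on). Ties are broken by
    insertion order here, which is an arbitrary choice.
    """
    ranking = sorted(problem_scores, key=problem_scores.get, reverse=True)
    # zip stops at the shorter sequence, so only the top places get points.
    return dict(zip(ranking, points_by_place))


scores = {"a": 7, "b": 9, "c": 3, "d": 5, "e": 1, "f": 0}
print(koth_round_points(scores))
```

Because points are handed out again every round, a team that reaches the top early keeps collecting for as long as nobody displaces it.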

#HackerHouse

Preparation is a standard part of the leadup to DEF CON, and normally involves updating and improving our tooling for the new year. This year, however, a different type of prep was needed as well. Since we had made the decision to not play in Vegas this year, we needed to find a rental that could fit the whole team (about 20 people) and have enough bandwidth to let us play.

After a lot of searching, we settled on a spacious rental in Scottsdale, Arizona (ironically, only 30 minutes from where much of the Order lives). In some ways, this was the best part of the CTF. We had not seen each other in person for two years, so it was a joy to get the chance to hang out with everyone again.

[Image: group photo]

Naturally, we had plenty of technical prep as well. Nothing was super exciting this year, but some of the highlights include:

  1. A collaborative Ghidra server for better communication while reversing. The system Ghidra uses is vaguely Git-esque, in that it doesn’t do any realtime synchronization, but instead lets users check out problems, push their changes, and merge other changes into theirs. Thanks to a clever patch from @nneonneo, we even had public key support for authentication, which allowed everyone on the team to easily use the system.
  2. Discord bots for organization and synchronization. We ended up developing two Discord bots that worked in tandem. The first of these served as a game state interface for Discord. It would monitor the game’s API and automatically create and update channels for every problem. The second bot maintained a list of tasks that we needed to work on, and provided an easy interface for people to assign themselves to these tasks.
  3. Better visualizations for the game state. Most of this development actually happened during the competition, but this included nicer visualizations of the KotH scoreboards, attack matrices for each problem, and the ability to look at historical data from any round.

Competition Start

Day 1

One of the difficulties in doing a writeup for the whole competition is that most of the time you are only focused on a handful of the problems and the rest are simply background noise. This year, I almost exclusively played the KotH problems, and have far less context on the A/D ones. With that said, however, I’ll do my best to give a short summary of all the problems, and try to explain how we solved them (or didn’t, as the case may be). If you read something that seems incorrect or misleading, please let me know! There’s a good chance I misunderstood something I heard from my teammates.

A False Start

Before we can get into the competition start, we need to mention an unfortunate mishap that immediately preceded it. As mentioned previously, we employ several tools designed to monitor and scrape the game state data in order to populate internal visualizations and metrics. After the network became available, one of these tools began reporting the existence of many problems: eight, if memory serves. Furthermore, they all had download links attached to them with (what we later learned to be) work-in-progress versions of the challenges.

After a brief internal discussion we decided that the only responsible option was to report this to the Order and let them decide how it should be handled. They in turn asked us to keep from looking at the problems for half an hour while they determined the best course of action.

Ultimately they decided that the cat was out of the bag, and that they needed to release this data to everyone equally. This was somewhat of an unfortunate decision for us, since as a smaller team (at least among the top 3) we don’t have the bandwidth to be looking at more than a couple of problems at a time.

With that said, however, I believe that OOO absolutely made the right decision. It is terribly unfortunate that this happened, but that does not change the fact that it did. While I trust that we would have abided by any decision that they made, I also know that any other decision would only have sown distrust (both toward us and toward other teams who may have been suspected of also acquiring the resources).

In the end, the playing field was leveled, and we were able to start the competition, albeit a bit late.

Zero is You

The first problem released was a KotH problem called “Zero is You.” As the name suggests, it is largely based off of the indie puzzle game “Baba is You.” For full disclosure, PPP has a pretty major love affair with Baba. During PlaidCTF two years ago, we spent several hours playing the game together in between problem releases and support.

Given that, let me say that “Zero is You” is undoubtedly the most fun problem I have ever played at DEF CON finals, and is easily in my top-5 CTF problems full-stop. It took the form of a puzzle game wherein shellcode could be run from within the game, as emulated by Unicorn. In this manner it was quite different from most game-based CTF problems because the problem was not embedded into the game, but was the game itself.

Honestly, I could spend the entire rest of this writeup explaining the game and our solutions. Instead, I’ll leave you with a link to our solutions for all the problems, a link to play it yourself, and a video of one of my favorite puzzles.

After about 2 hours or so, we had acquired a pretty substantial lead on this problem that we were able to maintain for the rest of the day. Furthermore, since the score was based not just off of how many levels you solved, but the efficiency of those solutions, one of our team members set up a fuzzer in the background that would transparently optimize our solutions to give us a slight edge on the competition.

ooopf

ooopf at first glance allows users to compile & execute a program in a custom language specific to the problem. Or at least, that’s what we thought during the competition. As it turns out, the “custom language” was itself just BPF.

As a consequence of us not noticing that it was BPF, most of our efforts were spent reverse engineering the interpreter. Before we had even had a chance to bug hunt, we were already being hit with exploits over the wire. Although we did not yet have the tools to understand them, several of the exploits were unprotected and we could simply rip them off the network and begin launching them at the rest of the teams. It’s a bit disappointing, especially in retrospect, but points are points 🤷.

I should also briefly mention the mishap in our patching that nearly disqualified us. I go into detail about why it happened later, but the gist is that in not understanding the problem very well, one member of our team attempted what they thought was a clever side-channel patch. Unfortunately, this member did not understand that their patch was both against the rules, and broke the service’s functionality. I want to be sure to express my gratitude toward the OOO for their patience in helping us address this issue, and to apologize for the disruption of the game.

Barb Metal

You know what just makes sense? The internet of things running on Ruby. That was the premise for Barb Metal, a Ruby byte-code interpreter whose accompanying executable seemed to emulate a small IoT device with hardware such as a speaker and a thermometer. You could interface with it via plain-text commands such as SPEAKER queue my-song or THERM read day 5.

The actual interpreter was a binary that used mrubyc to interpret the bytecode, while also adding some additional functionality via the foreign function interface (FFI). The functions added this way were exposed as globals that were available to the whole program, and emulated some of the hardware functionality that might have been present on a real device.

All told, we used 3 separate bugs: 2 for attacking and 1 for defending.

  1. In the first bug, it was possible to do an out of bounds read on the temperature sensor. Since the flag was in nearby memory, you could read the flag this way.
  2. The second bug was a buffer overflow in the name field on the song list used by the speaker. When you queued a song with a long enough name, voted for it to make it most popular, then moved it to the front of the queue, a buffer-reuse bug would give you access to the flag.
  3. Finally, we found a bug in the signature check that allowed us to recompile the code (by first manually decompiling it!) and load it in as the new runtime.

With these bugs, Barb Metal was one of our most successful problems on the first day, earning us a comfortable number of both attack and defense points.

ooows-flag-baby

Flag Baby represented the first of 5 challenges in the ooows series. These were a fairly unique brand of problem wherein users uploaded virtual machines that would be run using KVM. The framework provided a basic web interface for managing these VMs, as well as a websocket-based protocol for interfacing with the terminal and the monitor.

Additionally, each of the VMs would be started with a number of virtual devices attached. Common to all of the problems were:

  1. ooowsdisk.py: A disk driver for handling file reads/writes
  2. ooowsserial.py: A utility driver for interfacing with the serial port
  3. vga: A video driver

Each problem also had a fourth driver that would be the target of that problem’s exploitation. This required teams to write their own lightweight kernel that would talk to the driver in order to exploit the bugs found within.

In the first of these challenges, ooows-flag-baby, most of the difficulty was in this infrastructure piece. The “device” was just a shell script that let you read a file from the host machine, as long as the name was not FLAG.

Since this problem was mostly focused on getting familiar with the setup, the actual exploit was pretty simple: one of the lines in the driver contained eval '_LEN=${#'$2'}', with the parameter “$2” user-controlled. Since this is a clear shell injection, one needed only to call the relevant function with a short shell script that sent the flag to a predictable location.
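To see why that line is injectable, it helps to look at the string that eval ends up receiving. The sketch below only models the string construction: the unquoted $2 is spliced between the two quoted fragments, split on whitespace, and eval joins its arguments back together with spaces. It does not run a shell, it ignores globbing, and the payload and the /tmp/out path are hypothetical rather than what any team actually threw.

```python
def eval_argument(param: str) -> str:
    """Approximate the command string `eval` sees for a given $2.

    Models the shell line  eval '_LEN=${#'$2'}'  by concatenating the
    quoted prefix, the raw parameter, and the quoted suffix, then applying
    whitespace word splitting and re-joining with single spaces.
    """
    words = ("_LEN=${#" + param + "}").split()
    return " ".join(words)


# A well-behaved variable name yields a harmless length lookup.
print(eval_argument("name"))

# A hostile value closes the ${#...} expansion early, appends its own
# command, and comments out the leftover closing brace.
print(eval_argument("x}; cat FLAG > /tmp/out #"))
```

The second string is a normal assignment followed by an attacker-chosen command, which is all the exploit needed. Note that the file name check never comes into play, because the injected command runs before the driver gets a chance to apply it.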

As a consequence of the low-complexity exploit, this problem was retired fairly quickly, leaving ooows-p92021 as its successor.

ooows-p92021

Plan 9 from OOO

The second device driver implemented was one for an implementation of the Plan 9 filesystem (loosely modeled after 9P2000, it would seem). Once you were able to talk to the driver, you could perform standard file system operations such as reading/writing files, creating/deleting directories, and so forth. The flag itself was found in /flag; however, the driver prevented you from reading anything in /. Thus, a bug in the driver was needed to actually solve the problem.

Unfortunately, despite quickly getting our infrastructure for interfacing with the device up and running, we were unable to locate the bug needed to solve this problem. Fortunately, after seeing the network data from flag-baby, our defense team had begun adding in analysis tools for better extracting network traffic from the ooows problems. Thus, we were eventually able to deconstruct another team’s exploit.

The short version is that the walk function (which lets you traverse a filesystem tree path by path) has a use after free when instructed to walk an empty tree. Under the right circumstances, this can give you a fid whose structure is on the heap but has been freed.

Once we understood this, we were able to patch it and begin throwing the exploit against other teams.

Day 2

www

Day 2 brought with it a new King of the Hill, entitled WWW. As one of PPP’s resident web guys™, I was pretty excited for a web-based King of the Hill. Unfortunately, that was not exactly what OOO had in mind.

WWW actually stood for the Wild Wild West, and it was a totally self-contained network with one box for each team. The rules were a bit complex, but went approximately like this:

  1. Every box on the network produced 1 flag every round (tick).
  2. Any team with a copy of that flag could redeem it on that tick for a piece of graffiti.
  3. Once you had graffiti, you could spray it on teams’ “walls”: services that anyone could write to and read.
  4. At the end of the tick, you scored one point per piece of graffiti per wall.
  5. You could also “accuse” someone of spraying graffiti by supplying the graffiti and the IP address of their primary box. You could only do this once per graffiti, but if you were correct they would lose 5 points, far more than the graffiti had earned them.
  6. Pcaps for each machine were stored in a known location and were updated every few minutes.

Additionally, new machines were started on the network at random, running vulnerable services that could be exploited for control.

The intent of the problem was to attack other machines, steal their flags, and use them to spray graffiti while covering your tracks. We… did not do that.

As it turned out, one of those rules was a lie. Specifically, “Any team with a copy of that flag could redeem it on that tick for a piece of graffiti.” Before we really had a good grasp of the rules (we might have, uh, skimmed), we found that old flags could still be used to generate new graffiti. Consequently, we threw together a hacky bash script that would save all of the flags and then redeem them each round for fresh graffiti to indiscriminately spray everywhere. As a result, our score increased quadratically with each round.
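The quadratic growth falls out of simple bookkeeping: if the pile of saved flags grows by a fixed amount every round and every saved flag can be redeemed again, then the per-round score grows linearly and the running total grows quadratically. The sketch below assumes one new flag per round and that each piece of graffiti can be sprayed on a fixed number of walls; both are simplifications, and `replay_scores` is a hypothetical name.

```python
def replay_scores(rounds: int, flags_per_round: int = 1, walls: int = 1):
    """Per-round and cumulative score when every saved flag is replayed."""
    hoard = 0    # flags collected so far, all still redeemable
    total = 0
    history = []
    for _ in range(rounds):
        hoard += flags_per_round     # new flags join the saved pile
        per_round = hoard * walls    # one graffiti per flag, on each wall
        total += per_round
        history.append((per_round, total))
    return history


for round_number, (per_round, total) in enumerate(replay_scores(5), start=1):
    print(round_number, per_round, total)
```

With the defaults, the total after n rounds is n(n+1)/2, compared to n for a team that can only redeem the current round's flag.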

In tandem with this, I wrote a (slightly less hacky) script that would pull down all of our own PCAPs, parse the graffiti spraying from them, and then accuse all the teams found in it. After about an hour we had given ourselves a pretty sizable lead, at which point we noticed that we were actively being accused.

Given the major losses incurred from being accused, we temporarily halted our spraying while we figured out a solution. Fortunately, another member of our team had been mass-scanning the network and, using credentials found in a file local to our main machine, was able to log into a different machine on the network. After securing it a bit, we tried to determine whose it was. We located a public key on the machine with a name attached to it, and realized that the box must have belonged to Pasten!

Once we had this machine, we could direct our traffic through it such that all of the accusations against us failed. Ultimately we ended up acquiring a few more boxes with this insecure password, but with such a hefty lead on our opponents, we found ourselves doing very little actual work on this problem until it was retired.

Oddly enough, I did not really understand that we had misplayed the problem until after the competition when someone from Samurai mentioned that they were pwning 10 boxes a round (more than the most we ever controlled). It was at this point we realized that the use of historical flags had not been intended, and we were fortunate to have noticed that early.

ooows-ogx

The next problem in the ooows quintology was ogx, a play on Intel’s SGX extensions.

Using Intel memory protection keys, the ogx device let the guest operating system run code in a secure enclave such that no memory outside of the enclave could be accessed by the code. When sending code to the driver for execution, it would filter out any syscalls and the wrpkru instruction (which set the protection key register, thus potentially disabling the enclave).

However, in order to create the enclave in the first place, the driver used a small code segment that itself invoked wrpkru and then jumped to user code. By jumping back into this stager, we were able to invoke wrpkru and disable the enclave.
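The weakness of a byte-level filter like this can be shown with a small sketch. The two instruction encodings below are the standard x86-64 ones (wrpkru is 0F 01 EF and syscall is 0F 05), but everything else is assumed: how the real driver scanned submitted code is not described here, `passes_filter` is a hypothetical stand-in, and `STAGER_ADDR` is a made-up address for the driver's own stager. Register setup before the jump is also ignored.

```python
import struct

# Standard x86-64 encodings of the two filtered instructions.
WRPKRU = bytes.fromhex("0f01ef")
SYSCALL = bytes.fromhex("0f05")
BANNED_PATTERNS = (WRPKRU, SYSCALL)


def passes_filter(code: bytes) -> bool:
    """Reject code that contains a banned encoding anywhere (assumed check)."""
    return not any(pattern in code for pattern in BANNED_PATTERNS)


# Hypothetical address of the driver's own wrpkru stager in the process.
STAGER_ADDR = 0x0000700000001000

# xor eax, eax; xor ecx, ecx; xor edx, edx; wrpkru
direct = bytes.fromhex("31c0" "31c9" "31d2") + WRPKRU

# movabs rax, STAGER_ADDR; jmp rax
indirect = b"\x48\xb8" + struct.pack("<Q", STAGER_ADDR) + b"\xff\xe0"

print("direct payload accepted:", passes_filter(direct))
print("indirect payload accepted:", passes_filter(indirect))
```

The filter only sees the bytes that were submitted, so it has no way of noticing that the second payload reaches a wrpkru that already exists elsewhere in the process.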

As with the earlier ooows problems, though, an easy exploit does not necessarily translate into easy flags. The biggest difficulty we ran into (other than straight reversing) was interfacing with the driver over VIRTIO. As in the other problems, this overhead slowed us down considerably.

It is worth noting that this exploit was different from the one that other teams were using, and could not be patched. Thus once we began throwing it, no teams were able to earn defense points.

ooows-broadcooom

After ogx came broadcooom, the ooows problem we likely spent the most time on. This one presented a network interface that loaded a custom firmware and interpreted it. Consequently, there was a tremendous amount of reverse engineering to do, both of the firmware itself, and the driver in which the bugs were ultimately found.

Although it had been available since the first day, we did not actually get a full exploit until near the start of the third day (which resulted in at least one of our members not sleeping the entire competition). The full exploit was quite intricate, but worked as follows:

Initially, an out of bounds write in the instructions that let the program read from memory or a message passing queue allowed a small amount of arbitrary code to be staged into the running process. From there, an out of bounds read and write in the register page let us write a much larger amount of custom firmware code. That code then abused the aforementioned out of bounds access in memory to construct a ROP chain, which it then convinced one of the host threads to jump to.

However, this was made even more difficult by the degree of concurrency used by the driver. Not only did the firmware run on 4 virtual cores, but the interpreter itself had 8 separate threads. In order to compensate for that, we structured our exploit such that three of the virtual cores would get stuck in an infinite loop and not interfere with the primary exploit, which would then overwrite the stack of just one thread to have it execute the ROP chain. To complete the exploit, the ROP chain would write the flag into memory at a predictable location, where the one remaining firmware core would read it and return it to the virtual machine.

Day 3

show-your-shell

5d shellcoding

The night before the third day, we received the source code for the final KotH problem of the competition. Called “show-your-shell”, it was another of the competitive shellcoding problems that are so well suited to the KotH style of challenge. In this problem, teams could upload a shellcode in any of x86, ARM, or RISC-V as long as:

  1. When executed, it read ./secret and printed the result exactly to stdout.
  2. It contained none of the banned characters (more on this later).
  3. It was either shorter than the previous shellcode, or did not include one of the characters that the previous shellcode used.

The banned characters are relevant because they would be updated after each change in leader to include all of the characters the previous leader used that the current leader did not. This is a bit of a confusing rule, because we initially read it as requiring all subsequent shellcodes to be subsets of the previous one, but in actuality it means that once a character stops being used, it can never be used again.

This would eventually result in a growing list of banned characters such that no improvement could be made. Once that happened, the leader would earn points for 15 minutes and the game would reset.
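The acceptance rule and the banned-character update can be sketched as follows. This is a model of the rules as described above, not the organizers' code: `accepts` and `submit` are hypothetical names, "characters" are treated as byte values, and the first requirement (actually printing ./secret) is not modeled at all.

```python
def accepts(candidate: bytes, leader: bytes, banned: frozenset) -> bool:
    """Check rules 2 and 3 for a candidate shellcode against the leader."""
    if banned & set(candidate):
        return False                      # uses a banned byte
    shorter = len(candidate) < len(leader)
    drops_a_byte = bool(set(leader) - set(candidate))
    return shorter or drops_a_byte


def submit(candidate: bytes, leader: bytes, banned: frozenset):
    """Return the new (leader, banned) pair, unchanged if rejected."""
    if not accepts(candidate, leader, banned):
        return leader, banned
    # Every byte the old leader used and the new one avoids is banned
    # from here on, so the banned set can only ever grow.
    newly_banned = set(leader) - set(candidate)
    return candidate, banned | newly_banned


leader, banned = b"\x01\x02\x03", frozenset()
leader, banned = submit(b"\x01\x02", leader, banned)
print(leader, sorted(banned))
```

Because the banned set never shrinks, each accepted submission narrows what is still possible, which is why the game eventually reaches a state where no improvement can be made.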

Initially, we began writing a series of shellcodes that would be pretty short and might help us avoid certain banned bytes. At that point, however, one member of the team noticed that the runner had bound its STDOUT to the shellcode’s STDERR. This was relevant because the runner was using xinetd, which meant that STDOUT was both readable and writable. Consequently, a valid solution would be to upload a stager that read fresh shellcode from STDERR; that second-stage shellcode was never checked against the byte limit.

Furthermore, in RISC-V the setup was so perfect that the only thing needed was to call the read syscall with a file descriptor of 2 (STDERR). Since that argument register still held the result of the previous read syscall, uploading exactly two bytes left it with the correct value. Finally, since the memory was padded with 00s, and the necessary syscall instruction was 73000000, he realized that 7300 was all he needed to write a stager.
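The byte-level claim is easy to check. The RISC-V ecall instruction is the 32-bit word 0x00000073, which is stored little-endian as 73 00 00 00, so a two-byte upload followed by zero padding decodes to exactly that instruction. The helper name `pad_stager` is hypothetical, and the zero padding simply mirrors what is described above rather than any verified detail of the runner.

```python
ECALL = 0x00000073  # RISC-V `ecall` encoding


def pad_stager(payload: bytes, size: int = 4) -> bytes:
    """Mimic memory that is zero-filled after the uploaded bytes."""
    return payload.ljust(size, b"\x00")


stager = pad_stager(bytes.fromhex("7300"))
word = int.from_bytes(stager, "little")
print(stager.hex(), hex(word), word == ECALL)
```

The upload length matters twice here: two bytes are enough to form the instruction once padded, and two is also the file descriptor number of STDERR.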

Off to the side, a separate observation was made. The runner used a synchronized history file to determine the state of the game. It would open this file on connection, and then update it if you succeeded. This flagrant time-of-check-to-time-of-use (TOCTOU) bug had a pretty clear use: as long as you kept enough connections open, you could “rewind time” 30 seconds into the past.

Using this bug, we wrote a simple script that would constantly keep connections open, and then monitor the leaderboard for new submissions. If it saw one, it would rip it for itself, rewind time, then upload that new submission as its own.
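The rewind trick can be modeled in a few lines. This is a toy sketch rather than the real runner: the `History` and `Connection` classes, and the idea that a successful connection writes back its stale snapshot plus the new entry, are assumptions made to illustrate the time-of-check versus time-of-use gap.

```python
class History:
    """Stand-in for the shared history file."""

    def __init__(self):
        self.entries = []


class Connection:
    def __init__(self, history: History):
        self.history = history
        # Time of check: the game state is read once, at connect time.
        self.snapshot = list(history.entries)

    def submit(self, shellcode: bytes) -> None:
        # Time of use: the stale snapshot is written back with the new
        # entry appended, discarding anything that happened in between.
        self.history.entries = self.snapshot + [shellcode]


history = History()
parked = Connection(history)       # opened early and kept idle
rival = Connection(history)
rival.submit(b"rival-shellcode")   # rival takes the lead
parked.submit(b"stolen-shellcode") # rewinds to before the rival's entry
print(history.entries)
```

In this model the rival's submission simply vanishes from the history, which matches the "several universes" behavior: whichever parked connection writes last decides which timeline is current.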

When it came time to deploy this strategy, we found that we were not the only ones to utilize it. This led to the rather amusing situation wherein someone would upload a new shellcode and then three or four teams would engage in a cycle of theft for the thirty seconds that the connections could stay alive. It also meant that several histories could be in play at once, and the game would effectively swap between several “universes.”

In the end, our strategy was a combination of improving our tools to more effectively steal exploits off the wire, and then preparing our own exploits that were non-trivial to reflect. This worked well, and we seemed to get a decent number of points.

ooows-hyper-o

ooows-hyper-o, the final challenge in this virtualization marathon, was also the most distinct. Instead of just being another driver attached to a virtual machine running in KVM, it presented an entirely custom hypervisor, hyper-o.

This problem was also deployed in two separate manners. The first release was just a barebones version of the hypervisor running under QEMU, while the second used the same framework as the other four, but with an update to the virtual machine monitor to use the hyper-o stack instead of KVM.

Much like with OGX, the bug was relatively straightforward, but the discovery and full-chain exploit took a while to develop. The bug lay in the fact that the vmexit handler (which is used to trap privileged instructions in the guest VM) assumed that there were 512 page tables, which were then mapped into the extended page table directory. However, the underlying structure supporting this only had 128 page tables, followed by structures that stored the states of the virtual CPUs. Using this, we could achieve arbitrary memory read and write in the host.

Since this was running inside the kernel, the exploit would first scan memory for the kernel base. It would then patch the code of sys_ioctl to disable the write-protect bit of the control register cr0 and write shellcode at the return address of the active syscall. From there, the userland shellcode could complete the job of making the flag publicly accessible.

Day NaN

Cooorling

Since all of the problems were leaked at the beginning of the competition, we also know that there was an unreleased problem called cooorling. Although not much time was spent on it (due to some outstanding bugs, and the diminishing likelihood that it would be released), it looked like the most unique problem of the lot.

The general idea was that teams would engage in “competitive fuzzing.” A binary would be provided (not included in the initial leak), and players would alternate turns providing inputs to it, with the intent of crashing it. For every unique crash produced, that player would earn a “stone” on a theoretical curling field. Players could knock each other’s stones off the field by providing a shorter trace that crashed in the same location. At the end of the round, the player with the most stones would win, and the game would reset (potentially with a new binary).
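Since the problem was never released, the following is only a guess at how the stone bookkeeping might have worked, based on the rules as described. The `Field` class is hypothetical, and it assumes that a player who knocks a stone off takes over that crash location, which the description does not actually confirm.

```python
from collections import Counter


class Field:
    """Toy model of the curling field: one stone per crash location."""

    def __init__(self):
        self.stones = {}  # crash location -> (owner, trace length)

    def play(self, player: str, crash_site: int, trace_len: int) -> bool:
        """Record a crashing input; return True if the player holds the stone."""
        current = self.stones.get(crash_site)
        if current is None or trace_len < current[1]:
            # A new crash location, or a strictly shorter trace that
            # knocks the existing stone off (assumed replacement rule).
            self.stones[crash_site] = (player, trace_len)
            return True
        return False

    def winner(self):
        counts = Counter(owner for owner, _length in self.stones.values())
        return counts.most_common(1)[0][0] if counts else None


field = Field()
field.play("red", 0x401000, 40)
field.play("blue", 0x401000, 12)   # shorter trace, same crash location
field.play("red", 0x402000, 30)
field.play("red", 0x403000, 30)
print(field.winner())
```

Under these assumptions the game rewards both breadth (finding new crash locations) and minimization (shrinking traces for known ones), which is roughly the same pair of skills a real fuzzing campaign exercises.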

Although from a meta-gameplay perspective it made sense to not include it, this problem looked like it would have been a lot of fun, so it is a bit disappointing that we could not try it.

[Image: cooorling]

Conclusion

Legacy

Now that the Order has officially stepped down, DEF CON will be opening applications for a new team to take up their mantle. As part of this application, candidates will be asked to provide an outline for the game as they envision it. If history is any measure, this proposal will be a blend of the old & new. With that in mind, I would like to conclude this retrospective with a brief look at the legacy left by the Order of the Overflow, and where the game could go from here.

King of the Hill

When the Order inherited DEF CON CTF from LegitBS, the competition was overwhelmingly a test of teams’ binary exploitation skills. Although there was plenty of variety to keep the game interesting, most of the problems were pigeonholed into the “pwnable” category. To a large degree, this is an artifact of the Attack/Defense gameplay style. To compromise a binary is (in most cases) to acquire some form of unexpected access. Whether full RCE, or even just arbitrary file read, this access can easily be used within the game to acquire a flag for submission. In contrast with this, problems such as web applications tend to offer wildly different attack surfaces, such as a client browser. Consequently, binary exploitation is the easiest choice for an attack/defense competition.

The OOO offered an answer to this issue in the form of King of the Hill. KotH provides a gameplay style that is less focused on exploitation, and more focused on competition. As a result, any skill that could be quantified could then be used as a problem. This offered a refreshing amount of color to an otherwise rote style of CTF, and made the game far more enjoyable for many people.

However, KotH has its limitations as well. For one, it can be too easy at times for one team to get an early lead and run away with the game. In the case of this year’s CTF, this happened to our advantage in both Zero-is-You and WWW. Once we had established a comfortable lead, we were able to continue playing in a far more relaxed manner, comfortable in the knowledge that it would be nearly impossible for other teams to catch us. Having been in the reverse position in years past, I recognize just how frustrating this can be. Instead of it being a pure measure of skill, at times it can just be a question of “who had good luck initially.” For King of the Hill to become a fixture in the game, this is important to keep in mind.

With that said, I do hope KotH becomes a fixture. Not only does it make the game more multidimensional, it also adds a levity to it that can offset the oppressive stress inherent to an attack/defense CTF. However, there are other alternatives as well. One potential option is to redefine what it means to capture a flag such that it is more feasible to include other types of challenges. For instance, if infrastructure could be integrated with the problem state, a flag could be “captured” by writing a team secret to a well-known file or by having the victim’s browser make a request to a “flag capture” URL.

Defense

Defense remains one of the most nebulous components of an Attack/Defense CTF. If too permissive, it can shut down attacks before they even begin. If too restrictive, then defense becomes an irrelevant component of the game.

One of the choices that OOO made when establishing their rules for the game was to forgo real-time liveness checks of patches in favor of upload-time functionality checks. This has both pros and cons. The major benefit is that patches are much less susceptible to transient failures. Thus, if the game network suffers an outage for a round, teams do not lose points from it. On the other hand, it means that service availability checks are static, and patches can be overfit to the Service-Level Agreement (SLA) checker. In a perfect world, I think that a mixture of both would be ideal, but that of course adds even further work for the organizers.

Additionally, OOO introduced byte-level restrictions for patching which helped prevent mindless patches. This helped with the defense balance in many regards, but was sometimes overly restrictive. In solving OGX, we ended up using an unpatchable vulnerability. At that point defense was rendered unnecessary as no allowed patch could have prevented it. Nonetheless, this approach is one worth considering for future organizers, as it requires a deeper understanding of the problem in order to patch.

Interestingly enough, in the problem WWW, OOO provided a glimpse at an alternative approach to defense points. In that problem, if you reported attacks against your team from what you found in the network data, you could remove points from your opponent. If instead, reporting flags stolen from your team earned you defense points, then this could switch defense from being patching oriented to analysis oriented. I am unsure as to whether teams would actually enjoy this, but it could be a good way to make defense more relevant (instead of just an extension of bug hunting).

Infrastructure

The game’s infrastructure is one of its most important pieces. Having not looked too deeply into how OOO’s infrastructure worked, I am not particularly well-qualified to discuss it (although I would love to see an analysis from their team!). Yet, there are a couple of choices they made that I would be thrilled to see moving forward.

The first of these is the decision to run one instance of a service per team-team pair. In other words, the service that would be used when team 3 attacked team 15 would not be used by any other attack. This led to far greater stability, and removed the ability for teams to DoS one another.
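
As a rough illustration of the per-pair layout, the sketch below builds a table with one service instance for every ordered (attacker, defender) pair. This is not OOO's actual implementation: the function name, the instance naming scheme, and the team count used in the example are all assumptions made for illustration.

```python
from itertools import permutations


def instance_table(team_ids):
    """Map each ordered (attacker, defender) pair to its own instance name."""
    return {
        (attacker, defender): f"svc-{attacker}-vs-{defender}"
        for attacker, defender in permutations(team_ids, 2)
    }


if __name__ == "__main__":
    teams = range(1, 17)  # illustrative team count, not taken from the game
    table = instance_table(teams)
    print(table[(3, 15)])  # the instance team 3 uses to attack team 15
    print(len(table))      # n * (n - 1) instances for n teams
```

The cost is quadratic in the number of teams, but crashing or exhausting one instance only affects the single pairing it serves.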

The other decision made in recent years (admittedly a consequence of Covid, and less an intentional game decision) was to host the infrastructure in the cloud. Network problems are endemic to DEF CON, and even in a house a state away we were not exempt from periodic outages. Fortunately, because both the game and our exploits were hosted in the cloud, our losing internet or the organizers losing internet had comparatively little effect on the game. This additional stability was a major relief, and helped reduce some of our stress.

Stress

On the subject of stress, I think it is worth discussing some of the ways in which the Order has helped to alleviate it. The most important of these is good communication. This year especially, I was really pleased with the amount of communication the OOO offered to teams. There were frequent updates, notes, and comments about problems, but none of them were full hints. Instead, they served mostly to avoid misunderstandings and prevent wasting time. Combined with the transparency the Order has always offered, they served as a model of organizers who could be trusted and relied upon.

Another substantial source of stress (physical as much as mental) is the frequent lack of sleep during the competition. As a result of problems remaining open overnight and new problems being released for overnight study, many competitors get only a handful of hours a night while the competition is ongoing. I recognize that this is a highly controversial opinion, but I would love to see an attempt to make DEF CON CTF a competition that you only play during the day.

Taken as it is right now, I don’t think that the competition would support this style of gameplay effectively. Instead, intentional problem design would be needed not only to include a better range of bugs (shallow vs. deep), but also to score those bugs differently, so that teams are incentivized to find the deeper ones. For instance, even if you can only earn points from it for two hours, if a deep bug is worth three times as many points as a shallow bug then it would still be worthwhile to exploit.
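
The arithmetic behind that example can be sketched as follows. It assumes the game's 5 minute rounds and a deliberately simplified model in which an exploit earns a fixed number of points per victim per round for as long as it works. The function name and the point values are hypothetical, not the real scoring formula.

```python
ROUND_MINUTES = 5  # round length used by the game


def exploit_points(points_per_round, hours_live, victims=1):
    """Total points under the simplified per-round model described above."""
    rounds = int(hours_live * 60 // ROUND_MINUTES)
    return points_per_round * rounds * victims


if __name__ == "__main__":
    deep = exploit_points(points_per_round=3, hours_live=2)
    shallow = exploit_points(points_per_round=1, hours_live=6)
    print(deep, shallow)
```

In this model, a deep bug worth three times as much and usable for two hours matches a shallow bug that stays unpatched for six.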

Experimentation

The final legacy of OOO I want to mention is their dedication to experimentation. Admittedly, this is a tricky one to discuss because I both appreciate it and find myself wary of it. Many of the details mentioned above are the result of experimentation from the Order. However, some of their other experiments were not always as successful.

Ultimately, what DEF CON CTF “should be” is really a question only the organizers get to answer. For some, it is an opportunity to find the best hackers in the world. For others, it is a chance to push the boundaries of the community and stretch it to its limits. For myself, DEF CON CTF is something more of a climax: the event toward which the rest of the season points. The degree to which experimentation is relevant in this competition then depends largely on how you answer that question. I prefer it in moderation; a little experimentation keeps the game fresh, but I generally prefer stability to ambition in my problem design. Said differently, my hope for this CTF is that it reflects the best things happening in the community at the moment more than it shapes the community.

Final Thoughts

The Order of the Overflow has put an unimaginable amount of work into running DEF CON CTF over the past four years. The whole community owes them our appreciation for the sacrifices that they have made, and dedication that they have shown. They have also left an indelible mark on the competition, and one that will hopefully lay a strong foundation for future iterations.

As the CTF community continues to grow and mature, I suspect that DEF CON CTF will only grow further as a focal point for the players. As such, I hope that we as a community can encourage and support the next organizers who take on this mantle.

I would like to offer a final congratulations to the Order of the Overflow on a successful fourth DEF CON, and to celebrate Katzebin, Tea Deliverers, and all of the other teams who played incredibly well again this year. Great job everyone, and I am excited to see what next year brings!