PlaidCTF 2021: Plaid Party Planning IV

May 4th, 2021

This year’s edition of the Plaid Party Planning saga explores a type confusion in OCaml. The bug results from two OCaml performance optimizations interacting poorly.

Optimization #1: Float unboxing. When it can, the OCaml compiler will convert floats from their default representation as heap-allocated values into 64-bit values in xmm registers or on the stack. This optimization is fairly standard and saves a lot of indirections when working with floats.

Optimiziation #2: Zero-alloc option. The Core_kernel.Uopt module provides a zero-alloc version of 'a option that claims that 'a Uopt.t = 'a. This lie enables the compiler to elide expensive caml_modify operations when using Uopt on immediate (as opposed to heap-allocated) values. Uopt uses a string constant to represent the value Uopt.none. Because string constants are statically allocated, they never move in memory and never need to be garbage collected, so there’s no need to caml_modify them.

If you combine these two optimizations by using a float Uopt.t, you can probably expect some fireworks. The compiler sees the value Uopt.none : float Uopt.t as Uopt.none : float because of the lying type equality and decides to apply float unboxing to it. Unfortunately, Uopt.none is actually a string. Instead of unboxing a boxed float, we’re reinterpreting the first 8 characters of Uopt.none as a 64-bit float. This unboxed “float” is no longer physically equal to Uopt.none, so any operations checking for none will fail.

The challenge

Let’s take a look at how this all comes together in Plaid Party Planning IV (PPPIV). Your task in PPPIV is to divide the Plaid party participants up into several breakout rooms and provide them entertainment for the remote afterparty. Each participant has a list of game preferences and each game had a list of attributes. Based on these preferences and attributes, the PPPIV web planner evaluates the assignment to give you a score. Naturally, you’d like to make the score as high as possible.

If you manage to make the score higher than 1e20 the server will send you the flag.

the PPPIV interface

Exploiting

A back-of-the-envelope calculation indicates that the evaluation function can only produce scores of at most 1e7, far short of the needed bound.

Instead, we can look to some float Uopt.t refs hanging around.

ocaml

content_copy

let evaluate_room ~num_people (room : Room.Full.t) =
  match Float.O.(num_people <= 0.) with
  | true -> 0.
  | false ->
    let result = ref Uopt.none in
    (match Map.data room.games with
    | [] -> (* talking is worth ... less *) result := Uopt.some 1.
    | [ g1 ] -> update_if_useful result (evaluate_game ~num_people g1 room.people)
...

Here, result is a float Uopt.t ref. The documentation for Uopt notes that

It should also not be used in a record that contains only floats and float Uopt.ts, because the compiler would treat that as a float-only record and would unbox the record fields.

This description is similar but isn’t quite what’s happening here. In this case, we hit the optimization where refs are stack allocated. The compiler then sees a stack-allocated float and unboxes it. Even though a float ref is a record with a single float field, without the stack allocation, the polymorphism in the 'a ref would prevent the float from being unboxed.

To exploit this bug, we need to somehow keep that unboxed none in the result after processing all the games. update_if_useful provides us a way

ocaml

content_copy

let[@inline] update_if_useful result value =
  if not Float.O.(value < 1.337) then result := Uopt.some value
;;

If we produce a game so bad that evaluate_game lands under 1.337 then we will keep the unboxed none. The final evaluation removes any score outliers before returning, so we need to trigger the bug in 3 out of the 4 rooms.

Now, we need to figure out a way to produce a horrible game. If we fix the number of people in the room, the score on a game is a monotonic function of the sum of the scores of the individual people. It suffices to greedily add people in order of increasing score to find the worst possible people for that game and room size.

Because the total problem size is small, we can iterate over all the games and all the room sizes and compute all the worst possible games.

Writing a little solver gives us 3 mutually-acceptable games that satisfying the requirements:

luke, b2xiao, and kylar playing War
zaratec playing Person do Thing by herself
ubuntor playing Among Us by himself