How it works...

There are two main types of XSS: reflected and persistent. Persistent XSS is where an attacker has been able to implant a code exploit within a persistent layer of our architecture (for instance, a server-side database, although caching layers and browser persistence also come under the same banner). Reflected XSS relies on a single interaction with a server, such that the content returned by the server contains the code exploit.

In our case, the main problem is a reflected XSS vulnerability.

The way the href attribute of the anchor tag (<a>) is constructed from input parameters allows an attacker to create a URL that can effectively break context (that is, the context of being an HTML attribute), and inject code into the client.

Let's take the parameters in the malicious URL and break them down. First, there's the prev parameter, which is set to %22%3E%3Cscri. For clarity, we can quickly decode the URI-encoded characters like so:

$ node -p "decodeURI('%22%3E%3Cscri')"
 "><scri

The anchor element in our original app looks like this:

<a href="${prev}${handoverToken}/${lang}"> Back to Control HQ </a>

If we replace the prev interpolation in place we get:

<a href=""><scri${handoverToken}/${lang}"> Back to Control HQ </a>

So we've been able to close the href attribute and the <a> tag and begin the <script> tag.

We can't just put the entire <script> tag in a single parameter, at least not in Chrome. Chrome has an XSS auditor that is enabled by default.

XSS Auditor
For more on Chrome's XSS auditor, see the Hardening Headers in Web Frameworks recipe, in particular the portion regarding the X-XSS-Protection header.

If we move the pt> characters from the handoverToken parameter into the prev parameter and load the URL in Chrome with Chrome DevTools open, we'll see an error message, as shown in the following screenshot:

By spreading the <script> tag across two injected parameters, we were able to bypass Chrome's XSS auditor, at least at the time of writing. If this no longer works in Chrome at the time of reading, we may be able to run the exploit in another browser, such as Safari, Internet Explorer/Edge, or Firefox.

The handoverToken parameter is: pt%3Estat.innerHTML=%22it%27s%20all%20good...%3Cbr%3Erelax%20:)%22%3C.

Let's decode that:

$ node -p "decodeURI('pt%3Estat.innerHTML=%22it%27s%20all%20good...%3Cbr%3Erelax%20:)%22%3C')"
 pt>stat.innerHTML="it's all good...<br>relax :)"<

Let's replace the interpolated handoverToken in our HTML alongside the replaced prev token:

<a href=""><script>stat.innerHTML="it's all good...<br>relax :)"</${lang}"> Back to Control HQ </a>

Now we've been able to complete the <script> tag and insert some JavaScript that will run directly in the browser when the page loads.

The injected code accesses the <div> element with an id attribute of stat and sets its inner HTML to an alternative status. The HTML specification indicates that an element's id value should be exposed as a property of the global window object, which makes it reachable as a global variable (see https://html.spec.whatwg.org/#named-access-on-the-window-object). While we could use document.getElementById, we use the shorthand version for our purposes (although as a development practice this is a brittle approach).

Finally, the lang token is script%3E%3Ca%20href=%22.

Let's decode it:

$ node -p "decodeURI('script%3E%3Ca%20href=%22')"
 script><a href="

Now let's insert that into the HTML:

<a href=""><script>stat.innerHTML="it's all good...<br>relax :)"</script><a href=""> Back to Control HQ </a>

Notice how this attack utilized the forward slash (/) in the URL as the forward slash for the closing </script> tag. After closing the script tag, the JavaScript will now run in the browser, but to avoid raising suspicion the attack also creates a new dummy anchor tag to prevent any broken HTML appearing in the page.
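
To tie the three parameters together, the whole assembly can be reproduced in Node in one go. This is only a minimal sketch: the query object, decode, and renderAnchor names are our own illustrations rather than part of the recipe's app, and decodeURIComponent stands in for whatever query-string parsing the web framework performs.

```javascript
// The three query-string values from the malicious URL, still URI-encoded.
const query = {
  prev: '%22%3E%3Cscri',
  handoverToken: 'pt%3Estat.innerHTML=%22it%27s%20all%20good...%3Cbr%3Erelax%20:)%22%3C',
  lang: 'script%3E%3Ca%20href=%22'
}

// Decode every value (an assumption: a framework's query parser would
// normally do this for us before our route handler runs).
const decode = (params) => Object.fromEntries(
  Object.entries(params).map(([key, value]) => [key, decodeURIComponent(value)])
)

// The same unescaped interpolation as the anchor element in the vulnerable app.
const renderAnchor = ({ prev, handoverToken, lang }) =>
  `<a href="${prev}${handoverToken}/${lang}"> Back to Control HQ </a>`

const html = renderAnchor(decode(query))
console.log(html)
```

Running this prints the same markup as the final HTML shown above, with the literal forward slash from the template completing the closing </script> tag.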

We pass the fully assembled contents of the href attribute through the he.encode function. The he.encode function performs HTML Attribute Encoding, whereas the he.escape function (used on the status argument) performs HTML Entity encoding. Since we're placing user input inside an HTML attribute, the safest way to escape the input is by encoding all non-alphanumeric characters as hex value HTML entities. For instance, the double quote becomes &#x22;, which prevents it from closing out the attribute.

We also pass the status parameter, which originates from our pretendDbQuery call, through the he.escape function. The he.escape function converts HTML syntax into HTML (semantic) entities; for instance, the opening tag less than character < becomes &lt;.

All input that isn't generated by our Node process should be treated as user input. We cannot guarantee that other parts of the system haven't allowed uncleaned user input into the database, so to avoid persistent XSS attacks we need to clean data that comes out of the database as well.

We pass status through he.escape because it appears in a general HTML context, whereas we pass href through he.encode because it appears in an HTML attribute context.
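
The difference between the two contexts can be sketched without any dependencies. The encodeAttribute and escapeHtml functions below are hypothetical stand-ins written for illustration; they are not the he module's implementation, and their exact output may differ from he.encode and he.escape. They only show the two strategies described above: hex-encoding every non-alphanumeric character for attribute values, and replacing only the syntactically significant characters for general HTML content.

```javascript
// Hypothetical attribute encoder (not he.encode): every character that is
// not an ASCII letter or digit becomes a hexadecimal character reference.
const encodeAttribute = (input) =>
  Array.from(String(input), (ch) =>
    /[A-Za-z0-9]/.test(ch)
      ? ch
      : `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`
  ).join('')

// Hypothetical HTML escaper (not he.escape): only characters that carry
// syntactic meaning in HTML are swapped for entities.
const entities = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '`': '&#x60;'
}
const escapeHtml = (input) =>
  String(input).replace(/[&<>"'`]/g, (ch) => entities[ch])

// The decoded prev payload can no longer close the href attribute.
console.log(encodeAttribute('"><scri')) // &#x22;&#x3E;&#x3C;scri

// Markup arriving from the database is rendered as text, not as elements.
console.log(escapeHtml('<br>relax')) // &lt;br&gt;relax
```

In a real application we'd keep using a well-tested module such as he rather than hand-rolled helpers like these, since the edge cases in encoding are easy to get wrong.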