Document Rendering Helpers

A significant portion of the plug-in landscape belongs to programs that allow certain very traditional, “nonweb” document formats to be shown directly in the browser. Some of these programs are genuinely useful: Windows Media Player, RealNetworks RealPlayer, and Apple QuickTime have been the backbone of online multimedia playback for about a decade, at least until their displacement by Adobe Flash. The merits of others are more questionable, however. For example, Adobe Reader and Microsoft Office both install in-browser document viewers, increasing the user’s attack surface appreciably, though it is unclear whether these viewers offer a real benefit over opening the same document in a separate application with one extra click.

Of course, in a perfect world, hosting or embedding a PDF or a Word document should have no direct consequences for the security of the participating websites. Yet, predictably, the reality begs to differ. In 2009, a researcher noted that PDF-based forms that submit to javascript: URLs can apparently lead to client-side code execution on the embedding site.^[163] Perhaps even more troubling than this report alone, according to that researcher’s account, Adobe initially dismissed the report with the following note: “Our position is that, like an HTML page, a PDF file is active content.”

It is regrettable that the hosting party does not have full control of when this active content is detected and executed and that otherwise reasonable webmasters may think of PDFs or Word documents as just a fancy way to present text. In reality, despite their harmless appearance, in a bid to look cool, many such document formats come equipped with their own hyperlinking capabilities or even scripting languages. For example, JavaScript code can be embedded in PDF documents, and Visual Basic macros are possible in Microsoft Office files. When a script-bearing document is displayed on an HTML page, some form of a programmatic plug-in-to-browser bridge usually permits a degree of interaction with the embedding site, and the design of such bridges can vary from vaguely questionable to outright preposterous.

In one 2007 case, Petko D. Petkov noticed that a site that hosts any PDF documents can be attacked simply by providing completely arbitrary JavaScript code in the fragment identifier. This string will be executed on the hosting page through the plug-in bridge:^[164]

http://example.com/random_document.pdf#foo=javascript:alert(1)

The two vulnerabilities outlined here are now fixed, but the lesson is that special care should be exercised when hosting or embedding any user-supplied documents in sensitive domains. The consequences of doing so are not well documented and can be difficult to predict.