Understanding JavaScript syntax is the key to good obfuscation. Because the language is loosely typed, a great deal of strange-looking code works that, at first glance, should not. In this section, we discuss some basic JavaScript concepts that we will use throughout this chapter. If you are new to JavaScript, we hope you will find this introduction helpful and easy to understand, and that it will open your mind to the possibility of abusing other languages in ways that are legal syntax but have unintended consequences.
JavaScript background
Simple yet powerful, sometimes confusing but eventually logical: There is no better way to describe the JavaScript parser. Once you understand the parser, you will be able to understand how to use the code to your advantage.
The examples in this chapter show you how to change the code alert(1) into a different representation that still executes the same code. In case you are not familiar with alert, here is a simple explanation. The window object in JavaScript is the container of all global variables. A page can contain more than one window object (each frame has its own, for example), and therefore more than one global object. When executing functions or reading values, JavaScript automatically assumes that the window object is the current object and that all variables are global, unless a local variable is declared. If you are used to other programming languages, you may find this concept confusing; it helps to be aware that reliance on global variables is at the core of JavaScript.
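As a rough illustration of the global object described above, the following sketch uses globalThis, the standard name for the global object (in a browser page it refers to the same object as window). The name siteName is made up for this example.

```javascript
// globalThis is the standard name for the global object; in a browser
// page it is the same object as window.
globalThis.siteName = "example"; // hypothetical global, for illustration only

// A bare identifier that is not declared locally is looked up on the global object.
const viaBareName = siteName;
const viaGlobalObject = globalThis.siteName;

function shadowing() {
  // A local declaration hides the global of the same name inside this function.
  const siteName = "local";
  return siteName;
}

console.log(viaBareName === viaGlobalObject); // true
console.log(shadowing());                     // "local"
console.log(globalThis.siteName);             // "example"
```

The global value is untouched by the local declaration, which is the behavior the paragraph above describes.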
When we call alert we are using the window object's alert method. You can see this by running the following code in a browser of your choice:
<script type="text/javascript">
window.alert(1); window.alert(window.alert);
</script>
As you can see, two alert boxes appear. The first shows the value 1, and the second shows you that alert is a native function of the browser. This means it is already defined before you enter any code. Let us see what happens when we define our own function called alert:
<script type="text/javascript">
Here, we simply defined our own function called alert, with no arguments between the parentheses. The curly braces indicate the body of the function. In this case, our function does nothing. We get no alert from the browser, and we have successfully overwritten the native method of the window object. Although this will not help you with obfuscation, it should help you to understand how the code can be manipulated.
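A minimal sketch of the overwriting described above is shown here. It assumes a stand-in object named fakeWindow in place of the browser's window, so that it can also run outside a browser; in a page you would assign to window.alert instead.

```javascript
// Hypothetical stand-in for the browser's window object (an assumption,
// so the sketch also runs outside a browser).
const fakeWindow = {
  alert(message) {
    return "alert box: " + message;
  },
};

const before = fakeWindow.alert(1);

// Overwrite the method with a function that has no arguments and an empty body.
fakeWindow.alert = function () {};

// The call still succeeds, but the replacement does nothing and returns undefined.
const after = fakeWindow.alert(1);

console.log(before); // "alert box: 1"
console.log(after);  // undefined
```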
Something that will help you with obfuscation is the square bracket syntax of JavaScript. This is one of the most-used parts of the language, and it shares its syntax with array literals. An array literal consists of a starting square bracket ([) and an ending square bracket (]). The values between the brackets can be any JavaScript object and are separated by commas. They can also be deeply nested to form multidimensional arrays. Let us make an array literal with some values in it. Before running the following example, try to guess the value returned by JavaScript.
<script type="text/javascript">
If you guessed /a/, you are correct. JavaScript arrays are indexed from zero. First we assigned the array to x, filling it with a list of JavaScript objects separated by commas. Next, we called alert to display the fourth element of the array (index 3). Notice the difference between the square bracket syntax when accessing an object and when declaring a literal.
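The sketch below shows one array that fits this description. The element values are illustrative assumptions rather than the original listing; the only detail taken from the text is that the fourth element is /a/.

```javascript
// Illustrative values only; the fourth element is a regular expression literal.
const x = [1, "two", {}, /a/, [5, [6, 7]]];

// Square brackets after a value access an element; indexes start at zero,
// so index 3 is the fourth element.
const fourth = x[3];

// Arrays can be nested and indexed one level at a time.
const nested = x[4][1][0];

console.log(String(fourth)); // "/a/"
console.log(nested);         // 6
console.log(x.length);       // 5
```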
Now things will get slightly more complicated and interesting. Take a look at the next example, which shows how an object property is accessed:
<script type="text/javascript">
objLiteral={'objProperty':123};
alert(objLiteral[0,1,2,3,'objProperty']);
</script>
In the preceding code, the curly braces declare an object literal. The 'objProperty' string is the name of the object's property, and the value 123 is assigned to it. We access the property using square brackets. Notice how the square brackets look like an array literal, but in fact are accessing an object property. This is important syntax to understand, as these core techniques can enable powerful obfuscation. In this instance, the commas inside the square brackets act as the comma operator: each expression is evaluated in turn, and the value of the rightmost one (the expression after the last comma) is used to access the property.
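To make the evaluation order visible, here is a small sketch. The helper note is a made-up function for this example; it records each value it is given and then returns it.

```javascript
const objLiteral = { objProperty: 123 };

// The comma operator evaluates each operand left to right
// and yields the value of the last one.
const key = (0, 1, 2, 3, "objProperty");

// Every operand is evaluated, not just the last one: note() records the order.
const order = [];
const note = (value) => {
  order.push(value);
  return value;
};
const viaComma = objLiteral[note("a"), note("b"), note("objProperty")];

console.log(key);          // "objProperty"
console.log(viaComma);     // 123
console.log(order.join()); // "a,b,objProperty"

// Bracket notation with a string and dot notation reach the same property.
console.log(objLiteral["objProperty"] === objLiteral.objProperty); // true
```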
Now we will look at a slightly different way of doing the same thing, this time enclosing the contents in parentheses. Parentheses let you group expressions and pass the value of the last expression in the group to the enclosing expression. The following example shows two groups of parentheses. The outer group returns the value of the inner group, and the inner group returns the string 'objProperty' because this is the last expression of that group.
<script type="text/javascript">
objLiteral={'objProperty':123};
alert(objLiteral[(0,1,2,3,(0,'objProperty'))]);
</script>
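The same grouping can select a method as well as a data property. The sketch below assumes a stand-in object named target with an alert method, used here in place of window so the code can run outside a browser.

```javascript
// Hypothetical stand-in for window (an assumption for this sketch).
const target = {
  alert(value) {
    return "alerted " + value;
  },
};

// The inner group yields "alert"; the outer group yields the inner group's value.
const inner = (0, "alert");
const outer = (0, 1, 2, 3, (0, "alert"));

// The grouped expression picks the method by name, and (1) then calls it.
const result = target[(0, 1, 2, 3, (0, "alert"))](1);

console.log(inner);  // "alert"
console.log(outer);  // "alert"
console.log(result); // "alerted 1"
```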
The next step of the JavaScript learning process is to understand how strings are created. Strings are the basis of obfuscation, as without them, we cannot create our code. JavaScript supports many more ways to create strings than you may think. For instance, you can use the normal methods that JavaScript provides, such as new String('I am a string') and the standard "I am a string" and 'I am a string'. Although the new String constructor is less convenient than the standard syntax, and therefore is rarely used, in your quest for obfuscated code it helps to know the various ways to create a string. Let us look deeper into strings and see other ways we can create them.
<script type="text/javascript">
alert(/I am a string/+'');
alert(/I am a string/.source);
alert(/I am a string/['source']);
alert(['I am a string']+[]);
</script>
In the preceding code, the first alert contains a regular expression, as indicated by the starting and ending forward slashes. JavaScript performs type coercion and converts the regular expression into a string when it is used with +; the resulting string includes the forward slashes. The second example uses the standard source property of the regexp object (every regexp object has a source property), which returns the text of the regular expression without the starting and ending forward slashes. The third example reads the same property using square bracket notation. Lastly, the array is used as a string because every array has a toString method, which is called automatically when the array is coerced to a string, as happens here with +.
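The values produced by these expressions can be checked without alert. The following sketch stores each result in a variable and also compares a new String object with a primitive string; the variable names are chosen for this example only.

```javascript
// Coercing a regular expression with + keeps the surrounding slashes.
const fromRegexConcat = /I am a string/ + "";

// The source property holds the pattern text without the slashes.
const fromSource = /I am a string/.source;
const fromBracketSource = /I am a string/["source"];

// Coercing an array with + calls its toString method, which joins the elements.
const fromArray = ["I am a string"] + [];
const joined = [1, 2, 3] + [];

// new String creates a wrapper object, not a primitive string.
const wrapped = new String("I am a string");

console.log(fromRegexConcat);       // "/I am a string/"
console.log(fromSource);            // "I am a string"
console.log(fromBracketSource);     // "I am a string"
console.log(fromArray);             // "I am a string"
console.log(joined);                // "1,2,3"
console.log(typeof wrapped);        // "object"
console.log(typeof fromArray);      // "string"
console.log(wrapped == fromArray);  // true (loose equality unwraps the object)
console.log(wrapped === fromArray); // false (different types)
```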
There is yet another way to use square bracket notation, this time to access strings. This method of using strings, which was nonstandard before ECMAScript 5 but has been adopted by the major browsers (IE8, Safari, Opera, Firefox, and Chrome), involves treating a string in an array-like fashion: specifying a number returns the character at that position, just as with an array. This is very useful for obfuscation when combined with various methods of obtaining a string.
If you use string indexes, remember that they are not supported in IE7 and earlier. As a workaround, you can use String.split to convert your string into an array.
<script type="text/javascript">
The preceding example returns the letter a, as this is the first character of the string. The string is not a true array: it still retains the string methods, and you cannot assign to a position of the string.
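The sketch below illustrates these points with an assumed sample string, "abc". The split call is the workaround mentioned above; charAt is a standard string method included here as a further alternative.

```javascript
const text = "abc"; // sample value chosen for this sketch

const first = text[0];            // array-like index access
const viaCharAt = text.charAt(0); // standard method that does not rely on indexing
const chars = text.split("");     // workaround: convert the string to a real array

// Assigning to a position has no effect on the string. In strict mode the
// attempt throws a TypeError, so it is wrapped in try/catch here.
try {
  text[0] = "z";
} catch (error) {
  // ignored: the string is unchanged either way
}

// A real array, by contrast, can be modified and joined back into a string.
chars[0] = "z";
const rebuilt = chars.join("");

console.log(first);                   // "a"
console.log(viaCharAt);               // "a"
console.log(text);                    // "abc"
console.log(rebuilt);                 // "zbc"
console.log(Array.isArray(text));     // false
console.log(typeof text.toUpperCase); // "function"
```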
A little-known fact is that Firefox allows some truly imprudent practices for function names. Not only can they lead to confusion by clashing with statements, but they can also lead to syntax errors and bad programming style. The following example demonstrates this quirky function-naming convention:
<script type="text/javascript">
window.function=function function(){return function function(){return function function(){alert('Works in Firefox')}()}()}()
</script>
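To see how a given engine treats such names, a check along the following lines can be used. It is only a sketch: the names holder and rejected are invented here, and the source text is compiled at run time through the Function constructor so that a syntax error can be caught rather than stopping the whole script.

```javascript
// Reserved words are valid as property names in ECMAScript 5 and later.
const holder = {};
holder.function = function () {
  return "property named function";
};

// Using a reserved word as the name of the function itself is a different
// matter. Compiling the source at run time lets us catch the error.
let rejected = false;
try {
  new Function("return function function(){}");
} catch (error) {
  rejected = error instanceof SyntaxError;
}

console.log(holder.function()); // "property named function"
console.log(rejected);          // true on engines that follow the standard grammar
```

An engine that accepts the quirky naming would leave rejected set to false.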