Chapter 6. HOOKING

Hooking is a powerful process-observation technique that changes the flow of a process in order to monitor or alter the data being accessed. Hooking is what enables rootkits to hide themselves, keyloggers to steal keystrokes, and debuggers to debug! A reverse engineer can save many hours of manual debugging by implementing simple hooks that automatically glean the information they are seeking. It is a simple yet very powerful technique.

On the Windows platform, a myriad of methods are used to implement hooks. We will be focusing on two primary techniques that I call "soft" and "hard" hooking. A soft hook is one where you are attached to the target process and implement INT3 breakpoint handlers to intercept execution flow. This may already sound like familiar territory; that's because you essentially wrote your own hook with printf_random.py in "Extending Breakpoint Handlers." A hard hook is one where you hard-code a jump in the target's assembly so that your hook code, also written in assembly, gets to run. Soft hooks are useful for nonintensive or infrequently called functions. However, in order to hook frequently called routines with the least impact on the process, you must use hard hooks. Prime candidates for a hard hook are heap-management routines or intensive file I/O operations.

We will be using previously covered tools in order to apply both hooking techniques. We'll start with using PyDbg to do some soft hooking in order to sniff encrypted network traffic, and then we'll move into hard hooking with Immunity Debugger to do some high-performance heap instrumentation.

Soft Hooking with PyDbg

The first example we will explore involves sniffing encrypted traffic at the application layer. Normally, to understand how a client or server application interacts with the network, we would use a traffic analyzer like Wireshark.^[33] Unfortunately, Wireshark is limited in that it can see the data only after it has been encrypted, which obscures the true nature of the protocol we are studying. Using a soft hooking technique, we can trap the data before it is encrypted and trap it again after it has been received and decrypted.

Our target application will be the popular open-source web browser Mozilla Firefox.^[34] For this exercise we are going to pretend that Firefox is closed source (otherwise it wouldn't be much fun now, would it?) and that it is our job to sniff data out of the firefox.exe process before it is encrypted and sent to a server. The most common form of encryption that Firefox performs is Secure Sockets Layer (SSL) encryption, so we'll choose that as the main target for our exercise.

In order to track down the call or calls that are responsible for passing around the unencrypted data, you can use the technique for logging intermodular calls as described at http://forum.immunityinc.com/index.php?topic=35.0/. There is no "right" spot to place your hook; it is really just a matter of preference. Just so that we are on the same page, we'll assume that the hook point is on the function PR_Write, which is exported from nspr4.dll. When this function is hit, there is a pointer to an ASCII character array located at [ESP + 8] that contains the data we are submitting before it has been encrypted. At function entry, [ESP] holds the return address and [ESP + 4] holds the first parameter, so that +8 offset from ESP tells us that it is the second parameter passed to the PR_Write function that we are interested in. It is here that we will trap the ASCII data, log it, and continue the process.
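The offset arithmetic can be checked with a small simulation. This is only a sketch: the stack contents below are invented for illustration, the helper names read_dword and arg_offset are hypothetical, and nothing here touches a real process. It assumes a 32-bit target where each stack slot is a 4-byte little-endian DWORD.

```python
import struct

# Hypothetical snapshot of the top of the stack at the first instruction
# of a 32-bit function. All three values are made up for illustration.
RETURN_ADDRESS = 0x00401234   # pushed by the CALL instruction
ARG1_FD        = 0x00A3F010   # first parameter
ARG2_BUFFER    = 0x0203C5B0   # second parameter: pointer to the data

# x86 is little-endian and each stack slot is a 4-byte DWORD.
stack_bytes = struct.pack("<3L", RETURN_ADDRESS, ARG1_FD, ARG2_BUFFER)

def read_dword(stack, esp_offset):
    """Return the DWORD stored at [ESP + esp_offset]."""
    return struct.unpack_from("<L", stack, esp_offset)[0]

def arg_offset(index):
    """ESP offset of a zero-based argument index at function entry."""
    return 4 + 4 * index      # skip the 4-byte return address first

second_param = read_dword(stack_bytes, arg_offset(1))
print("[ESP + %d] => 0x%08x" % (arg_offset(1), second_param))
```

The same rule explains why a zero-based args list lines up with the stack: args[0] sits at [ESP + 4], args[1] at [ESP + 8], and so on.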

First let's verify that we can actually see the data we are interested in. Open the Firefox web browser, and navigate to one of my favorite sites, https://www.openrce.org/. Once you have accepted the site's SSL certificate and the page has loaded, attach Immunity Debugger to the firefox.exe process and set a breakpoint on nspr4.PR_Write. In the top-right corner of the OpenRCE website is a login form; enter test as both the username and the password, and click the Login button. The breakpoint you set should be hit almost immediately; keep pressing F9 and you'll see the breakpoint being hit repeatedly. Eventually, you will see a string pointer on the stack that dereferences to something like this:

[ESP + 8] => ASCII "username=test&password=test&remember_me=on"

Sweet! We can see the username and password quite clearly, but if you were to watch this transaction take place from a network level, all of the data would be unintelligible because of the strong SSL encryption. This technique will work for more than the OpenRCE site; for example, to give yourself a good scare, browse to a more sensitive site and see how easy it is to observe the unencrypted information flow to the server. Now let's automate this process so that we can just capture the pertinent information and not have to manually control the debugger.

To define a soft hook with PyDbg, you first define a hook container that will hold all of your hook objects. To initialize the container, use this command:

hooks = utils.hook_container()

To define a hook and add it to the container, you use the add() method from the hook_container class to add your hook points. The function prototype looks like this:

add( pydbg, address, num_arguments, func_entry_hook, func_exit_hook )

The first parameter is simply a valid pydbg object, the address parameter is the address on which you would like to install the hook, and num_arguments tells the hook function how many parameters the target function takes. The func_entry_hook and func_exit_hook functions are callback functions that define the code that will run when the hook is hit (entry) and immediately after the hooked function is finished (exit). The entry hooks are useful to see what parameters get passed to a function, whereas the exit hooks are useful for trapping function return values.

Your entry hook callback function must have a prototype like this:

def entry_hook( dbg, args ):

    # Hook code here

    return DBG_CONTINUE

The dbg parameter is the valid pydbg object that was used to set the hook. The args parameter is a zero-based list of the parameters that were trapped when the hook was hit.

The prototype of an exit hook callback function is slightly different in that it also has a ret parameter, which is the return value of the function (the value of EAX):

def exit_hook( dbg, args, ret ):

    # Hook code here

    return DBG_CONTINUE
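To make the order of events concrete, here is a toy model of how a container might dispatch the two callbacks. It is a sketch under stated assumptions, not PyDbg's implementation: ToyHookContainer and simulate_call are hypothetical names, and where the real container relies on breakpoints inside a debugged process, this version simply calls the callbacks directly with made-up argument values.

```python
DBG_CONTINUE = 0x00010002  # value of the Windows DBG_CONTINUE constant

class ToyHookContainer(object):
    """Hypothetical stand-in for a hook container; not PyDbg code."""

    def __init__(self):
        self.hooks = {}

    def add(self, dbg, address, num_arguments, func_entry_hook, func_exit_hook):
        # Remember everything needed to service a hit at this address.
        self.hooks[address] = (dbg, num_arguments, func_entry_hook, func_exit_hook)

    def simulate_call(self, address, stack_args, return_value):
        # Pretend the hooked function was called and then returned.
        dbg, num_arguments, entry, exit_ = self.hooks[address]
        args = list(stack_args[:num_arguments])   # zero-based argument list
        if entry is not None:
            entry(dbg, args)                      # runs when the hook is hit
        if exit_ is not None:
            exit_(dbg, args, return_value)        # runs after the function returns

log = []

def entry_hook(dbg, args):
    log.append(("entry", args))
    return DBG_CONTINUE

def exit_hook(dbg, args, ret):
    log.append(("exit", args, ret))
    return DBG_CONTINUE

hooks = ToyHookContainer()
hooks.add(None, 0x601a2760, 2, entry_hook, exit_hook)
hooks.simulate_call(0x601a2760, [0x10, 0x20, 0x30], 42)

for event in log:
    print(event)
```

Note that only the first num_arguments values reach the callbacks, and that passing None for either callback skips that half of the hook.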

To illustrate how to use an entry hook callback to sniff pre-encrypted traffic, open up a new Python file, name it firefox_hook.py, and punch out the following code.

firefox_hook.py

from pydbg import *
from pydbg.defines import *

import utils
import sys

dbg           = pydbg()
found_firefox = False

# Let's set a global pattern that we can make the hook
# search for
pattern       = "password"

# This is our entry hook callback function
# the argument we are interested in is args[1]
def ssl_sniff( dbg, args ):

    # Now we read out the memory pointed to by the second argument
    # it is stored as an ASCII string, so we'll loop on a read until
    # we reach a NULL byte
    buffer  = ""
    offset  = 0

    while 1:
        byte = dbg.read_process_memory( args[1] + offset, 1 )

        # Stop at the terminating NULL byte
        if byte == "\x00":
            break

        buffer  += byte
        offset  += 1

    if pattern in buffer:
        print "Pre-Encrypted: %s" % buffer

    return DBG_CONTINUE

# Quick and dirty process enumeration to find firefox.exe
for (pid, name) in dbg.enumerate_processes():

    if name.lower() == "firefox.exe":

        found_firefox = True
        hooks         = utils.hook_container()

        dbg.attach(pid)
        print "[*] Attaching to firefox.exe with PID: %d" % pid

        # Resolve the function address
        hook_address  = dbg.func_resolve_debuggee("nspr4.dll","PR_Write")

        if hook_address:
            # Add the hook to the container. We aren't interested
            # in using an exit callback, so we set it to None.
            hooks.add( dbg, hook_address, 2, ssl_sniff, None )
            print "[*] nspr4.PR_Write hooked at: 0x%08x" % hook_address
            break
        else:
            print "[*] Error: Couldn't resolve hook address."
            sys.exit(-1)

if found_firefox:
    print "[*] Hooks set, continuing process."
    dbg.run()
else:
    print "[*] Error: Couldn't find the firefox.exe process."
    sys.exit(-1)

The code is fairly straightforward: It sets a hook on PR_Write, and when the hook gets hit, we attempt to read out an ASCII string pointed to by the second parameter. If it matches our search pattern, we output it to the console. Start up a fresh instance of Firefox and run firefox_hook.py from the command line. Retrace your steps and do the login submission on https://www.openrce.org/, and you should see output similar to that in Example 6-1.
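One caveat about the read loop: it trusts the target to terminate the string with a NULL byte. NSPR documents PR_Write as taking a third parameter, the number of bytes to write, so a variation of the script could register three arguments and use args[2] as the length instead. The sketch below shows both ideas against a fake memory region so it can run without a debugger. The names fake_read_process_memory, read_c_string, and read_counted are hypothetical helpers, the base address is invented, and the 4096-byte cap is an arbitrary assumption.

```python
# Fake target memory standing in for the debuggee (illustration only).
FAKE_BASE   = 0x0203C5B0
FAKE_MEMORY = b"username=test&password=test&remember_me=on\x00junk"

def fake_read_process_memory(address, length):
    """Mimic a memory read: return `length` bytes or raise if unreadable."""
    start = address - FAKE_BASE
    if start < 0 or start + length > len(FAKE_MEMORY):
        raise ValueError("unreadable address")
    return FAKE_MEMORY[start:start + length]

def read_c_string(read_memory, address, max_length=4096):
    """Read until a NULL byte or max_length, whichever comes first."""
    data = b""
    while len(data) < max_length:
        byte = read_memory(address + len(data), 1)
        if byte == b"\x00":
            break
        data += byte
    return data

def read_counted(read_memory, address, amount, max_length=4096):
    """Read `amount` bytes in one call, capped at max_length."""
    return read_memory(address, min(amount, max_length))

print(read_c_string(fake_read_process_memory, FAKE_BASE))
print(read_counted(fake_read_process_memory, FAKE_BASE, 8))
```

In a real hook, dbg.read_process_memory would take the place of the fake reader. The counted read also needs only one call into the debuggee rather than one per byte.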

Example 6-1. How cool is that! We can clearly see the username and password before they are encrypted.

[*] Attaching to firefox.exe with PID: 1344
[*] nspr4.PR_Write hooked at: 0x601a2760
[*] Hooks set, continuing process.
Pre-Encrypted: username=test&password=test&remember_me=on
Pre-Encrypted: username=test&password=test&remember_me=on
Pre-Encrypted: username=jms&password=yeahright!&remember_me=on

We have just demonstrated how soft hooks are both lightweight and powerful. This technique can be applied to all kinds of debugging or reversing scenarios. This particular scenario was well suited for the soft hooking technique, but if we were to apply it to a more performance-bound function call, we would very quickly see the process slow to a crawl, begin to exhibit wacky behavior, and possibly even crash. This is simply because every INT3 hit raises a breakpoint exception that is delivered to the debugger, which then runs our hook code before returning control to the process. That's a lot of work if this needs to happen thousands of times per second! Let's see how we can work around this limitation by applying a hard hook to instrument low-level heap routines. Onward!

^[33] See http://www.wireshark.org/.

^[34] For the Firefox download, go to http://www.mozilla.com/en-US/.