Hooking is a powerful process-observation technique that changes the flow of a process in order to monitor or alter the data being accessed. Hooking is what enables rootkits to hide themselves, keyloggers to steal keystrokes, and debuggers to debug! A reverse engineer can save many hours of manual debugging by implementing simple hooks that automatically glean the information they are seeking. It is an incredibly simple yet very effective technique.
On the Windows platform, a myriad of methods are used to implement
hooks. We will be focusing on two primary techniques that I call "soft"
and "hard" hooking. A soft hook is one where you are
attached to the target process and implement INT3
breakpoint handlers to intercept execution flow. This may already sound
like familiar territory for you; that's because you essentially wrote your
own hook in Extending Breakpoint Handlers with printf_random.py. A hard hook is one
where you overwrite instructions in the target with a hard-coded jump that
redirects execution to your hook code, which is also written in assembly. Soft hooks are useful for
nonintensive or infrequently called functions. However, in order to hook
frequently called routines and to have the least amount of impact on the
process, you must use hard hooks. Prime candidates for a hard hook are
heap-management routines or intensive file I/O operations.
We will be using previously covered tools in order to apply both hooking techniques. We'll start with using PyDbg to do some soft hooking in order to sniff encrypted network traffic, and then we'll move into hard hooking with Immunity Debugger to do some high-performance heap instrumentation.
The first example we will explore involves sniffing encrypted traffic at the application layer. Normally, to understand how a client or server application interacts with the network, we would use a traffic analyzer like Wireshark.[33] Unfortunately, Wireshark is limited in that it can only see the data after it has been encrypted, which obfuscates the true nature of the protocol we are studying. Using a soft hooking technique, we can trap the data before it is encrypted and trap it again after it has been received and decrypted.
Our target application will be the popular open-source web browser Mozilla Firefox.[34] For this exercise we are going to pretend that Firefox is closed source (otherwise it wouldn't be much fun now, would it?) and that it is our job to sniff data out of the firefox.exe process before it is encrypted and sent to a server. The most common form of encryption that Firefox performs is Secure Sockets Layer (SSL) encryption, so we'll choose that as the main target for our exercise.
In order to track down the call or calls that are responsible for
passing around the unencrypted data, you can use the technique for
logging intermodular calls as described at http://forum.immunityinc.com/index.php?topic=35.0/. There
is no "right" spot to place your hook; it is really just a matter of
preference. Just so that we are on the same page, we'll assume that the
hook point is on the function PR_Write, which is exported from nspr4.dll.
When this function is hit, there is a pointer to an ASCII character array
located at [ESP + 8] that contains the data we are submitting before it has
been encrypted. That +8 offset from ESP tells us that it is the second
parameter passed to the PR_Write function that we are interested in.
It is here that we will trap the ASCII data, log it, and continue the
process.
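To make the [ESP + 8] arithmetic concrete, here is a minimal sketch that models the stack at function entry in a 32-bit process where arguments are passed on the stack. It does not touch a real process; the snapshot values and the helper names arg_offset and read_arg are made up for illustration.

```python
import struct

def arg_offset(index):
    """Offset from ESP, at function entry, of the zero-based 32-bit argument."""
    # [ESP] holds the return address, so the first argument sits at ESP + 4.
    return 4 + 4 * index

def read_arg(stack_bytes, index):
    """Read one little-endian 32-bit argument from a stack snapshot that starts at ESP."""
    return struct.unpack_from("<I", stack_bytes, arg_offset(index))[0]

# Hypothetical snapshot: return address followed by three arguments.
snapshot = struct.pack("<4I", 0x601A1234, 0x00000ABC, 0x0012F000, 42)

second_arg = read_arg(snapshot, 1)  # the value stored at [ESP + 8]
```

In this model the second argument (index 1) is the buffer pointer, which is the same value a hook callback would later see as args[1].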
First let's verify that we can actually see the data we are
interested in. Open the Firefox web browser, and navigate to one of my
favorite sites, https://www.openrce.org/. Once you
have accepted the site's SSL certificate and the page has loaded, attach
Immunity Debugger to the firefox.exe process and
set a breakpoint on nspr4.PR_Write. In the top-right
corner of the OpenRCE website is a login form; set a username to test
and a password to test, and click the Login button. The breakpoint you set should be
hit almost immediately; keep pressing F9 and you'll continually see the
breakpoint being hit. Eventually, you will see a string pointer on the
stack that dereferences to something like this:
[ESP + 8] => ASCII "username=test&password=test&remember_me=on"
Sweet! We can see the username and password quite clearly, but if you were to watch this transaction take place from a network level, all of the data would be unintelligible because of the strong SSL encryption. This technique will work for more than the OpenRCE site; for example, to give yourself a good scare, browse to a more sensitive site and see how easy it is to observe the unencrypted information flow to the server. Now let's automate this process so that we can just capture the pertinent information and not have to manually control the debugger.
To define a soft hook with PyDbg, you first define a hook container that will hold all of your hook objects. To initialize the container, use this command:
hooks = utils.hook_container()
To define a hook and add it to the container, you use the add() method
from the hook_container class. The function prototype looks like this:
add( pydbg, address, num_arguments, func_entry_hook, func_exit_hook )
The first parameter is simply a valid pydbg object, the address
parameter is the address on which you would like to install the hook,
and num_arguments tells the hook function how many parameters the target
function takes. The func_entry_hook and func_exit_hook functions are
callback functions that define the code that will run when the hook is
hit (entry) and immediately after the hooked function is finished (exit).
The entry hooks are useful to see what parameters get passed to a
function, whereas the exit hooks are useful for trapping function return
values.
Your entry hook callback function must have a prototype like this:
def entry_hook( dbg, args ):
    # Hook code here
    return DBG_CONTINUE
The dbg parameter is the valid pydbg object that was used to set the
hook. The args parameter is a zero-based list of the parameters that were
trapped when the hook was hit.
The prototype of an exit hook callback function is slightly different in
that it also has a ret parameter, which is the return value of the
function (the value of EAX):
def exit_hook( dbg, args, ret ):
    # Hook code here
    return DBG_CONTINUE
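The callback contract can be tried out without a debugger. The following is a toy model in plain Python, not PyDbg itself: call_with_hooks and fake_write are hypothetical stand-ins for the hooking machinery and the hooked function, and DBG_CONTINUE is defined locally (with the same value as the Windows constant) only so that the sketch runs on its own.

```python
# Defined here so the sketch runs without PyDbg; same value as the Windows constant.
DBG_CONTINUE = 0x00010002

log = []

def entry_hook(dbg, args):
    # Entry hooks see the arguments as a zero-based list.
    log.append(("entry", list(args)))
    return DBG_CONTINUE

def exit_hook(dbg, args, ret):
    # Exit hooks additionally receive the return value.
    log.append(("exit", list(args), ret))
    return DBG_CONTINUE

def call_with_hooks(func, args, on_entry=None, on_exit=None, dbg=None):
    """Toy stand-in for a hooked call: run on_entry, then func, then on_exit."""
    if on_entry is not None:
        on_entry(dbg, args)
    ret = func(*args)
    if on_exit is not None:
        on_exit(dbg, args, ret)
    return ret

def fake_write(fd, buf, amount):
    # Hypothetical hooked function: pretends every byte was written.
    return amount

result = call_with_hooks(fake_write, [3, "username=test", 13], entry_hook, exit_hook)
```

Passing None for either callback skips it, which mirrors how a hook can be registered with only an entry or only an exit handler.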
To illustrate how to use an entry hook callback to sniff pre-encrypted traffic, open up a new Python file, name it firefox_hook.py, and punch out the following code.
from pydbg import *
from pydbg.defines import *

import utils
import sys

dbg = pydbg()
found_firefox = False

# Let's set a global pattern that we can make the hook
# search for
pattern = "password"

# This is our entry hook callback function
# the argument we are interested in is args[1]
def ssl_sniff( dbg, args ):

    # Now we read out the memory pointed to by the second argument
    # it is stored as an ASCII string, so we'll loop on a read until
    # we reach a NULL byte
    buffer = ""
    offset = 0

    while 1:
        byte = dbg.read_process_memory( args[1] + offset, 1 )

        if byte != "\x00":
            buffer += byte
            offset += 1
            continue
        else:
            break

    if pattern in buffer:
        print "Pre-Encrypted: %s" % buffer

    return DBG_CONTINUE

# Quick and dirty process enumeration to find firefox.exe
for (pid, name) in dbg.enumerate_processes():

    if name.lower() == "firefox.exe":

        found_firefox = True
        hooks = utils.hook_container()

        dbg.attach(pid)
        print "[*] Attaching to firefox.exe with PID: %d" % pid

        # Resolve the function address
        hook_address = dbg.func_resolve_debuggee("nspr4.dll","PR_Write")

        if hook_address:
            # Add the hook to the container. We aren't interested
            # in using an exit callback, so we set it to None.
            hooks.add( dbg, hook_address, 2, ssl_sniff, None )
            print "[*] nspr4.PR_Write hooked at: 0x%08x" % hook_address
            break
        else:
            print "[*] Error: Couldn't resolve hook address."
            sys.exit(-1)

if found_firefox:
    print "[*] Hooks set, continuing process."
    dbg.run()
else:
    print "[*] Error: Couldn't find the firefox.exe process."
    sys.exit(-1)
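The byte-at-a-time loop in ssl_sniff keeps reading until it finds a NULL byte, however far away that is. One possible refinement, sketched below, is to factor the loop into a helper with an upper bound on the length. The helper read_c_string, its max_len cap, and the FakeMemory class are all assumptions made for illustration; FakeMemory only stands in for a process so the sketch can run without a debugger.

```python
def read_c_string(read_memory, address, max_len=4096):
    """Read bytes starting at address until a NUL byte or max_len bytes.

    read_memory(address, length) is any callable returning `length` bytes;
    with PyDbg it could be dbg.read_process_memory.
    """
    chunks = []
    for offset in range(max_len):
        byte = read_memory(address + offset, 1)
        # Stop on the terminator, or if the reader returned nothing.
        if not byte or byte == b"\x00":
            break
        chunks.append(byte)
    return b"".join(chunks)

class FakeMemory(object):
    """Hypothetical stand-in for a process: a byte string mapped at a base address."""
    def __init__(self, base, data):
        self.base = base
        self.data = data

    def read(self, address, length):
        start = address - self.base
        return self.data[start:start + length]

mem = FakeMemory(0x1000, b"username=test&password=test\x00junk")
captured = read_c_string(mem.read, 0x1000)
```

Inside a hook callback the same helper could be called as read_c_string(dbg.read_process_memory, args[1]), and the pattern check would then run against the returned string.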
The code is fairly straightforward: It sets a hook on PR_Write, and
when the hook gets hit, we attempt to read out an ASCII string pointed
to by the second parameter. If it matches our search pattern, we output
it to the console. Start up a fresh instance of Firefox and run
firefox_hook.py from the command line. Retrace your steps and do the
login submission on https://www.openrce.org/, and you should see output
similar to that in Example 6-1.
Example 6-1. How cool is that! We can clearly see the username and password before they are encrypted.
[*] Attaching to firefox.exe with PID: 1344
[*] nspr4.PR_Write hooked at: 0x601a2760
[*] Hooks set, continuing process.
Pre-Encrypted: username=test&password=test&remember_me=on
Pre-Encrypted: username=test&password=test&remember_me=on
Pre-Encrypted: username=jms&password=yeahright!&remember_me=on
We have just demonstrated how soft hooks are both lightweight and powerful. This
technique can be applied to all kinds of debugging or reversing
scenarios. This particular scenario was well suited to the soft
hooking technique, but if we were to apply it to a
more performance-bound function call, we would very quickly see the
process slow to a crawl, begin to exhibit wacky behavior, and
possibly even crash. This is simply because every INT3 instruction
raises a breakpoint exception that has to be delivered to the debugger,
which then runs our hook code and finally returns control to the
target. That's a lot of work if it needs to happen
thousands of times per second! Let's see how we can work around this
limitation by applying a hard hook to instrument low-level heap
routines. Onward!
[33] See http://www.wireshark.org/.
[34] For the Firefox download, go to http://www.mozilla.com/en-US/.