Process Snapshots

PyDbg comes stocked with a very cool feature called process snapshotting. Using process snapshotting you are able to freeze a process, obtain all of its memory, and resume the process. At any later point you can revert the process to the point where the snapshot was taken. This can be quite handy when reverse engineering a binary or analyzing a crash.

Our first step is to get an accurate picture of what the target process was up to at a precise moment. In order for the picture to be accurate, we need to first obtain all threads and their respective CPU contexts. As well, we need to obtain all of the process's memory pages and their contents. Once we have this information, it's just a matter of storing it for when we want to restore a snapshot.

Before we can take the process snapshots, we have to suspend all threads of execution so that they don't change data or state while the snapshot is being taken. To suspend all threads in PyDbg, we use suspend_all_threads(), and to resume all the threads, we use the aptly named resume_all_threads(). Once we have suspended the threads, we simply make a call to process_snapshot(). This automatically fetches all of the contextual information about each thread and all memory at that precise moment. Once the snapshot is finished, we resume all of the threads. When we want to restore the process to the snapshot point, we suspend all of the threads, call process_restore(), and resume all of the threads. Once we resume the process, we should be back at our original snapshot point. Pretty neat, eh?
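The two sequences described above (suspend, snapshot, resume and suspend, restore, resume) can be wrapped in a pair of small helpers. This is just a sketch of the call order; it assumes a PyDbg instance `dbg` that is already attached to the target:

```python
def take_snapshot(dbg):
    # Freeze every thread so memory and thread contexts can't
    # change while the snapshot is being taken
    dbg.suspend_all_threads()
    dbg.process_snapshot()
    dbg.resume_all_threads()

def restore_snapshot(dbg):
    # Freeze everything again before rolling the process back
    # to the previously captured snapshot point
    dbg.suspend_all_threads()
    dbg.process_restore()
    dbg.resume_all_threads()
```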

To try this out, let's use a simple example where we allow a user to hit a key to take a snapshot and hit a key again to restore the snapshot. Open a new Python file, call it snapshot.py, and enter the following code.

  from pydbg import *
  from pydbg.defines import *

  import threading
  import time
  import sys

  class snapshotter(object):

      def __init__(self,exe_path):

          self.exe_path     = exe_path
          self.pid          = None
          self.dbg          = None
          self.running      = True

          # Start the debugger thread, and loop until it sets the PID
          # of our target process
          pydbg_thread = threading.Thread(target=self.start_debugger)
          pydbg_thread.setDaemon(0)
          pydbg_thread.start()

          while self.pid == None:
              time.sleep(1)

          # We now have a PID and the target is running; let's get a
          # second thread running to do the snapshots
          monitor_thread = threading.Thread(target=self.monitor_debugger)
          monitor_thread.setDaemon(0)
          monitor_thread.start()

      def monitor_debugger(self):

          while self.running == True:

              input = raw_input("Enter: 'snap','restore' or 'quit'")
              input = input.lower().strip()

              if input == "quit":
                  print "[*] Exiting the snapshotter."
                  self.running = False
                  self.dbg.terminate_process()

              elif input == "snap":

                  print "[*] Suspending all threads."
                  self.dbg.suspend_all_threads()

                  print "[*] Obtaining snapshot."
                  self.dbg.process_snapshot()

                  print "[*] Resuming operation."
                  self.dbg.resume_all_threads()

              elif input == "restore":

                  print "[*] Suspending all threads."
                  self.dbg.suspend_all_threads()

                  print "[*] Restoring snapshot."
                  self.dbg.process_restore()

                  print "[*] Resuming operation."
                  self.dbg.resume_all_threads()

      def start_debugger(self):

          self.dbg = pydbg()
          pid = self.dbg.load(self.exe_path)
          self.pid = self.dbg.pid

          self.dbg.run()

  exe_path = "C:\\WINDOWS\\System32\\calc.exe"

  snapshotter(exe_path)

So the first step is to start the target application under a debugger thread. By using separate threads, we can enter snapshotting commands without forcing the target application to pause while it waits for our input. Once the debugger thread has returned a valid PID, we start up a new thread that will take our input. Then when we send it a command, it will evaluate whether we are taking a snapshot, restoring a snapshot, or quitting—pretty straightforward. The reason I picked Calculator as an example application is that we can actually see this snapshotting process in action. Enter a bunch of random math operations into the calculator, enter snap into our Python script, and then do some more math or hit the Clear button. Then simply type restore into our Python script, and you should see the numbers revert to our original snapshot point! Using this technique you can walk through and rewind certain parts of a process that are of interest without having to restart the process and get it to that exact state again. Now let's combine some of our new PyDbg techniques to create a fuzzing assistance tool that will help us find vulnerabilities in software applications and automate crash handling.
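Before moving on, note that the two-thread startup in snapshot.py boils down to a simple pattern: spawn the debugger thread, then poll a shared attribute until it publishes the PID. Here is a PyDbg-free sketch of just that pattern (`launch_target` is a hypothetical stand-in for `pydbg().load()`):

```python
import threading
import time

class skeleton(object):

    def __init__(self, launch_target):
        self.pid = None

        # Debugger thread: launches the target and publishes its PID.
        # Threads are non-daemon by default, matching snapshot.py's
        # setDaemon(0) calls.
        debugger = threading.Thread(target=self.start_debugger,
                                    args=(launch_target,))
        debugger.start()

        # The main thread blocks until the PID appears, just as
        # snapshot.py loops on self.pid before spawning its monitor
        while self.pid is None:
            time.sleep(0.01)

    def start_debugger(self, launch_target):
        # launch_target stands in for pydbg's load(); it returns a PID
        self.pid = launch_target()
        # a real debugger thread would now block inside dbg.run()
```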

Now that we have covered some of the most useful features of PyDbg, we will build a utility program to help root out (pun intended) exploitable flaws in software applications. Calls to certain functions, such as the classic C string routines, are especially prone to introducing buffer overflows, format string vulnerabilities, and memory corruption. We want to pay particular attention to these dangerous functions.

The tool will locate the dangerous function calls and track hits to those functions. When a function we deem dangerous gets called, we will dereference the top of the stack (the caller's return address and the first few parameters) and snapshot the process in case that call causes an overflow condition. If there is an access violation, our script will rewind the process to the last dangerous function hit. From there it single-steps the target application and disassembles each instruction until we either throw the access violation again or hit the maximum number of instructions we want to inspect. Anytime you see a hit on a dangerous function that matches data you have sent to the application, it is worth taking a look at whether you can manipulate that data to crash the application. This is the first step toward creating an exploit.
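The stack walk just described amounts to reading a DWORD every 4 bytes, from ESP up to ESP+20. A minimal, PyDbg-free sketch of that loop (`read_dword` is a hypothetical stand-in for PyDbg's `smart_dereference`):

```python
def walk_stack(esp, read_dword, max_offset=20):
    # Collect (offset, value) pairs for [ESP+0] through [ESP+max_offset].
    # [ESP+0] is the caller's return address; the rest are the first
    # few parameters of the dangerous call.
    values = []
    esp_offset = 0
    while esp_offset <= max_offset:
        values.append((esp_offset, read_dword(esp + esp_offset)))
        esp_offset += 4
    return values
```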

Warm up your coding fingers, open a new Python script called danger_track.py, and enter the following code.

from pydbg import *
from pydbg.defines import *

import utils

# This is the maximum number of instructions we will log
# after an access violation
MAX_INSTRUCTIONS = 10

# This is far from an exhaustive list; add more for bonus points
dangerous_functions = {
                        "strcpy"  :  "msvcrt.dll",
                        "strncpy" :  "msvcrt.dll",
                        "sprintf" :  "msvcrt.dll",
                        "vsprintf":  "msvcrt.dll"
                       }

dangerous_functions_resolved = {}
crash_encountered            = False
instruction_count            = 0

def danger_handler(dbg):

    # We want to print out the contents of the stack; that's about it
    # Generally there are only going to be a few parameters, so we will
    # take everything from ESP to ESP+20, which should give us enough
    # information to determine if we own any of the data
    esp_offset = 0
    print "[*] Hit %s" % dangerous_functions_resolved[dbg.context.Eip]
    print "================================================================="

    while esp_offset <= 20:
        parameter = dbg.smart_dereference(dbg.context.Esp + esp_offset)
        print "[ESP + %d] => %s" % (esp_offset, parameter)
        esp_offset += 4

    print "=================================================================\n"

    dbg.suspend_all_threads()
    dbg.process_snapshot()
    dbg.resume_all_threads()

    return DBG_CONTINUE

def access_violation_handler(dbg):
    global crash_encountered

    # Something bad happened, which means something good happened :)
    # Let's handle the access violation and then restore the process
    # back to the last dangerous function that was called

    if dbg.dbg.u.Exception.dwFirstChance:
        return DBG_EXCEPTION_NOT_HANDLED

    crash_bin = utils.crash_binning.crash_binning()
    crash_bin.record_crash(dbg)
    print crash_bin.crash_synopsis()

    if crash_encountered == False:
        dbg.suspend_all_threads()
        dbg.process_restore()
        crash_encountered = True

        # We flag each thread to single step
        for thread_id in dbg.enumerate_threads():

            print "[*] Setting single step for thread: 0x%08x" % thread_id
            h_thread = dbg.open_thread(thread_id)
            dbg.single_step(True, h_thread)
            dbg.close_handle(h_thread)

        # Now resume execution, which will pass control to our
        # single step handler
        dbg.resume_all_threads()

        return DBG_CONTINUE
    else:
        dbg.terminate_process()

    return DBG_EXCEPTION_NOT_HANDLED

def single_step_handler(dbg):
    global instruction_count
    global crash_encountered

    if crash_encountered:

        if instruction_count == MAX_INSTRUCTIONS:

            dbg.single_step(False)
            return DBG_CONTINUE
        else:

            # Disassemble this instruction
            instruction = dbg.disasm(dbg.context.Eip)
            print "#%d\t0x%08x : %s" % (instruction_count, dbg.context.Eip,
                                        instruction)
            instruction_count += 1
            dbg.single_step(True)

    return DBG_CONTINUE

dbg = pydbg()

pid = int(raw_input("Enter the PID you wish to monitor: "))
dbg.attach(pid)

# Track down all of the dangerous functions and set breakpoints
for func in dangerous_functions.keys():

    func_address = dbg.func_resolve( dangerous_functions[func],func )
    print "[*] Resolved breakpoint: %s -> 0x%08x" % ( func, func_address )
    dbg.bp_set( func_address, handler = danger_handler )
    dangerous_functions_resolved[func_address] = func

dbg.set_callback( EXCEPTION_ACCESS_VIOLATION, access_violation_handler )
dbg.set_callback( EXCEPTION_SINGLE_STEP, single_step_handler )
dbg.run()

There should be no big surprises in the preceding code block, as we have covered most of the concepts in our previous PyDbg endeavors. The best way to test the effectiveness of this script is to pick a software application that is known to have a vulnerability,[26] attach the script, and then send the required input to crash the application.

We have taken a solid tour of PyDbg and a subset of the features it provides. As you can see, the ability to script a debugger is extremely powerful and lends itself well to automation tasks. The only downside to this method is that for every piece of information you wish to obtain, you have to write code to do it. This is where our next tool, Immunity Debugger, bridges the gap between a scripted debugger and a graphical debugger you can interact with. Let's carry on.



[26] A classic stack-based overflow can be found in WarFTPD 1.65. You can still download this FTP server from http://support.jgaa.com/index.php?cmd=DownloadVersion&ID=1.