Now that we have a functional debugging core, it's time to add breakpoints. Using the information from Chapter 2, we will implement soft breakpoints, hardware breakpoints, and memory breakpoints. We will also develop special handlers for each type of breakpoint and show how to cleanly resume the process after a breakpoint has been hit.
In order to place soft breakpoints, we need to be able to read and write into a process's memory. This is done via the ReadProcessMemory()[16] and WriteProcessMemory()[17] functions. They have similar prototypes:
BOOL WINAPI ReadProcessMemory(
    HANDLE hProcess,
    LPCVOID lpBaseAddress,
    LPVOID lpBuffer,
    SIZE_T nSize,
    SIZE_T* lpNumberOfBytesRead
);

BOOL WINAPI WriteProcessMemory(
    HANDLE hProcess,
    LPCVOID lpBaseAddress,
    LPCVOID lpBuffer,
    SIZE_T nSize,
    SIZE_T* lpNumberOfBytesWritten
);
Both of these calls allow the debugger to inspect and alter the debuggee's memory. The parameters are straightforward; lpBaseAddress is the address where you wish to start reading or writing. The lpBuffer parameter is a pointer to the data that you are either reading or writing, and the nSize parameter is the total number of bytes you wish to read or write.
Using these two function calls, we can enable our debugger to use soft breakpoints quite easily. Let's modify our core debugging class to support the setting and handling of soft breakpoints.
...
class debugger():
    def __init__(self):
        self.h_process       = None
        self.pid             = None
        self.debugger_active = False
        self.h_thread        = None
        self.context         = None
        self.breakpoints     = {}
    ...
    def read_process_memory(self,address,length):
        data     = ""
        read_buf = create_string_buffer(length)
        count    = c_ulong(0)

        if not kernel32.ReadProcessMemory(self.h_process,
                                          address,
                                          read_buf,
                                          length,
                                          byref(count)):
            return False
        else:
            data += read_buf.raw
            return data

    def write_process_memory(self,address,data):
        count  = c_ulong(0)
        length = len(data)
        c_data = c_char_p(data[count.value:])

        if not kernel32.WriteProcessMemory(self.h_process,
                                           address,
                                           c_data,
                                           length,
                                           byref(count)):
            return False
        else:
            return True

    def bp_set(self,address):
        if not self.breakpoints.has_key(address):
            print "[*] Setting breakpoint at: 0x%08x" % address

            # store the original byte; the helper returns False
            # on failure instead of raising an exception
            original_byte = self.read_process_memory(address, 1)
            if original_byte is False:
                return False

            # write the INT3 opcode
            if not self.write_process_memory(address, "\xCC"):
                return False

            # register the breakpoint in our internal list
            self.breakpoints[address] = (address, original_byte)

        return True
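The save-patch-restore cycle behind a soft breakpoint can be tried without attaching to anything. The following is a minimal sketch, not the debugger's own code: FakeProcessMemory and bp_restore are hypothetical names introduced here, and a bytearray stands in for the debuggee's address space so the example runs on any platform.

```python
INT3 = b"\xCC"  # one-byte x86 breakpoint opcode


class FakeProcessMemory(object):
    """Hypothetical stand-in for a debuggee's address space."""

    def __init__(self, base, contents):
        self.base = base
        self.data = bytearray(contents)

    def read(self, address, length):
        start = address - self.base
        return bytes(self.data[start:start + length])

    def write(self, address, data):
        start = address - self.base
        self.data[start:start + len(data)] = data


def bp_set(memory, breakpoints, address):
    """Patch INT3 at address, remembering the byte it replaces."""
    if address not in breakpoints:
        breakpoints[address] = memory.read(address, 1)
        memory.write(address, INT3)
    return True


def bp_restore(memory, breakpoints, address):
    """Put the saved byte back so the original instruction can run."""
    original_byte = breakpoints.pop(address)
    memory.write(address, original_byte)


# The three bytes below are sample data standing in for real code.
memory = FakeProcessMemory(0x1000, b"\x8b\xff\x55")
breakpoints = {}

bp_set(memory, breakpoints, 0x1000)
patched = memory.read(0x1000, 3)

bp_restore(memory, breakpoints, 0x1000)
restored = memory.read(0x1000, 3)
```

Only the first byte changes while the breakpoint is armed, which is why storing a single original byte is enough to undo it.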
Now that we have support for soft breakpoints, we need to find a good place to put one. In general, breakpoints are set on a function call of some type; for the purpose of this exercise we will use our good friend printf() as the target function we wish to trap.
The Windows debugging API has given us a very clean method for determining the virtual address of a function in the form of GetProcAddress(),[18] which again is exported from kernel32.dll. The primary requirement of this function is a handle to the module (a .dll or .exe file) that contains the function we are interested in; we obtain this handle by using GetModuleHandle().[19] The function prototypes for GetProcAddress() and GetModuleHandle() look like this:
FARPROC WINAPI GetProcAddress(
    HMODULE hModule,
    LPCSTR lpProcName
);

HMODULE WINAPI GetModuleHandle(
    LPCSTR lpModuleName
);
This is a pretty straightforward chain of events: We obtain a handle to the module and then search for the address of the exported function we want. Let's add a helper function in our debugger to do just that. Again back to my_debugger.py.
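...
class debugger():
    ...
    def func_resolve(self,dll,function):
        # GetModuleHandleA() looks up a module that is already
        # loaded in the calling process
        handle  = kernel32.GetModuleHandleA(dll)
        address = kernel32.GetProcAddress(handle, function)

        # A module handle is not a kernel object handle, so it
        # must not be passed to CloseHandle()
        return address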
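A more defensive variant of the same lookup chain might look like the sketch below. It is an illustration under stated assumptions rather than the chapter's helper: it declares argument and return types so that handles and addresses are not truncated on 64-bit Python, uses the wide-character GetModuleHandleW, and returns None on any non-Windows platform, where these calls do not exist.

```python
import ctypes
import sys


def func_resolve(dll, function):
    """Return the address of an exported function, or None if unavailable."""
    if sys.platform != "win32":
        return None  # kernel32 only exists on Windows

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

    # Pointer-sized return types; ctypes would otherwise assume a C int.
    kernel32.GetModuleHandleW.argtypes = [ctypes.c_wchar_p]
    kernel32.GetModuleHandleW.restype = ctypes.c_void_p
    kernel32.GetProcAddress.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    kernel32.GetProcAddress.restype = ctypes.c_void_p

    # GetModuleHandleW only finds modules already loaded in this process.
    handle = kernel32.GetModuleHandleW(dll)
    if not handle:
        return None

    # GetProcAddress takes an ANSI name even in Unicode builds.
    return kernel32.GetProcAddress(handle, function.encode("ascii"))


result = func_resolve("msvcrt.dll", "printf")
```

Note that the lookup happens in the debugger's own process; the address is only meaningful in the debuggee if the module is mapped at the same base there.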
Now let's create a second test harness that will use printf() in a loop. We will resolve the function address and then set a soft breakpoint on it. After the breakpoint is hit, we should see some output, and then the process will continue its loop. Create a new Python script called printf_loop.py, and punch in the following code.
from ctypes import *
import time

msvcrt  = cdll.msvcrt
counter = 0

while 1:
    msvcrt.printf("Loop iteration %d!\n" % counter)
    time.sleep(2)
    counter += 1
Now let's update our test harness to attach to this process and to set a breakpoint on printf().
import my_debugger

debugger = my_debugger.debugger()

pid = raw_input("Enter the PID of the process to attach to: ")

debugger.attach(int(pid))

printf_address = debugger.func_resolve("msvcrt.dll","printf")

print "[*] Address of printf: 0x%08x" % printf_address

debugger.bp_set(printf_address)

debugger.run()
To test this, fire up printf_loop.py in a command-line console. Take note of the python.exe PID using Windows Task Manager. Now run your my_test.py script, and enter the PID. You should see the output shown in Example 3-3.
Example 3-3. Order of events for handling a soft breakpoint
Enter the PID of the process to attach to: 4048
[*] Address of printf: 0x77c4186a
[*] Setting breakpoint at: 0x77c4186a
Event Code: 3 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 6 Thread ID: 3148
Event Code: 2 Thread ID: 3620
Event Code: 1 Thread ID: 3620
[*] Exception address: 0x7c901230
[*] Hit the first breakpoint.
Event Code: 4 Thread ID: 3620
Event Code: 1 Thread ID: 3148
[*] Exception address: 0x77c4186a
[*] Hit user defined breakpoint.
We can first see that printf() resolves to 0x77c4186a, and so we set our breakpoint on that address. The first exception that is caught is the Windows-driven breakpoint, and when the second exception comes along, we see that the exception address is 0x77c4186a, the address of printf(). After the breakpoint is handled, the process should resume its loop. Our debugger now supports soft breakpoints, so let's move on to hardware breakpoints.
The second type of breakpoint is the hardware breakpoint, which involves setting certain bits in the CPU's debug registers. We covered this process extensively in the previous chapter, so let's get to the implementation details. The important thing to remember when managing hardware breakpoints is tracking which of the four available debug registers are free for use and which are already being used. We have to ensure that we are always using a slot that is empty, or we can run into problems where breakpoints aren't being hit where we expect them to.
Let's start by enumerating all of the threads in the process and obtaining a CPU context record for each of them. Using the retrieved context record, we then modify one of the registers between DR0 and DR3 (depending on which are free) to contain the desired breakpoint address. We then flip the appropriate bits in the DR7 register to enable the breakpoint and set its type and length.
Once we have created the routine to set the breakpoint, we need to modify our main debug event loop so that it can appropriately handle the exception that is thrown by a hardware breakpoint. We know that a hardware breakpoint triggers an INT1 (or single-step event), so we simply add another exception handler to our debug loop. Let's start with setting the breakpoint.
...
class debugger():
    def __init__(self):
        self.h_process            = None
        self.pid                  = None
        self.debugger_active      = False
        self.h_thread             = None
        self.context              = None
        self.breakpoints          = {}
        self.first_breakpoint     = True
        self.hardware_breakpoints = {}
    ...
    def bp_set_hw(self, address, length, condition):

        # Check for a valid length value
        if length not in (1, 2, 4):
            return False
        else:
            length -= 1

        # Check for a valid condition
        if condition not in (HW_ACCESS, HW_EXECUTE, HW_WRITE):
            return False

        # Check for available slots
        if not self.hardware_breakpoints.has_key(0):
            available = 0
        elif not self.hardware_breakpoints.has_key(1):
            available = 1
        elif not self.hardware_breakpoints.has_key(2):
            available = 2
        elif not self.hardware_breakpoints.has_key(3):
            available = 3
        else:
            return False

        # We want to set the debug register in every thread
        for thread_id in self.enumerate_threads():
            context = self.get_thread_context(thread_id=thread_id)

            # Enable the appropriate flag in the DR7
            # register to set the breakpoint
            context.Dr7 |= 1 << (available * 2)

            # Save the address of the breakpoint in the
            # free register that we found
            if available == 0:
                context.Dr0 = address
            elif available == 1:
                context.Dr1 = address
            elif available == 2:
                context.Dr2 = address
            elif available == 3:
                context.Dr3 = address

            # Set the breakpoint condition
            context.Dr7 |= condition << ((available * 4) + 16)

            # Set the length
            context.Dr7 |= length << ((available * 4) + 18)

            # Set thread context with the break set
            h_thread = self.open_thread(thread_id)
            kernel32.SetThreadContext(h_thread,byref(context))

        # update the internal hardware breakpoint array at the used
        # slot index.
        self.hardware_breakpoints[available] = (address,length,condition)

        return True
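The DR7 arithmetic is easier to check in isolation. The sketch below is a standalone illustration: dr7_enable is a hypothetical helper, and the numeric values given to HW_EXECUTE, HW_WRITE, and HW_ACCESS are assumptions that follow the Intel encoding of the R/W field (00 execute, 01 write, 11 read or write), since the chapter does not list them. Unlike the listing above, it also clears any stale condition and length bits for the slot before setting new ones.

```python
# Assumed values, following the Intel R/W field encoding in DR7.
HW_EXECUTE = 0x0
HW_WRITE = 0x1
HW_ACCESS = 0x3


def dr7_enable(dr7, slot, length, condition):
    """Return a DR7 value with the given slot (0-3) armed."""
    if slot not in (0, 1, 2, 3):
        raise ValueError("slot must be 0-3")
    if length not in (1, 2, 4):
        raise ValueError("length must be 1, 2 or 4 bytes")
    if condition not in (HW_EXECUTE, HW_WRITE, HW_ACCESS):
        raise ValueError("unknown condition")

    # Clear the four condition/length bits that belong to this slot.
    dr7 &= ~(0xF << (slot * 4 + 16))

    dr7 |= 1 << (slot * 2)                  # local enable bit
    dr7 |= condition << (slot * 4 + 16)     # R/W field
    dr7 |= (length - 1) << (slot * 4 + 18)  # LEN field (0, 1 or 3)
    return dr7


execute_bp = dr7_enable(0, 0, 1, HW_EXECUTE)
write_bp = dr7_enable(0, 1, 4, HW_WRITE)
```

For slot 1 with a four-byte write breakpoint, the enable bit is bit 2, the condition lands in bits 20-21, and the length in bits 22-23.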
You can see that we select an open slot to store the breakpoint by checking the global hardware_breakpoints dictionary. Once we have obtained a free slot, we then assign the breakpoint address to the slot and update the DR7 register with the appropriate flags that will enable the breakpoint. Now that we have the mechanism to support setting the breakpoints, let's update our event loop and add an exception handler to support the INT1 interrupt.
...
class debugger():
    ...
    def get_debug_event(self):
        if self.exception == EXCEPTION_ACCESS_VIOLATION:
            print "Access Violation Detected."
        elif self.exception == EXCEPTION_BREAKPOINT:
            continue_status = self.exception_handler_breakpoint()
        elif self.exception == EXCEPTION_GUARD_PAGE:
            print "Guard Page Access Detected."
        elif self.exception == EXCEPTION_SINGLE_STEP:
            # keep the handler's verdict so the loop can pass it on
            continue_status = self.exception_handler_single_step()
    ...
    def exception_handler_single_step(self):
        # Comment from PyDbg:
        # determine if this single step event occurred in reaction to a
        # hardware breakpoint and grab the hit breakpoint.
        # according to the Intel docs, we should be able to check for
        # the BS flag in Dr6. but it appears that Windows
        # isn't properly propagating that flag down to us.
        if self.context.Dr6 & 0x1 and self.hardware_breakpoints.has_key(0):
            slot = 0
        elif self.context.Dr6 & 0x2 and self.hardware_breakpoints.has_key(1):
            slot = 1
        elif self.context.Dr6 & 0x4 and self.hardware_breakpoints.has_key(2):
            slot = 2
        elif self.context.Dr6 & 0x8 and self.hardware_breakpoints.has_key(3):
            slot = 3
        else:
            # This wasn't an INT1 generated by a hw breakpoint,
            # so there is no slot to remove
            return DBG_EXCEPTION_NOT_HANDLED

        # Now let's remove the breakpoint from the list
        continue_status = DBG_EXCEPTION_NOT_HANDLED
        if self.bp_del_hw(slot):
            continue_status = DBG_CONTINUE
            print "[*] Hardware breakpoint removed."

        return continue_status

    def bp_del_hw(self,slot):

        # Disable the breakpoint for all active threads
        for thread_id in self.enumerate_threads():
            context = self.get_thread_context(thread_id=thread_id)

            # Reset the flags to remove the breakpoint
            context.Dr7 &= ~(1 << (slot * 2))

            # Zero out the address
            if slot == 0:
                context.Dr0 = 0x00000000
            elif slot == 1:
                context.Dr1 = 0x00000000
            elif slot == 2:
                context.Dr2 = 0x00000000
            elif slot == 3:
                context.Dr3 = 0x00000000

            # Remove the condition flag
            context.Dr7 &= ~(3 << ((slot * 4) + 16))

            # Remove the length flag
            context.Dr7 &= ~(3 << ((slot * 4) + 18))

            # Reset the thread's context with the breakpoint removed
            h_thread = self.open_thread(thread_id)
            kernel32.SetThreadContext(h_thread,byref(context))

        # remove the breakpoint from the internal list.
        del self.hardware_breakpoints[slot]

        return True
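The two decisions the handler makes, which slot fired and what DR7 looks like once that slot is disarmed, can be sketched as plain functions. This is an illustrative reduction, not the debugger's code: hit_slot and dr7_disable are hypothetical names, and the dictionary contents are placeholder values.

```python
def hit_slot(dr6, hardware_breakpoints):
    """Return the lowest slot whose DR6 status bit is set and that we own."""
    for slot in range(4):
        # Bits 0-3 of DR6 report which of DR0-DR3 triggered.
        if dr6 & (1 << slot) and slot in hardware_breakpoints:
            return slot
    return None  # not caused by one of our hardware breakpoints


def dr7_disable(dr7, slot):
    """Return DR7 with the slot's enable, condition, and length bits cleared."""
    dr7 &= ~(1 << (slot * 2))
    dr7 &= ~(0xF << (slot * 4 + 16))
    return dr7


# Placeholder registry: slot -> (address, length, condition)
registered = {2: (0x77C4186A, 0, 0)}

fired = hit_slot(0x4, registered)
unowned = hit_slot(0x1, registered)
```

Returning None for an unowned slot lets the caller report the exception as not handled instead of trying to delete a breakpoint that was never registered.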
This process is fairly straightforward; when an INT1 is fired we check the status bits in DR6 to see which debug register triggered it and whether we have a hardware breakpoint registered in that slot. If the debugger finds a match, it zeros out the flags in DR7 and resets the debug register that contains the breakpoint address. Let's see this process in action by modifying our my_test.py script to use hardware breakpoints on our printf() call.
import my_debugger
from my_debugger_defines import *

debugger = my_debugger.debugger()

pid = raw_input("Enter the PID of the process to attach to: ")

debugger.attach(int(pid))

printf = debugger.func_resolve("msvcrt.dll","printf")
print "[*] Address of printf: 0x%08x" % printf

debugger.bp_set_hw(printf,1,HW_EXECUTE)
debugger.run()
This harness simply sets a breakpoint on the printf() call whenever it gets executed. The length of the breakpoint is only a single byte. You will notice that in this harness we imported the my_debugger_defines.py file; this is so we can access the HW_EXECUTE constant, which provides a little code clarity. When you run the script you should see output similar to Example 3-4.
Example 3-4. Order of events for handling a hardware breakpoint
Enter the PID of the process to attach to: 2504
[*] Address of printf: 0x77c4186a
Event Code: 3 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 6 Thread ID: 3704
Event Code: 2 Thread ID: 2228
Event Code: 1 Thread ID: 2228
[*] Exception address: 0x7c901230
[*] Hit the first breakpoint.
Event Code: 4 Thread ID: 2228
Event Code: 1 Thread ID: 3704
[*] Hardware breakpoint removed.
You can see from the order of events that an exception gets thrown, and our handler removes the breakpoint. The loop should continue to execute after the handler is finished. Now that we have support for soft and hardware breakpoints, let's wrap up our lightweight debugger with memory breakpoints.
The final feature that we are going to implement is the memory breakpoint. First, we are simply going to query a section of memory to determine where its base address is (where the page starts in virtual memory). Once we have determined the page size, we will set the permissions of that page so that it acts as a guard page. When the CPU attempts to access this memory, an EXCEPTION_GUARD_PAGE exception will be thrown. Using a specific handler for this exception, we revert to the original page permissions and continue execution.
In order for us to properly calculate the size of the page we are manipulating, we have to first query the operating system itself to retrieve the default page size. This is done by executing the GetSystemInfo()[20] function, which populates a SYSTEM_INFO[21] structure. This structure contains a dwPageSize member, which gives us the correct page size for the system. We will implement this first step when our debugger() class is first instantiated.
...
class debugger():
    def __init__(self):
        self.h_process            = None
        self.pid                  = None
        self.debugger_active      = False
        self.h_thread             = None
        self.context              = None
        self.breakpoints          = {}
        self.first_breakpoint     = True
        self.hardware_breakpoints = {}

        # Here let's determine and store
        # the default page size for the system
        system_info = SYSTEM_INFO()
        kernel32.GetSystemInfo(byref(system_info))
        self.page_size = system_info.dwPageSize
    ...
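The listing relies on a SYSTEM_INFO definition that is not shown here. One possible ctypes layout is sketched below, with the leading dwOemId union flattened into its two WORD members, which occupy the same bytes. The get_page_size helper and the mmap.PAGESIZE fallback are assumptions added only so the sketch also runs outside Windows.

```python
import ctypes
import mmap
import sys


class SYSTEM_INFO(ctypes.Structure):
    # The dwOemId union is written out as its two WORD members.
    _fields_ = [
        ("wProcessorArchitecture", ctypes.c_uint16),
        ("wReserved", ctypes.c_uint16),
        ("dwPageSize", ctypes.c_uint32),
        ("lpMinimumApplicationAddress", ctypes.c_void_p),
        ("lpMaximumApplicationAddress", ctypes.c_void_p),
        ("dwActiveProcessorMask", ctypes.c_size_t),
        ("dwNumberOfProcessors", ctypes.c_uint32),
        ("dwProcessorType", ctypes.c_uint32),
        ("dwAllocationGranularity", ctypes.c_uint32),
        ("wProcessorLevel", ctypes.c_uint16),
        ("wProcessorRevision", ctypes.c_uint16),
    ]


def get_page_size():
    """Return the system page size in bytes."""
    if sys.platform == "win32":
        system_info = SYSTEM_INFO()
        ctypes.windll.kernel32.GetSystemInfo(ctypes.byref(system_info))
        return system_info.dwPageSize
    # Fallback for other platforms, not part of the Win32 approach.
    return mmap.PAGESIZE


page_size = get_page_size()
```

On most x86 systems this yields 4096, but the value should always be queried rather than hard-coded.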
Now that we have captured the default page size, we are ready to begin querying and manipulating page permissions. The first step is to query the page that contains the address of the memory breakpoint we wish to set. This is done by using the VirtualQueryEx()[22] function call, which populates a MEMORY_BASIC_INFORMATION[23] structure with the characteristics of the memory page we queried. Following are the definitions for both the function and the resulting structure:
SIZE_T WINAPI VirtualQueryEx(
    HANDLE hProcess,
    LPCVOID lpAddress,
    PMEMORY_BASIC_INFORMATION lpBuffer,
    SIZE_T dwLength
);

typedef struct _MEMORY_BASIC_INFORMATION {
    PVOID BaseAddress;
    PVOID AllocationBase;
    DWORD AllocationProtect;
    SIZE_T RegionSize;
    DWORD State;
    DWORD Protect;
    DWORD Type;
} MEMORY_BASIC_INFORMATION, *PMEMORY_BASIC_INFORMATION;
Once the structure has been populated, we will use the BaseAddress value as the starting point to begin setting the page permission. The function that actually sets the permission is VirtualProtectEx(),[24] which has the following prototype:
BOOL WINAPI VirtualProtectEx(
    HANDLE hProcess,
    LPVOID lpAddress,
    SIZE_T dwSize,
    DWORD flNewProtect,
    PDWORD lpflOldProtect
);
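The flNewProtect value for a guard page is simply the page's existing protection with the PAGE_GUARD modifier added. A small sketch of that flag arithmetic follows; the constants are the documented Win32 values, while with_guard and without_guard are hypothetical helpers introduced for illustration.

```python
# Win32 memory protection constants
PAGE_NOACCESS = 0x01
PAGE_READWRITE = 0x04
PAGE_EXECUTE_READ = 0x20
PAGE_GUARD = 0x100


def with_guard(protect):
    """Return the protection value with the guard modifier added."""
    # The low byte holds the base protection; the guard modifier
    # cannot be combined with PAGE_NOACCESS.
    if protect & 0xFF == PAGE_NOACCESS:
        raise ValueError("PAGE_GUARD cannot be combined with PAGE_NOACCESS")
    return protect | PAGE_GUARD


def without_guard(protect):
    """Return the protection value with the guard modifier removed."""
    return protect & ~PAGE_GUARD


guarded = with_guard(PAGE_READWRITE)
plain = without_guard(guarded)
```

Keeping the base protection intact means the page behaves exactly as before once the guard modifier is gone.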
So let's get down to code. We are going to create a global list of guard pages that we have explicitly set as well as a global list of memory breakpoint addresses that our exception handler will use when the EXCEPTION_GUARD_PAGE exception gets thrown. Then we set the permissions on the address and surrounding memory pages (if the address straddles two or more memory pages).
...
class debugger():
    def __init__(self):
        ...
        self.guarded_pages      = []
        self.memory_breakpoints = {}
    ...
    def bp_set_mem (self, address, size):
        mbi = MEMORY_BASIC_INFORMATION()

        # If our VirtualQueryEx() call doesn't return
        # a full-sized MEMORY_BASIC_INFORMATION
        # then return False
        if kernel32.VirtualQueryEx(self.h_process,
                                   address,
                                   byref(mbi),
                                   sizeof(mbi)) < sizeof(mbi):
            return False

        current_page = mbi.BaseAddress

        # We will set the permissions on all pages that are
        # affected by our memory breakpoint. The last affected
        # byte is address + size - 1, so a page that starts at
        # address + size is not part of the breakpoint.
        while current_page < address + size:

            # Add the page to the list; this will
            # differentiate our guarded pages from those
            # that were set by the OS or the debuggee process
            self.guarded_pages.append(current_page)

            # Guard exactly one page per iteration
            old_protection = c_ulong(0)
            if not kernel32.VirtualProtectEx(self.h_process,
                                             current_page,
                                             self.page_size,
                                             mbi.Protect | PAGE_GUARD,
                                             byref(old_protection)):
                return False

            # Increase our range by the size of the
            # default system memory page size
            current_page += self.page_size

        # Add the memory breakpoint to our global list
        self.memory_breakpoints[address] = (address, size, mbi)

        return True
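Which pages a memory breakpoint touches comes down to simple arithmetic on the address, the size, and the page size. The sketch below isolates that calculation; pages_covering is a hypothetical helper, and it rounds the address down itself instead of relying on the BaseAddress reported by VirtualQueryEx().

```python
def pages_covering(address, size, page_size):
    """Return the base address of every page touched by [address, address + size)."""
    if size <= 0:
        return []

    first_page = address - (address % page_size)  # round down to a page boundary
    last_byte = address + size - 1                # last byte inside the breakpoint

    pages = []
    current_page = first_page
    while current_page <= last_byte:
        pages.append(current_page)
        current_page += page_size
    return pages


straddling = pages_covering(0x1FFE, 4, 0x1000)
whole_page = pages_covering(0x1000, 0x1000, 0x1000)
```

A four-byte breakpoint two bytes before a page boundary straddles two pages, while a breakpoint that exactly fills one page touches only that page.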
Now you have the ability to set a memory breakpoint. If you try it out in its current state by using our printf() looper, you should get output that simply says Guard Page Access Detected. The nice thing is that when a guard page is accessed and the exception is thrown, the operating system actually removes the protection on that page of memory and allows you to continue execution. This saves you from creating a specific handler to deal with it; however, you could build logic into the existing debug loop to perform certain actions when the breakpoint is hit, such as restoring the breakpoint, reading memory at the location where the breakpoint is set, pouring you a fresh coffee, or whatever you please.
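One piece of logic such a handler would likely need is deciding whether a guard-page fault actually falls inside one of our breakpoints, since a whole page is guarded even when the breakpoint covers only a few bytes. The following is a hedged sketch of that decision only: classify_guard_hit and its return labels are hypothetical, and it assumes the (address, size, mbi) tuples stored in memory_breakpoints by the listing above.

```python
def classify_guard_hit(fault_address, guarded_pages, memory_breakpoints, page_size):
    """Decide what a guard-page fault at fault_address means for the debugger."""
    page = fault_address - (fault_address % page_size)

    # Guard pages we did not set belong to the OS or the debuggee.
    if page not in guarded_pages:
        return ("foreign", None)

    # Each entry is (address, size, mbi), as stored by bp_set_mem().
    for bp_address, (_, size, _) in memory_breakpoints.items():
        if bp_address <= fault_address < bp_address + size:
            return ("breakpoint", bp_address)

    # Our page, but outside the breakpoint range: the guard flag
    # would need to be re-armed to keep the breakpoint alive.
    return ("same_page", page)


guarded_pages = [0x1000]
memory_breakpoints = {0x1010: (0x1010, 4, None)}

inside = classify_guard_hit(0x1012, guarded_pages, memory_breakpoints, 0x1000)
nearby = classify_guard_hit(0x1800, guarded_pages, memory_breakpoints, 0x1000)
elsewhere = classify_guard_hit(0x3000, guarded_pages, memory_breakpoints, 0x1000)
```

Because the operating system clears the guard flag after the first fault, a "same_page" result is the case where a handler would have to set the flag again if the breakpoint is meant to persist.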
[16] See MSDN ReadProcessMemory Function (http://msdn2.microsoft.com/en-us/library/ms680553.aspx).
[17] See MSDN WriteProcessMemory Function (http://msdn2.microsoft.com/en-us/library/ms681674.aspx).
[18] See MSDN GetProcAddress Function (http://msdn2.microsoft.com/en-us/library/ms683212.aspx).
[19] See MSDN GetModuleHandle Function (http://msdn2.microsoft.com/en-us/library/ms683199.aspx).
[20] See MSDN GetSystemInfo Function (http://msdn2.microsoft.com/en-us/library/ms724381.aspx).
[21] See MSDN SYSTEM_INFO Structure (http://msdn2.microsoft.com/en-us/library/ms724958.aspx).
[22] See MSDN VirtualQueryEx Function (http://msdn2.microsoft.com/en-us/library/aa366907.aspx).
[23] See MSDN MEMORY_BASIC_INFORMATION Structure (http://msdn2.microsoft.com/en-us/library/aa366775.aspx).
[24] See MSDN VirtualProtectEx Function (http://msdn.microsoft.com/en-us/library/aa366899(vs.85).aspx).