IDAPython is fully IDC compliant, which means that any function call IDC[55] supports can also be used from IDAPython. We will briefly cover some of the functions that you will commonly use when writing IDAPython scripts. These should provide a solid foundation for developing your own scripts. The IDC language supports well over 100 function calls, so this is far from an exhaustive list, and you are encouraged to explore the rest in depth at your leisure.
The following are a couple of utility functions that will come in handy in a lot of your IDAPython scripts:
Returns the address at which your cursor is currently positioned on the IDA screen. This gives your script a known starting point.
Returns the MD5 hash of the binary you have loaded in IDA, which is useful for tracking whether a binary has changed from version to version.
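As a rough sketch of how these two utilities might be combined, the helper below builds a one-line banner from the cursor address and the input file's MD5. The names are assumptions: the function names do not appear in the text above, so we assume the two descriptions correspond to the classic `ScreenEA()` and `GetInputFileMD5()` calls, and `session_banner` is a hypothetical helper. It takes both calls as plain callables so that it also runs outside IDA, and it is written for Python 3.

```python
def session_banner(get_cursor_ea, get_input_md5):
    """Format the cursor address and input-file MD5 as one log line.

    Both arguments are zero-argument callables. Inside IDA you would pass
    the real API functions; outside IDA any stand-in works.
    """
    ea = get_cursor_ea()
    md5 = get_input_md5()
    # Assumption: some IDAPython versions may hand back the digest as raw
    # bytes rather than a hex string, so normalise to hex text here.
    if isinstance(md5, (bytes, bytearray)):
        md5 = bytes(md5).hex()
    return "cursor=0x%08X md5=%s" % (ea, md5)
```

Inside IDA with the legacy names, this might be called as `session_banner(ScreenEA, GetInputFileMD5)`. Logging the banner at the top of a script's output makes it easy to tell later which version of a binary the results came from.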
A binary in IDA is broken down into segments, with each segment having a specific class (CODE, DATA, BSS, STACK, CONST, or XTRN). The following functions provide a way to obtain information about the segments that are contained within the binary:
Returns the starting address of the first segment in the binary.
Returns the starting address of the next segment in the binary or BADADDR if there are no more segments.
Returns the starting address of the segment based on the segment name. For instance, calling it with .text as a parameter will return the starting address of the code segment for the binary.
Returns the end of a segment based on an address contained within that segment.
Returns the start of a segment based on an address contained within that segment.
Returns the name of the segment based on any address within that segment.
Returns a list of starting addresses for all of the segments in the target binary.
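The first/next pattern described above can be sketched as a simple loop that stops when BADADDR comes back. This is a minimal sketch under stated assumptions: the function names are missing from the text, so we assume the legacy names `FirstSeg()`, `NextSeg()`, `SegName()`, and `SegEnd()`. `walk_segments` is a hypothetical helper that receives those calls as parameters so that the loop can be run outside IDA with stand-ins.

```python
def walk_segments(first_seg, next_seg, seg_name, seg_end, badaddr):
    """Return (name, start, end) for every segment, in iteration order.

    Start at the first segment and keep asking for the next one until
    the BADADDR sentinel is returned.
    """
    segments = []
    ea = first_seg()
    while ea != badaddr:
        segments.append((seg_name(ea), ea, seg_end(ea)))
        ea = next_seg(ea)
    return segments


if __name__ == "__main__":
    # Hypothetical two-segment layout standing in for a loaded binary.
    BADADDR = 0xFFFFFFFF
    layout = {0x401000: (".text", 0x405000), 0x405000: (".data", 0x406000)}
    starts = sorted(layout)

    def fake_next(ea):
        later = [s for s in starts if s > ea]
        return later[0] if later else BADADDR

    for name, start, end in walk_segments(
        lambda: starts[0], fake_next,
        lambda ea: layout[ea][0], lambda ea: layout[ea][1], BADADDR,
    ):
        print("%-8s 0x%08X - 0x%08X" % (name, start, end))
```

Inside IDA with the legacy API, this might be invoked as `walk_segments(FirstSeg, NextSeg, SegName, SegEnd, BADADDR)`. That call assumes your IDAPython version exposes those names.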
Iterating over all the functions in a binary and determining function boundaries are tasks that you will encounter frequently when scripting. The following routines are useful when dealing with functions inside a target binary:
Returns a list of all function start addresses contained between StartAddress and EndAddress.
Returns a list of function chunks, the contiguous address ranges that make up a function. Each list item is a tuple of ( chunk start, chunk end ), which shows the beginning and end points of each chunk.
Converts an address within a function to a string that shows the function name and the byte offset into the function.
Given an address, returns the name of the function the address belongs to.
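To illustrate how these routines fit together, the sketch below builds a small inventory of functions with their names and total sizes. It rests on assumptions: the function names are not shown in the text, so we assume the legacy `Functions()`, `Chunks()`, and `GetFunctionName()` calls. We also assume chunk end addresses are exclusive, so that `end - start` gives a chunk's size. `function_inventory` is a hypothetical helper that takes the API calls as parameters.

```python
def function_inventory(functions, func_name, chunks, start_ea, end_ea):
    """Map each function start between start_ea and end_ea to (name, size).

    functions(start, end) yields function start addresses, func_name(ea)
    names the function containing ea, and chunks(ea) yields
    (chunk start, chunk end) tuples for that function.
    """
    inventory = {}
    for ea in functions(start_ea, end_ea):
        # Sum every chunk so that functions split across several
        # address ranges are sized correctly.
        size = sum(end - start for start, end in chunks(ea))
        inventory[ea] = (func_name(ea), size)
    return inventory
```

Inside IDA, one plausible call with the legacy names is `function_inventory(Functions, GetFunctionName, Chunks, SegStart(ScreenEA()), SegEnd(ScreenEA()))`, which would cover the segment under the cursor.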
Finding code and data cross-references inside a binary is extremely useful when determining data flow and possible code paths to interesting portions of a target binary. IDAPython has a host of functions for finding the various kinds of cross-references. The most commonly used ones are covered here.
Returns a list of code references to the given address. The boolean Flow flag tells IDAPython whether or not to follow normal code flow when determining the cross-references.
Returns a list of code references from the given address.
Returns a list of data references to the given address. Useful for tracking global variable usage inside the target binary.
Returns a list of data references from the given address.
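A common use of code cross-references is finding out which functions reach a particular address. The sketch below groups the references to a target by the function they sit in. The names are assumptions: we assume the legacy `CodeRefsTo( Address, Flow )` and `GetFunctionName()` calls, and `callers_by_function` is a hypothetical helper that takes them as parameters.

```python
def callers_by_function(code_refs_to, func_name, target_ea, flow=False):
    """Group the code references to target_ea by their containing function.

    code_refs_to(ea, flow) yields referencing addresses and func_name(ea)
    returns the name of the function containing ea, or an empty value
    when the address is not inside any function.
    """
    grouped = {}
    for ref in code_refs_to(target_ea, flow):
        name = func_name(ref) or "<no function>"
        grouped.setdefault(name, []).append(ref)
    return grouped
```

Leaving `flow` as False restricts the result to explicit jumps and calls. Passing True also counts ordinary fall-through from the preceding instruction, which is usually noise when you are hunting for callers.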
One very cool feature that IDAPython supports is the ability to define a debugger hook within IDA and set up event handlers for the various debugging events that may occur. Although IDA is not commonly used for debugging tasks, there are times when it is easier to simply fire up the native IDA debugger than to switch to another tool. We will use one of these debugger hooks later on when creating a simple code coverage tool. To set up a debugger hook, you first define a hook class that derives from DBG_Hooks and then define the various event handlers within this class. We'll use the following class as an example:
    from idaapi import DBG_Hooks

    class DbgHook(DBG_Hooks):

        # Event handler for when the process starts
        def dbg_process_start(self, pid, tid, ea, name, base, size):
            return

        # Event handler for process exit
        def dbg_process_exit(self, pid, tid, ea, code):
            return

        # Event handler for when a shared library gets loaded
        def dbg_library_load(self, pid, tid, ea, name, base, size):
            return

        # Breakpoint handler
        def dbg_bpt(self, tid, ea):
            return
This class contains some common debug event handlers that you can use when creating simple debugging scripts in IDA. To install your debugger hook, use the following code:
    debugger = DbgHook()
    debugger.hook()
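Building on the hook class above, here is a minimal sketch of a handler that does something with the events it receives: it counts how many times each breakpoint address is hit. `HitCounterHook` is a hypothetical name. The fallback `DBG_Hooks` stand-in exists only so that the sketch can be exercised outside IDA, and returning 0 from the handler is an assumption (the handlers in the text simply return).

```python
from collections import Counter

try:
    from idaapi import DBG_Hooks  # available only inside IDA
except ImportError:
    class DBG_Hooks(object):
        """Stand-in base class so the sketch runs outside IDA."""

        def hook(self):
            return True

        def unhook(self):
            return True


class HitCounterHook(DBG_Hooks):
    """Hypothetical hook that tallies breakpoint hits per address."""

    def __init__(self):
        DBG_Hooks.__init__(self)
        self.hits = Counter()

    # Breakpoint handler: record the address that was hit
    def dbg_bpt(self, tid, ea):
        self.hits[ea] += 1
        return 0


if __name__ == "__main__":
    # Simulate two breakpoint events instead of running a real debugger.
    counter = HitCounterHook()
    counter.dbg_bpt(1, 0x401000)
    counter.dbg_bpt(1, 0x401000)
    for ea, count in counter.hits.items():
        print("0x%08X hit %d time(s)" % (ea, count))
```

Installing it works the same way as for DbgHook: create an instance and call its hook() method.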
Now run the debugger, and your hook will catch all of the debugging events, allowing you to have a very high level of control over IDA's debugger. Here are a handful of helper functions that you can use during a debugging run: