Debugging symbols provide limited information from the source code to help understand assembly code. The symbols provided by Microsoft contain names for certain functions and variables.
A symbol in this context is simply a name for a particular memory
address. Most symbols name addresses that represent functions, but some name
addresses that represent data. For example, without symbol information, the function
at address 8050f1a2 will not be labeled. If you have symbol information configured, WinDbg will tell
you that the function is named MmCreateProcessAddressSpace
(assuming that was the name of the function at that address). With just an address, you
wouldn’t know much about a function, but the name tells you that this function creates the address
space for a process. You can also use the symbol name to find functions and data in memory.
The format for referring to a symbol in WinDbg is as follows:
moduleName!symbolName
This syntax can be used anywhere an address is normally expected. The moduleName is the name of the .exe, .dll, or .sys file that contains the symbol, without the extension, and the symbolName is the name associated with the address. However, ntoskrnl.exe is a special case: its module name is nt, not ntoskrnl. For example, if you want to look at the disassembly of the NtCreateProcess function in ntoskrnl.exe, you would use the disassemble command u (which stands for unassemble) with the parameter nt!NtCreateProcess. If you don’t specify a module name, WinDbg will search through all of the loaded modules for a matching symbol. This can take a long time because it must load and search symbols for every module.
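The naming rule above can be sketched in a few lines of Python. This is only an illustration of the moduleName!symbolName convention, not part of WinDbg; the helper names (module_name, qualified_symbol, split_symbol) are hypothetical, and the alias table contains only the ntoskrnl case mentioned in the text.

```python
from pathlib import PureWindowsPath

# Cases where the module name differs from the file name without extension.
# Only the ntoskrnl -> nt mapping comes from the text; the table is illustrative.
MODULE_ALIASES = {"ntoskrnl": "nt"}


def module_name(image_file: str) -> str:
    """Return the module name to use for an image file such as Beep.sys."""
    stem = PureWindowsPath(image_file).stem  # drop .exe / .dll / .sys
    return MODULE_ALIASES.get(stem.lower(), stem)


def qualified_symbol(image_file: str, symbol: str) -> str:
    """Build a moduleName!symbolName expression."""
    return f"{module_name(image_file)}!{symbol}"


def split_symbol(expression: str):
    """Split an expression into (module, symbol); module is None if omitted."""
    module, sep, symbol = expression.rpartition("!")
    return (module if sep else None, symbol)
```

For example, qualified_symbol("ntoskrnl.exe", "NtCreateProcess") produces the string nt!NtCreateProcess, which is the parameter you would pass to the u command.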
The bu command allows you to use symbols to set a deferred breakpoint on code that isn’t yet loaded. A deferred breakpoint is a breakpoint that will be set when a module is loaded that matches a specified name. For example, the command bu newModule!exportedFunction will instruct WinDbg to set a breakpoint on exportedFunction as soon as a module is loaded with the name newModule. When analyzing kernel modules, it is particularly useful to combine this with the $iment operator, which evaluates to the entry point of a given module. The command bu $iment(driverName) will set a breakpoint on the entry point of a driver before any of the driver’s code has a chance to run.
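The idea of a deferred breakpoint can be modeled with a small Python sketch. This is a toy simulation of the behavior described above, not how WinDbg is implemented; the class name, its methods, and the address used in the usage note are all hypothetical.

```python
class DeferredBreakpoints:
    """Toy model of breakpoints that are resolved when a module loads."""

    def __init__(self):
        self.pending = []   # "module!symbol" strings waiting for a module load
        self.resolved = {}  # "module!symbol" -> address once the module is loaded

    def bu(self, expression: str) -> None:
        """Record a breakpoint on a symbol whose module may not be loaded yet."""
        self.pending.append(expression)

    def on_module_load(self, module: str, symbols: dict) -> None:
        """Resolve every pending breakpoint that names the newly loaded module."""
        still_pending = []
        for expression in self.pending:
            mod, _, sym = expression.partition("!")
            if mod.lower() == module.lower() and sym in symbols:
                self.resolved[expression] = symbols[sym]
            else:
                still_pending.append(expression)
        self.pending = still_pending
```

In this model, calling bu("newModule!exportedFunction") leaves the breakpoint pending until on_module_load("newModule", {"exportedFunction": 0xF7ADB66C}) is called, at which point it moves to resolved with the (made-up) address supplied by the load event.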
The x command allows you to search for functions or symbols using wildcards. For example, if you’re looking for kernel functions that perform process creation, you can search for any function within ntoskrnl.exe whose name includes the string CreateProcess. The command x nt!*CreateProcess* will display exported functions as well as internal functions. The following is the output for x nt!*CreateProcess*:
0:003> x nt!*CreateProcess*
805c736a nt!NtCreateProcessEx = <no type information>
805c7420 nt!NtCreateProcess = <no type information>
805c6a8c nt!PspCreateProcess = <no type information>
804fe144 nt!ZwCreateProcess = <no type information>
804fe158 nt!ZwCreateProcessEx = <no type information>
8055a300 nt!PspCreateProcessNotifyRoutineCount = <no type information>
805c5e0a nt!PsSetCreateProcessNotifyRoutine = <no type information>
8050f1a2 nt!MmCreateProcessAddressSpace = <no type information>
8055a2e0 nt!PspCreateProcessNotifyRoutine = <no type information>
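To make the wildcard matching concrete, here is a minimal Python sketch that filters the symbols from the listing above. It uses the standard fnmatch module as a stand-in for WinDbg’s wildcard syntax and assumes case-insensitive matching; the function name examine is hypothetical.

```python
from fnmatch import fnmatchcase

# Symbol table copied from the x nt!*CreateProcess* listing.
SYMBOLS = {
    "nt!NtCreateProcessEx": 0x805C736A,
    "nt!NtCreateProcess": 0x805C7420,
    "nt!PspCreateProcess": 0x805C6A8C,
    "nt!ZwCreateProcess": 0x804FE144,
    "nt!ZwCreateProcessEx": 0x804FE158,
    "nt!PspCreateProcessNotifyRoutineCount": 0x8055A300,
    "nt!PsSetCreateProcessNotifyRoutine": 0x805C5E0A,
    "nt!MmCreateProcessAddressSpace": 0x8050F1A2,
    "nt!PspCreateProcessNotifyRoutine": 0x8055A2E0,
}


def examine(pattern, symbols=SYMBOLS):
    """Return (address, name) pairs whose name matches the wildcard pattern."""
    wanted = pattern.lower()  # assumption: matching ignores case
    hits = [
        (address, name)
        for name, address in symbols.items()
        if fnmatchcase(name.lower(), wanted)
    ]
    return sorted(hits)  # ordered by address
```

With this table, examine("nt!Zw*") returns only the two Zw entries, while examine("nt!*CreateProcess*") returns all nine symbols.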
Another useful command is the ln command, which will list the closest symbol for a given memory address. This can be used to determine which function a pointer refers to. For example, let’s say we see a call instruction that targets address 0x805717aa and we want to know the purpose of the code at that address. We could issue the following command:
0:002> ln 805717aa
❶ (805717aa) nt!NtReadFile | (80571d38) nt!NtReadFileScatter
Exact matches:
❷ nt!NtReadFile = <no type information>
The first line ❶ shows the two closest matches, and the last line ❷ shows the exact match. Only the first line is displayed if there is no exact match.
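The nearest-symbol lookup can be pictured as a search over a table sorted by address. The following Python sketch is an assumption about how such a lookup might work, not a description of WinDbg’s implementation; the function name ln_lookup is hypothetical, and the table contains only four symbols taken from the outputs shown in this section, whereas a real symbol table has many more entries.

```python
from bisect import bisect_right

# (address, name) pairs taken from the outputs in this section, sorted by address.
SYMBOLS = sorted([
    (0x805717AA, "nt!NtReadFile"),
    (0x80571D38, "nt!NtReadFileScatter"),
    (0x805C736A, "nt!NtCreateProcessEx"),
    (0x805C7420, "nt!NtCreateProcess"),
])
ADDRESSES = [address for address, _ in SYMBOLS]


def ln_lookup(address):
    """Return (symbol at or below, next symbol above, exact-match flag)."""
    index = bisect_right(ADDRESSES, address)
    before = SYMBOLS[index - 1] if index > 0 else None
    after = SYMBOLS[index] if index < len(SYMBOLS) else None
    exact = before is not None and before[0] == address
    return before, after, exact
```

Looking up 0x805717aa yields nt!NtReadFile and nt!NtReadFileScatter as the two neighbors with the exact-match flag set, mirroring the two parts of the output above. An address a few bytes into the function returns the same neighbors with the flag cleared.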
The Microsoft symbols also include type information for many structures, including internal types that are not documented elsewhere. This is useful for a malware analyst, since malware often manipulates undocumented structures. Example 10-2 shows the first few lines of a driver object structure, which stores information about a kernel driver.
Example 10-2. Viewing type information for a structure
kd> dt nt!_DRIVER_OBJECT
+0x000 Type : Int2B
+0x002 Size : Int2B
+0x004 DeviceObject : Ptr32 _DEVICE_OBJECT
+0x008 Flags : Uint4B
❶ +0x00c DriverStart : Ptr32 Void
+0x010 DriverSize : Uint4B
+0x014 DriverSection : Ptr32 Void
+0x018 DriverExtension : Ptr32 _DRIVER_EXTENSION
+0x01c DriverName : _UNICODE_STRING
+0x024 HardwareDatabase : Ptr32 _UNICODE_STRING
+0x028 FastIoDispatch : Ptr32 _FAST_IO_DISPATCH
+0x02c DriverInit : Ptr32 long
+0x030 DriverStartIo : Ptr32 void
+0x034 DriverUnload : Ptr32 void
+0x038 MajorFunction : [28] Ptr32 long
The field names hint at what data each field stores. For example, at offset 0x00c ❶ there is a pointer that reveals where the driver is loaded in memory.
WinDbg allows you to overlay data onto the structure. Let’s say that we know there is a driver object at address 828b2648, and we want to show the structure along with each of its values for that particular driver. Example 10-3 shows how to accomplish this.
Example 10-3. Overlaying data onto a structure
kd> dt nt!_DRIVER_OBJECT 828b2648
+0x000 Type : 4
+0x002 Size : 168
+0x004 DeviceObject : 0x828b0a30 _DEVICE_OBJECT
+0x008 Flags : 0x12
+0x00c DriverStart : 0xf7adb000
+0x010 DriverSize : 0x1080
+0x014 DriverSection : 0x82ad8d78
+0x018 DriverExtension : 0x828b26f0 _DRIVER_EXTENSION
+0x01c DriverName : _UNICODE_STRING "\Driver\Beep"
+0x024 HardwareDatabase : 0x80670ae0 _UNICODE_STRING "\REGISTRY\MACHINE\HARDWARE\DESCRIPTION\SYSTEM"
+0x028 FastIoDispatch : (null)
+0x02c DriverInit : ❶0xf7adb66c long Beep!DriverEntry+0
+0x030 DriverStartIo : 0xf7adb51a void Beep!BeepStartIo+0
+0x034 DriverUnload : 0xf7adb620 void Beep!BeepUnload+0
+0x038 MajorFunction : [28] 0xf7adb46a long Beep!BeepOpen+0
This is the beep driver, which is built into Windows to make a beeping noise when something is wrong. We can see that the initialization function that is called when the driver is loaded is located at address 0xf7adb66c ❶. If this were a malicious driver, we would want to see what code was located at that address because that code is always called first when the driver is loaded. The initialization function is the only function called every time a driver is loaded. Malware will sometimes place its entire malicious payload in this function.
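Overlaying a structure onto raw memory can also be illustrated outside the debugger. The sketch below declares the 32-bit _DRIVER_OBJECT layout from Example 10-2 with Python’s ctypes and overlays it onto a byte buffer. Pointers are modeled as 32-bit integers, the _UNICODE_STRING fields (Length, MaximumLength, Buffer) follow the standard Windows definition, and the sample buffer fills in only a few values from Example 10-3; the helper name sample_driver_object is hypothetical.

```python
import ctypes
import struct


class UNICODE_STRING(ctypes.LittleEndianStructure):
    _fields_ = [
        ("Length", ctypes.c_uint16),
        ("MaximumLength", ctypes.c_uint16),
        ("Buffer", ctypes.c_uint32),  # Ptr32 modeled as a 32-bit integer
    ]


class DRIVER_OBJECT(ctypes.LittleEndianStructure):
    # Field order and types follow the dt nt!_DRIVER_OBJECT listing.
    _fields_ = [
        ("Type", ctypes.c_int16),
        ("Size", ctypes.c_int16),
        ("DeviceObject", ctypes.c_uint32),
        ("Flags", ctypes.c_uint32),
        ("DriverStart", ctypes.c_uint32),
        ("DriverSize", ctypes.c_uint32),
        ("DriverSection", ctypes.c_uint32),
        ("DriverExtension", ctypes.c_uint32),
        ("DriverName", UNICODE_STRING),
        ("HardwareDatabase", ctypes.c_uint32),
        ("FastIoDispatch", ctypes.c_uint32),
        ("DriverInit", ctypes.c_uint32),
        ("DriverStartIo", ctypes.c_uint32),
        ("DriverUnload", ctypes.c_uint32),
        ("MajorFunction", ctypes.c_uint32 * 28),
    ]


def sample_driver_object() -> DRIVER_OBJECT:
    """Overlay the structure onto a buffer holding a few Example 10-3 values."""
    raw = bytearray(ctypes.sizeof(DRIVER_OBJECT))
    struct.pack_into("<hh", raw, 0x000, 4, 168)     # Type, Size
    struct.pack_into("<I", raw, 0x00C, 0xF7ADB000)  # DriverStart
    struct.pack_into("<I", raw, 0x02C, 0xF7ADB66C)  # DriverInit
    return DRIVER_OBJECT.from_buffer_copy(bytes(raw))
```

The declared layout is 168 bytes long, which agrees with the Size field in Example 10-3. Subtracting DriverStart from DriverInit in the sample gives 0x66c, the offset of the initialization function within the loaded driver image.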
Symbols are specific to the version of the files being analyzed, and can change with every update or hotfix. When configured properly, WinDbg will query Microsoft’s server and automatically get the correct symbols for the files that are currently being debugged. You can set the symbol file path by selecting File ▶ Symbol File Path. To configure WinDbg to use the online symbol server, enter the following path:
SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols
The SRV keyword configures a server, the path c:\websymbols is a local cache for symbol information, and the URL is the fixed location of the Microsoft symbol server.
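The three parts of that path are separated by asterisks, which a short Python sketch can make explicit. It handles only the three-part SRV*cache*server form shown above, not other symbol path variants, and the function names are hypothetical.

```python
MICROSOFT_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def build_symbol_path(cache_dir: str, server: str = MICROSOFT_SYMBOL_SERVER) -> str:
    """Compose a symbol path of the form SRV*cache*server."""
    return f"SRV*{cache_dir}*{server}"


def parse_symbol_path(path: str) -> dict:
    """Split a SRV*cache*server path into its components."""
    parts = path.split("*")
    if len(parts) != 3 or parts[0].upper() != "SRV":
        raise ValueError("expected a path of the form SRV*cache*server")
    return {"cache": parts[1], "server": parts[2]}
```

Calling build_symbol_path(r"c:\websymbols") reproduces the path given above, and parse_symbol_path splits it back into the local cache directory and the server URL.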
If you’re debugging on a machine that is not continuously connected to the Internet, you can manually download the symbols from Microsoft. Download the symbols specific to the OS, service pack, and architecture that you are using. The symbol files are usually a couple hundred megabytes because they contain the symbol information for all the different hotfix and patch versions for that OS and service pack.