Once you have found a memory corruption vulnerability, you can use a variety of techniques to gain control over the instruction pointer register of the vulnerable process. One of these techniques, called GOT overwrite, works by manipulating an entry in the so-called Global Offset Table (GOT) of an Executable and Linkable Format (ELF)[90] object to gain control over the instruction pointer. Since this technique relies on the ELF file format, it works only on platforms supporting this format (such as Linux, Solaris, or BSD).
The GOT is located in an ELF-internal data section called .got. Its purpose is to redirect position-independent address calculations to an absolute location, so it stores the absolute location of function-call symbols used in dynamically linked code. When a program calls a library function for the first time, the runtime link editor (rtld) locates the appropriate symbol and writes its address into the GOT. Every subsequent call to that function passes control directly to that location, so rtld isn’t called for that function anymore. Example A-4 illustrates this process.
Example A-4. Example code used to demonstrate the function of the Global Offset Table (got.c)
01 #include <stdio.h>
02
03 int
04 main (void)
05 {
06     int i = 16;
07
08     printf ("%d\n", i);
09     printf ("%x\n", i);
10
11     return 0;
12 }
The program in Example A-4 calls the printf() library function two times. I compiled the program with debugging symbols and started it in the debugger (see Section B.4 for a description of the following debugger commands):
linux$ gcc -g -o got got.c
linux$ gdb -q ./got
(gdb) set disassembly-flavor intel
(gdb) disassemble main
Dump of assembler code for function main:
0x080483c4 <main+0>: push ebp
0x080483c5 <main+1>: mov ebp,esp
0x080483c7 <main+3>: and esp,0xfffffff0
0x080483ca <main+6>: sub esp,0x20
0x080483cd <main+9>: mov DWORD PTR [esp+0x1c],0x10
0x080483d5 <main+17>: mov eax,0x80484d0
0x080483da <main+22>: mov edx,DWORD PTR [esp+0x1c]
0x080483de <main+26>: mov DWORD PTR [esp+0x4],edx
0x080483e2 <main+30>: mov DWORD PTR [esp],eax
0x080483e5 <main+33>: call 0x80482fc <printf@plt>
0x080483ea <main+38>: mov eax,0x80484d4
0x080483ef <main+43>: mov edx,DWORD PTR [esp+0x1c]
0x080483f3 <main+47>: mov DWORD PTR [esp+0x4],edx
0x080483f7 <main+51>: mov DWORD PTR [esp],eax
0x080483fa <main+54>: call 0x80482fc <printf@plt>
0x080483ff <main+59>: mov eax,0x0
0x08048404 <main+64>: leave
0x08048405 <main+65>: ret
End of assembler dump.
The disassembly of the main() function shows the address of printf() in the Procedure Linkage Table (PLT). Much as the GOT redirects position-independent address calculations to absolute locations, the PLT redirects position-independent function calls to absolute locations.
(gdb) x/1i 0x80482fc
0x80482fc <printf@plt>: jmp DWORD PTR ds:0x80495d8
The PLT entry jumps immediately into the GOT:
(gdb) x/1x 0x80495d8
0x80495d8 <_GLOBAL_OFFSET_TABLE_+20>: 0x08048302
If the library function hasn’t been called before, the GOT entry points back into the PLT, to the instruction right after the initial jump. There, a relocation offset gets pushed onto the stack, and execution is redirected to the first entry of the PLT (0x80482cc in this example), which hands control to rtld to locate the referenced printf() symbol.
(gdb) x/2i 0x08048302
0x8048302 <printf@plt+6>: push 0x10
0x8048307 <printf@plt+11>: jmp 0x80482cc
Now let’s see what happens if printf() gets called a second time. First, I defined a breakpoint just before the second call to printf():
(gdb) list 0
1 #include <stdio.h>
2
3 int
4 main (void)
5 {
6     int i = 16;
7
8     printf ("%d\n", i);
9     printf ("%x\n", i);
10
(gdb) break 9
Breakpoint 1 at 0x80483ea: file got.c, line 9.
I then started the program:
(gdb) run
Starting program: /home/tk/BHD/got
16
Breakpoint 1, main () at got.c:9
9 printf ("%x\n", i);
After the breakpoint triggered, I disassembled the main() function again to see if the same PLT address was called:
(gdb) disassemble main
Dump of assembler code for function main:
0x080483c4 <main+0>: push ebp
0x080483c5 <main+1>: mov ebp,esp
0x080483c7 <main+3>: and esp,0xfffffff0
0x080483ca <main+6>: sub esp,0x20
0x080483cd <main+9>: mov DWORD PTR [esp+0x1c],0x10
0x080483d5 <main+17>: mov eax,0x80484d0
0x080483da <main+22>: mov edx,DWORD PTR [esp+0x1c]
0x080483de <main+26>: mov DWORD PTR [esp+0x4],edx
0x080483e2 <main+30>: mov DWORD PTR [esp],eax
0x080483e5 <main+33>: call 0x80482fc <printf@plt>
0x080483ea <main+38>: mov eax,0x80484d4
0x080483ef <main+43>: mov edx,DWORD PTR [esp+0x1c]
0x080483f3 <main+47>: mov DWORD PTR [esp+0x4],edx
0x080483f7 <main+51>: mov DWORD PTR [esp],eax
0x080483fa <main+54>: call 0x80482fc <printf@plt>
0x080483ff <main+59>: mov eax,0x0
0x08048404 <main+64>: leave
0x08048405 <main+65>: ret
End of assembler dump.
The same address in the PLT was indeed called:
(gdb) x/1i 0x80482fc
0x80482fc <printf@plt>: jmp DWORD PTR ds:0x80495d8
The called PLT entry jumps immediately into the GOT again:
(gdb) x/1x 0x80495d8
0x80495d8 <_GLOBAL_OFFSET_TABLE_+20>: 0xb7ed21c0
But this time, the GOT entry of printf() has changed: It now points directly to the printf() library function in libc.
(gdb) x/10i 0xb7ed21c0
0xb7ed21c0 <printf>: push ebp
0xb7ed21c1 <printf+1>: mov ebp,esp
0xb7ed21c3 <printf+3>: push ebx
0xb7ed21c4 <printf+4>: call 0xb7ea1aaf
0xb7ed21c9 <printf+9>: add ebx,0xfae2b
0xb7ed21cf <printf+15>: sub esp,0xc
0xb7ed21d2 <printf+18>: lea eax,[ebp+0xc]
0xb7ed21d5 <printf+21>: mov DWORD PTR [esp+0x8],eax
0xb7ed21d9 <printf+25>: mov eax,DWORD PTR [ebp+0x8]
0xb7ed21dc <printf+28>: mov DWORD PTR [esp+0x4],eax
Now if we change the value of the GOT entry for printf(), it’s possible to control the execution flow of the program when printf() is called:
(gdb) set variable *(0x80495d8)=0x41414141
(gdb) x/1x 0x80495d8
0x80495d8 <_GLOBAL_OFFSET_TABLE_+20>: 0x41414141
(gdb) continue
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) info registers eip
eip 0x41414141 0x41414141
We have achieved EIP control. For a real-life example of this exploitation technique, see Chapter 4.
To determine the GOT address of a library function, you can either use the debugger, as in the previous example, or you can use the objdump or readelf command:
linux$ objdump -R got
got: file format elf32-i386
DYNAMIC RELOCATION RECORDS
OFFSET   TYPE              VALUE
080495c0 R_386_GLOB_DAT    __gmon_start__
080495d0 R_386_JUMP_SLOT   __gmon_start__
080495d4 R_386_JUMP_SLOT   __libc_start_main
080495d8 R_386_JUMP_SLOT   printf
linux$ readelf -r got
Relocation section '.rel.dyn' at offset 0x27c contains 1 entries:
Offset   Info     Type            Sym.Value Sym. Name
080495c0 00000106 R_386_GLOB_DAT  00000000  __gmon_start__
Relocation section '.rel.plt' at offset 0x284 contains 3 entries:
Offset   Info     Type            Sym.Value Sym. Name
080495d0 00000107 R_386_JUMP_SLOT 00000000  __gmon_start__
080495d4 00000207 R_386_JUMP_SLOT 00000000  __libc_start_main
080495d8 00000307 R_386_JUMP_SLOT 00000000  printf
[90] For a description of ELF, see TIS Committee, Tool Interface Standard (TIS) Executable and Linking Format (ELF) Specification, Version 1.2, 1995, at http://refspecs.freestandards.org/elf/elf.pdf.