A buffer overflow occurs when a program, while writing data to a buffer, overruns the buffer's allocated size and overwrites adjacent memory locations.
A buffer is a temporary area of memory allocated to a program to store and retrieve data as needed.
Buffer overflows have been known, and exploited, for a long time.
When exploiting a buffer overflow, our main goal is to overwrite control information, such as a saved return address, so that the program's flow of control changes and our own code gets to run.
Here is a diagram that will give us a basic idea of an overflow happening in a buffer:
The preceding diagram gives a simplified view of a program's stack memory. Because it is a stack, it fills from the bottom toward the top.
The diagram also shows that the program has a fixed-size buffer that holds 16 characters (16 bytes) of data.
We first enter 8 characters (1 char = 1 byte); on the right-hand side of the diagram, we can see that they have been written into the buffer in the program's memory.
Let's see what happens when we write 20 characters into the program:
We can see that the data is written correctly up to 16 characters, but the last 4 characters have gone past the end of the buffer and overwritten the value stored in the return address of the program. This is a classic buffer overflow.
Let's look at a live example, using the following sample code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    char buffer[5];

    if (argc < 2)
    {
        printf("strcpy() NOT executed....\n");
        printf("Syntax: %s <characters>\n", argv[0]);
        exit(0);
    }

    strcpy(buffer, argv[1]);
    printf("buffer content= %s\n", buffer);
    // you may want to try strcpy_s()
    printf("strcpy() executed...\n");
    return 0;
}
The preceding program simply takes an input at runtime and copies it into a variable called buffer. We can see that the size of the variable buffer is set to 5.
We now compile it using this command:
gcc program.c -o program
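We need to be careful here: depending on the distribution, gcc may enable protections such as the stack protector (stack canaries) by default. These detect many stack overflows at runtime and abort the program instead of letting the overwrite go unnoticed. To observe the classic behavior in a lab, the stack protector can be switched off with the -fno-stack-protector option.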
We run the program using this command:
./program 1234
We see that it has stored the data and we get the output. The four characters plus the terminating null byte fill the 5-byte buffer exactly.
Let's now run this:
./program 12345
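The string 12345 needs six bytes once the terminating null byte is counted, one more than buffer can hold, so strcpy() writes past the end of the buffer. We will see the program terminate abnormally, either with a segmentation fault or, when gcc's stack protector is active, with a "stack smashing detected" abort. The abort is the gcc security feature at work: it detects the overflow rather than preventing it.
We will learn more about the return address in the next recipe. For now, note that overwriting the return address with an address of our choosing makes the program deviate from its usual execution, which is what lets us exploit the vulnerability.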
Fuzzing is the easiest way to discover buffer overflows in a program. There are various fuzzers available in Kali, or we can write a custom script to make our own, depending on the type of program we have.
Once fuzzing has produced a crash, our next step is to debug the program to find exactly where it crashes and how we can use that to our advantage.
Again, there are multiple debuggers available. My personal favorite for Windows is Immunity Debugger (Immunity Inc.). Kali also ships with GDB, a command-line debugger.
Before we jump into more exciting topics, note that there are two main types of buffer overflows:
- Stack-based overflows
- Heap-based overflows
We will cover these in more detail later in the chapter. For now, let's clear up some basics that will help us in exploiting overflow vulnerabilities.