Fuzzing is simply a way to feed garbage to an application hoping for a crash, kind of like going to an all-you-can-eat buffet until you end up sick. The goal is to send data that is valid enough to be processed by the application, yet bad enough (format strings, oversized buffers) that the chance of a crash stays high.
In theory, fuzzing is the worst way of testing software because there is no way to achieve coverage with pseudorandom data in an acceptable finite time. But while fuzzing might not be the best tool for uncovering all bugs in software, it can be very useful for the reader because of one simple reason: it works for finding an exploitable bug, and it works fast.
I didn't believe it at first. After all, I have coded (in C and assembly/binary) and I know very well what security bugs look like. Why feed my software with garbage hoping for a crash? Well, the first time I saw a good fuzzer on the job, it was like magic. It worked. It found buffer overflows faster than I was able to look at them in the code.
Should you use a fuzzer or not? It's a tricky question. Just remember that it is not a silver bullet, and it can cost more time than it saves. All in all, though, fuzzing is a very useful tool as long as you are aware of its advantages and disadvantages.
A fuzzer is useful when you want to:
Find some exploitable bugs but not all of them.
Find bugs in multiple programs that use the same protocol.
Quickly test closed source software, without looking at the source or binary.
Automate testing easily.
Parallelize testing easily.
Use something that is protocol and file-type-dependent, but software-independent.
A fuzzer is not useful for:
Finding measures for assurance of software quality.
Achieving code coverage.
Achieving a quality metric.
Testing cases with multiple conditions (fuzzing usually affects only one variable at a time).
Testing nonatomic processing (where the application can't be reinitialized per file or session).
There are plenty of fuzzers available, some specific to a protocol, others very generic. The first thing to look for when choosing a fuzzer is the best ratio of bugs found to time spent writing the fuzzer. The number of bugs found depends mostly on two things: the ability of the fuzzer to comply with the protocol standard (ensuring that the information is processed and not discarded as invalid), and the quality of the fuzz string/value or payload (i.e., what we are injecting into the application). To comply with the protocol standard, the fuzzer must respect basic validity checks such as payload size, checksum, and hash. If these checks are not respected, the requests or other payloads are discarded as invalid by the application's first layer and exercise only a very small part of the code base. This is where a block-based fuzzer excels compared to other fuzzers, because of its ability to process subblocks of information and place the correct size value or checksum at the correct place.
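As a rough illustration of the block-based idea, the Python sketch below wraps a fuzz value in a frame whose length and checksum are recomputed from the value itself. The frame layout (4-byte length, payload, 4-byte CRC32) and the names build_frame and first_layer_accepts are assumptions made up for this example, not part of any real protocol or fuzzer.

```python
import struct
import zlib


def build_frame(payload: bytes) -> bytes:
    """Wrap a fuzz payload as: 4-byte length | payload | 4-byte CRC32.

    The layout is hypothetical; a real protocol defines its own fields.
    """
    length = struct.pack(">I", len(payload))
    checksum = struct.pack(">I", zlib.crc32(payload))
    return length + payload + checksum


def first_layer_accepts(frame: bytes) -> bool:
    """Toy stand-in for an application's first validity check."""
    if len(frame) < 8:
        return False
    (length,) = struct.unpack(">I", frame[:4])
    if len(frame) != 8 + length:
        return False  # declared size does not match the data received
    payload = frame[4:4 + length]
    (checksum,) = struct.unpack(">I", frame[4 + length:])
    return checksum == zlib.crc32(payload)


fuzz = b"A" * 1029  # oversized string used as the fuzz value

# A naive fuzzer swaps in the fuzz value but leaves stale size and checksum.
naive = struct.pack(">I", 16) + fuzz + b"\x00" * 4

print(first_layer_accepts(naive))              # discarded by the first layer
print(first_layer_accepts(build_frame(fuzz)))  # passes on to deeper code
```

Only the second frame gets past the toy validity check, which is the point: the fuzz value is the same in both cases, but the bookkeeping around it decides whether the interesting code ever sees it.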
Even the simplest protocols have size or checksum. For example, in the following payload, if the length of the string aaaaaaaaaaaa is not equal to the provided length in the header, most servers will discard the communication as invalid without even processing the content. It is critical to keep your data consistent if you want to reach the code underneath the first parsing layer.
POST / HTTP/1.1
Host: 192.168.248.133
User-Agent: Mozilla/5.0
Transfer-encoding: chunked
Keep-Alive: 300

12
aaaaaaaaaaaaaaaaaaaaaaaaaaaa
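To keep a request like this consistent while the body changes, the chunk size can be derived from the body instead of being typed by hand. The following is a minimal Python sketch; the function name chunked_request is an assumption for this example. Note that HTTP/1.1 writes chunk sizes in hexadecimal, so a size line of 12 announces 18 bytes of data.

```python
def chunked_request(host: str, body: bytes) -> bytes:
    """Build a POST request with one chunk whose size always matches the body."""
    head = (
        "POST / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Transfer-Encoding: chunked\r\n"
        "\r\n"
    ).encode("ascii")
    # Chunk size line is the body length in hexadecimal, followed by CRLF.
    chunk = f"{len(body):x}\r\n".encode("ascii") + body + b"\r\n"
    # A zero-length chunk terminates the chunked body.
    return head + chunk + b"0\r\n\r\n"


request = chunked_request("192.168.248.133", b"a" * 18)
print(request.decode("ascii"))
```

With this helper, swapping b"a" * 18 for any other fuzz value still yields a request whose declared and actual sizes agree.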
But the quality of the fuzz string/value is the real key. A good value triggers a bug that other test cases do not. Testing all the possible cases is just not an option, and the more tests you have, the more time the fuzzer needs to run. So for example, if you test all the cases for a string value from A to A repeated 65,535 times, it creates around 65,000 test cases. However, testing for only length values such as 16±5 (i.e., 11,12,13,14,15,16,17,18,19,20,21), 256±5, 1024±5, 2048±5, 4096±5, 16k±5, 32k±5, and 65k±5 generates 88 test cases (8 boundaries times 11 lengths each) and offers coverage for most of the boundary conditions, while being roughly 750 times faster. The same thing applies to the other tests. A good suite of test cases should be able to cover at least these varieties of bugs:
Buffer overflow
Format string
Directory traversal
Signed/unsigned value (e.g., negative payload size, huge value)
Cross-site scripting
Injection (e.g., command, SQL)
Integer overflow
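The boundary-length idea can be sketched in a few lines of Python. The sketch assumes that "16k", "32k", and "65k" mean 16,384, 32,768, and 65,536; the names boundary_lengths, overflow_strings, and SAMPLE_VALUES are made up for this example, and the sample values are only well-known illustrative strings for the categories listed above, not a complete test suite.

```python
CENTERS = [16, 256, 1024, 2048, 4096, 16 * 1024, 32 * 1024, 64 * 1024]


def boundary_lengths(centers=CENTERS, spread=5):
    """Return every length within +/- spread of each center, sorted, no duplicates."""
    return sorted({c + d for c in centers for d in range(-spread, spread + 1)})


def overflow_strings(centers=CENTERS, spread=5):
    """Yield one string of 'A' characters per boundary length."""
    for n in boundary_lengths(centers, spread):
        yield b"A" * n


# Illustrative values only; a real suite would carry many more per category.
SAMPLE_VALUES = {
    "format string": [b"%s%s%s%s", b"%n%n%n%n", b"%x" * 32],
    "directory traversal": [b"../../../../etc/passwd"],
    "signed/unsigned": [b"-1", b"2147483647", b"2147483648", b"4294967295"],
    "cross-site scripting": [b"<script>alert(1)</script>"],
    "injection": [b"' OR '1'='1", b"; id"],
    "integer overflow": [b"65536", b"4294967296"],
}

lengths = boundary_lengths()
print(len(lengths), lengths[:11])
```

Eight centers with a spread of five on each side give 8 × 11 = 88 lengths, against 65,535 for the exhaustive approach.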
Other interesting points to consider before choosing or writing a fuzzer are the ability to load external libraries for tasks such as calculating hashes and checksums or performing cryptography, and the ability to use multiple outputs such as network, file, and syscall.
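One way to picture these two abilities is a fuzzer core that knows nothing about how integrity values are computed or where the data goes. The Python sketch below is only an illustration under that assumption: the names CHECKSUMS, file_sink, tcp_sink, and emit are hypothetical, the checksum is simply appended to the payload, and a syscall output is left out.

```python
import hashlib
import socket
import struct
import zlib
from typing import Callable, Dict

# Integrity functions come from libraries and are looked up by name.
CHECKSUMS: Dict[str, Callable[[bytes], bytes]] = {
    "crc32": lambda data: struct.pack(">I", zlib.crc32(data)),
    "sha256": lambda data: hashlib.sha256(data).digest(),
}


def file_sink(path: str) -> Callable[[bytes], None]:
    """Return an output that writes each test case to a file."""
    def send(data: bytes) -> None:
        with open(path, "wb") as handle:
            handle.write(data)
    return send


def tcp_sink(host: str, port: int, timeout: float = 2.0) -> Callable[[bytes], None]:
    """Return an output that sends each test case over a new TCP connection."""
    def send(data: bytes) -> None:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(data)
    return send


def emit(payload: bytes, checksum: str, sink: Callable[[bytes], None]) -> None:
    """Append the chosen checksum to the payload and hand it to the output."""
    sink(payload + CHECKSUMS[checksum](payload))


captured = []
emit(b"A" * 21, "crc32", captured.append)  # any callable can act as an output
print(len(captured[0]))
```

Because an output is just a callable, the same test case can go to a file for a parser target or to a socket for a network target without touching the code that generates it.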