A.3 Type Conversions in C

The C programming language is quite flexible in handling different data types. For example, in C it’s easy to convert a character array into a signed integer. There are two types of conversion: implicit and explicit. Implicit type conversion occurs when the compiler automatically converts a value to a different type. This usually happens when the type of a value doesn’t match the type that an assignment or operation expects. Implicit type conversions are also referred to as coercion.

Explicit type conversion, also known as casting, occurs when the programmer explicitly codes the details of the conversion. This is usually done with the cast operator.

Here is an example of an implicit type conversion (coercion):

[..]
unsigned int user_input = 0x80000000;
signed int   length     = user_input;
[..]

In this example, an implicit conversion occurs between unsigned int and signed int.
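To make the effect of this coercion visible, here is a minimal sketch. The helper names to_signed() and fits_in_int() are hypothetical and not part of the original example. Strictly speaking, the result of converting an out-of-range value to a signed int is implementation-defined; the sketch assumes a typical platform with a 32-bit, two's complement int.

```c
#include <limits.h>

/* Hypothetical helper: the initialization performs the implicit
 * unsigned int -> signed int conversion shown above. */
int
to_signed (unsigned int value)
{
    int result = value;   /* implicit conversion (coercion) */
    return result;
}

/* Hypothetical helper: returns 1 if the value can be stored in a
 * signed int without changing its numeric value, 0 otherwise. */
int
fits_in_int (unsigned int value)
{
    return value <= (unsigned int) INT_MAX;
}
```

Under the stated assumptions, to_signed (0x80000000) yields INT_MIN (−2147483648), and fits_in_int (0x80000000) returns 0 because the value is one larger than INT_MAX.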

And here is an example of an explicit type conversion (casting):

[..]
char       cbuf[] = "AAAA";
signed int si     = *(int *)cbuf;
[..]

In this example, the cast operator explicitly converts the char pointer to an int pointer, so the first four bytes of the character array are read as a signed int.
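The pointer cast in the example above is compact, but reading a char array through an int pointer can run into alignment and aliasing problems on some compilers and platforms. A common alternative is to copy the bytes with memcpy(). The following is only a sketch: the helper name bytes_to_int() is hypothetical, and it assumes the buffer holds at least sizeof (int) bytes.

```c
#include <string.h>

/* Hypothetical helper: reinterprets the first sizeof (int) bytes of
 * buf as an int by copying them, without dereferencing a cast pointer.
 * Assumes buf points to at least sizeof (int) readable bytes. */
int
bytes_to_int (const char *buf)
{
    int value = 0;

    memcpy (&value, buf, sizeof (value));
    return value;
}
```

With a 4-byte int, bytes_to_int ("AAAA") returns 0x41414141 regardless of byte order, since all four bytes are 0x41.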

Type conversions can be very subtle and cause a lot of security bugs. Many of the vulnerabilities related to type conversion are the result of conversions between unsigned and signed integers. Below is an example:

Example A-3. A signed/unsigned conversion that leads to a vulnerability (implicit.c)

01    #include <stdio.h>
02
03    unsigned int
04    get_user_length (void)
05    {
06        return (0xffffffff);
07    }
08
09    int
10    main (void)
11    {
12        signed int length = 0;
13
14        length = get_user_length ();
15
16        printf ("length: %d %u (0x%x)\n", length, length, length);
17
18        if (length < 12)
19            printf ("argument length ok\n");
20        else
21            printf ("Error: argument length too long\n");
22
23        return 0;
24    }

The source code in Example A-3 contains a signed/unsigned conversion vulnerability that is quite similar to the one I found in FFmpeg (see Chapter 4). Can you spot the bug?

In line 14, a length value is read in from user input and stored in the signed int variable length. The get_user_length() function is a dummy that always returns the “user input value” 0xffffffff. Let’s assume this is the value that was read from the network or from a data file. In line 18, the program checks whether the user-supplied value is less than 12. If it is, the string “argument length ok” will be printed on the screen. Since length gets assigned the value 0xffffffff and this value is much bigger than 12, it may seem obvious that the string will not be printed. However, let’s see what happens if we compile and run the program under Windows Vista SP2:

C:\Users\tk\BHD>cl /nologo implicit.c
implicit.c

C:\Users\tk\BHD>implicit.exe
length: -1 4294967295 (0xffffffff)
argument length ok

As you can see from the output, line 19 was reached and executed. How did this happen?

On a platform where int is 32 bits wide, an unsigned int has a range of 0 to 4294967295 and a signed int has a range of −2147483648 to 2147483647. The unsigned int value 0xffffffff (4294967295) is represented in binary as 1111 1111 1111 1111 1111 1111 1111 1111 (see Figure A-3). If you interpret the same bit pattern as a signed int, the sign changes and the result is the signed int value −1. In the two’s complement representation used by virtually all current platforms, the sign of a number is indicated by the Most Significant Bit (MSB). If the MSB is 0, the number is zero or positive, and if it is set to 1, the number is negative.


Figure A-3. The role of the Most Significant Bit (MSB)
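The role of the MSB can also be checked in code. The following sketch uses a hypothetical helper, msb_is_set(), and assumes that unsigned int has no padding bits, which holds on common platforms.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if the Most Significant Bit of value
 * is set, 0 otherwise. The width is derived from the type instead of
 * hard-coding 32 bits. */
int
msb_is_set (unsigned int value)
{
    const unsigned int width = sizeof (value) * CHAR_BIT;

    return (int) ((value >> (width - 1)) & 1u);
}
```

With a 32-bit unsigned int, msb_is_set (0xffffffff) returns 1 and msb_is_set (0x7fffffff) returns 0, so the first value becomes negative when interpreted as a signed int and the second does not.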

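To summarize: If an unsigned int is converted to a signed int, the bit pattern isn’t changed on common two’s complement platforms, but the value is interpreted in the context of the new type. (Strictly speaking, the C standard leaves the result implementation-defined when the value doesn’t fit into the signed type.) If the unsigned int value is in the range 0x80000000 to 0xffffffff, the resulting signed int will be negative (see Figure A-4).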
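One way to avoid this class of bug is to keep the length in an unsigned type all the way to the check, or, if a signed type must be used, to reject negative values explicitly. The following is a minimal sketch under those assumptions. The helper names and the MAX_LENGTH macro are hypothetical; the bound of 12 is taken from Example A-3.

```c
#define MAX_LENGTH 12   /* same bound as the check in Example A-3 */

/* Hypothetical helper: the length stays unsigned, so 0xffffffff is
 * compared as 4294967295 and fails the check. */
int
length_ok_unsigned (unsigned int length)
{
    return length < MAX_LENGTH;
}

/* Hypothetical helper: if the length is already signed, negative
 * values are rejected before the upper bound is tested. */
int
length_ok_signed (int length)
{
    return length >= 0 && length < MAX_LENGTH;
}
```

Here length_ok_unsigned (0xffffffff) and length_ok_signed (-1) both return 0, so the "argument length ok" branch would not be taken for the value used in Example A-3.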

This was only a brief introduction to implicit and explicit type conversions in C/C++. For a complete description of type conversions in C/C++ and associated security problems, see Mark Dowd, John McDonald, and Justin Schuh’s The Art of Software Security Assessment: Identifying and Avoiding Software Vulnerabilities (Addison-Wesley, 2007).


Figure A-4. Integer type conversion: unsigned int to signed int

Note

I used Debian Linux 6.0 (32-bit) as a platform for all the following steps.