You want an alternative to using the standard C string-manipulation functions to help avoid buffer overflows (see Recipe 3.3), format-string problems (see Recipe 3.2), and the use of unchecked external input.
Use the SafeStr library, which is available from http://www.zork.org/safestr/.
The SafeStr library provides an implementation of dynamically sizable strings in C. The library also performs reference counting and tracks both the allocated and actual sizes of each string. Any attempt to grow a string beyond its allocated size causes the library to increase the allocated size to at least the size required. Because strings managed by SafeStr ("safe strings") are dynamically sized, they are not a source of potential buffer overflows. (See Recipe 3.3.)
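The size accounting described above can be illustrated with a minimal sketch. This is not SafeStr's implementation: the dynstr type and the dynstr_init( ), dynstr_append( ), and dynstr_free( ) functions are hypothetical names invented for this example, and the doubling growth policy is an assumption. The sketch only shows the general idea of tracking an actual and an allocated size and growing the buffer before any write that would not fit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy type for illustration; not SafeStr's safestr_t. */
typedef struct {
    char  *data;      /* NUL-terminated contents */
    size_t length;    /* actual size: bytes in use, excluding the NUL */
    size_t allocated; /* allocated size: capacity of data, including the NUL */
} dynstr;

/* Create an empty string with at least `initial` bytes allocated.
 * Returns 0 on success, -1 on allocation failure. */
static int dynstr_init(dynstr *s, size_t initial)
{
    s->allocated = initial > 0 ? initial : 1; /* always room for the NUL */
    s->length = 0;
    s->data = malloc(s->allocated);
    if (s->data == NULL)
        return -1;
    s->data[0] = '\0';
    return 0;
}

/* Append `src`, growing the buffer first if the result would not fit.
 * `src` must not point into s->data. Returns 0 on success, -1 on failure. */
static int dynstr_append(dynstr *s, const char *src)
{
    size_t add = strlen(src);
    size_t need, newcap;
    char  *p;

    assert(s->data != NULL && s->length < s->allocated);
    if (add > SIZE_MAX - s->length - 1)
        return -1; /* the required size would overflow size_t */
    need = s->length + add + 1;

    if (need > s->allocated) {
        /* Assumed policy: double the capacity until the data fits. */
        newcap = s->allocated;
        while (newcap < need)
            newcap = (newcap > SIZE_MAX / 2) ? need : newcap * 2;
        p = realloc(s->data, newcap);
        if (p == NULL)
            return -1; /* the original buffer is still valid */
        s->data = p;
        s->allocated = newcap;
    }
    memcpy(s->data + s->length, src, add + 1); /* copies the NUL too */
    s->length += add;
    return 0;
}

/* Release the buffer and reset the bookkeeping. */
static void dynstr_free(dynstr *s)
{
    free(s->data);
    s->data = NULL;
    s->length = 0;
    s->allocated = 0;
}
```

Because every write goes through dynstr_append( ), the actual size can never exceed the allocated size, which is the property that rules out overflows in this toy model.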
Safe strings use the type safestr_t, which can actually be cast to the normal C-style string type, char *, though we strongly recommend against doing so where it can be avoided. In fact, the only time you should ever cast a safe string to a normal C-style string is for read-only purposes. This is also the only reason why the safestr_t type was designed in a way that allows casting to normal C-style strings.
Casting a safe string to a normal C-style string and modifying it using C-style string-manipulation functions or other means defeats the protections and accounting afforded by the SafeStr library.
The SafeStr library provides a rich set of API functions to manipulate the strings it manages. The large number of functions prohibits us from enumerating them all here, but note that the library comes with complete documentation in the form of Unix man pages, HTML, and PDF. Table 3-1 lists the functions that have C equivalents, along with those equivalents.
Table 3-1. SafeStr API functions and equivalents for normal C strings
SafeStr function | C function
---|---
You can typically create safe strings in any of the following three ways:
SAFESTR_ALLOC( )
Allocates a resizable string with an initial allocation size in bytes as specified by its only argument. The string returned will be an empty string (actual size zero). Normally the size allocated for a string will be larger than the actual size of the string. The library rounds memory allocations up, so if you know that you will need a large string, it is worth allocating it with a large initial allocation size up front to avoid reallocations as the actual string length grows.
SAFESTR_CREATE( )
Creates a resizable string from the normal C-style string passed as its only argument. This is normally the appropriate way to convert a C-style string to a safe string.
SAFESTR_TEMP( )
Creates a temporary resizable string from the normal C-style string passed as its only argument. SAFESTR_CREATE( ) and SAFESTR_TEMP( ) behave similarly, except that a string created by SAFESTR_TEMP( ) will be automatically destroyed by the next SafeStr function that uses it. The only exception is safestr_reference( ), which increments the reference count on the string, allowing it to survive until safestr_release( ) or safestr_free( ) is called to decrement the string's reference count.
People are sometimes confused about when to use SAFESTR_TEMP( ), as well as how to use it properly. Use SAFESTR_TEMP( ) when you need to pass a constant string as an argument to a function that is expecting a safestr_t. A perfect example of such a case is safestr_sprintf( ), which has the following signature:
int safestr_sprintf(safestr_t *output, safestr_t fmt, ...);
The string that specifies the format must be a safe string, but because you should always use constant strings for the format specification (see Recipe 3.2), you should use SAFESTR_TEMP( ). The alternative is to use SAFESTR_CREATE( ) to create the string before calling safestr_sprintf( ), and free it immediately afterward with safestr_free( ).
    int       i = 42;
    safestr_t fmt, output;

    output = SAFESTR_ALLOC(1);

    /* Instead of doing this: */
    fmt = SAFESTR_CREATE("The value of i is %d.\n");
    safestr_sprintf(&output, fmt, i);
    safestr_free(fmt);

    /* You can do this: */
    safestr_sprintf(&output, SAFESTR_TEMP("The value of i is %d.\n"), i);
When using temporary strings, remember that the temporary string will be destroyed automatically after a call to any SafeStr API function except safestr_reference( ), which will increment the string's reference count. If a temporary string's reference count is incremented, the string will then survive any number of API calls until its reference count is decremented to the point at which the string is destroyed. The API functions safestr_release( ) and safestr_free( ) may be used interchangeably to decrement a string's reference count.
For example, if you are writing a function that accepts a safestr_t as an argument (which may or may not be passed as a temporary string) and you will be performing multiple operations on the string, you should increment the string's reference count before operating on it, and decrement it again when you are finished. This will ensure that the string is not prematurely destroyed if a temporary string is passed in to the function.
    void some_function(safestr_t *base, safestr_t extra) {
      safestr_reference(extra);
      if (safestr_length(*base) + safestr_length(extra) < 17)
        safestr_append(base, extra);
      safestr_release(extra);
    }
In this example, if you omitted the calls to safestr_reference( ) and safestr_release( ), and if extra was a temporary string, the call to safestr_length( ) would cause the string to be destroyed. As a result, the safestr_append( ) call would then be operating on an invalid safestr_t if the combined length of base and extra were less than 17.
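The lifetime rule for temporary strings can be made concrete with a toy model. The sketch below does not use or reproduce SafeStr; the tstr type and the tstr_create( ), tstr_temp( ), tstr_reference( ), tstr_release( ), and tstr_length( ) functions are hypothetical, and the bookkeeping is deliberately simplified (an alive flag stands in for actually freeing memory so that the state can be inspected safely afterward). It assumes only the behavior described in the text: a temporary string with no outstanding references is destroyed by the first API call that uses it, unless that call takes a reference.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of the lifetime rule; not SafeStr's safestr_t. */
typedef struct {
    const char *text;
    unsigned    refs;      /* outstanding references */
    int         temporary; /* nonzero if created as a temporary */
    int         alive;     /* 0 once the string has been "destroyed" */
} tstr;

/* An ordinary string starts with one reference owned by its creator. */
static tstr tstr_create(const char *text)
{
    tstr s = { text, 1, 0, 1 };
    return s;
}

/* A temporary string starts with no references at all. */
static tstr tstr_temp(const char *text)
{
    tstr s = { text, 0, 1, 1 };
    return s;
}

/* Taking a reference is the one operation that never destroys a string. */
static void tstr_reference(tstr *s)
{
    if (s->alive)
        s->refs++;
}

/* Drop one reference; destroy the string when none remain. */
static void tstr_release(tstr *s)
{
    if (!s->alive)
        return;
    if (s->refs > 0)
        s->refs--;
    if (s->refs == 0)
        s->alive = 0;
}

/* Called at the end of every other API function in this model:
 * an unreferenced temporary does not survive the call. */
static void api_done(tstr *s)
{
    if (s->alive && s->temporary && s->refs == 0)
        s->alive = 0;
}

/* Store the length in *out. Returns 0 on success, -1 if the string
 * has already been destroyed. */
static int tstr_length(tstr *s, size_t *out)
{
    assert(s != NULL && out != NULL);
    if (!s->alive)
        return -1;
    *out = strlen(s->text);
    api_done(s);
    return 0;
}
```

In this model, calling tstr_length( ) twice on an unreferenced temporary fails the second time, while wrapping the calls between tstr_reference( ) and tstr_release( ) keeps the string valid for both, which mirrors the pattern used in some_function( ).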
Finally, the SafeStr library also tracks the trustworthiness of strings. A string can be either trusted or untrusted. Operations that combine strings result in untrusted strings if any one of the strings involved in the combination is untrusted; otherwise, the result is trusted. There are few places in SafeStr's API where the trustworthiness of a string is tested, but the function safestr_istrusted( ) allows you to test strings yourself.
The strings that result from using SAFESTR_CREATE( ) or SAFESTR_TEMP( ) are untrusted. You can use SAFESTR_TEMP_TRUSTED( ) to create temporary strings that are trusted.
The trustworthiness of an existing string can be altered using safestr_trust( ) to make it trusted or safestr_untrust( ) to make it untrusted.
The main reason to track the trustworthiness of a string is to monitor the flow of external inputs. Safe strings created from external data should initially be untrusted. If you later verify the contents of a string, ensuring that it contains nothing dangerous, you can then mark the string as trusted. Whenever you need to use a string to perform some potentially dangerous operation (for example, using a string in a command-line argument to an external program), check the trustworthiness of the string before you use it, and fail appropriately if the string is untrusted.
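The workflow described above (mark external data untrusted, validate it, then check trust before a dangerous operation) can be sketched without the library. Everything in the following example is hypothetical: the tstring type, the ts_make( ), ts_concat( ), ts_validate_alnum( ), and run_with_argument( ) functions, the fixed 128-byte buffer, and the "alphanumeric only" validation policy are all assumptions chosen for illustration, not part of SafeStr. The one rule taken from the text is that combining strings yields a trusted result only if every input is trusted.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical toy type: a bounded string plus a trust flag. */
typedef struct {
    char text[128];
    int  trusted;
} tstring;

/* Copy src into dst starting at offset `used`, truncating to fit.
 * Returns the new length of dst. */
static size_t bounded_append(char *dst, size_t cap, size_t used, const char *src)
{
    size_t n = strlen(src);

    assert(used < cap);
    if (n > cap - 1 - used)
        n = cap - 1 - used;
    memcpy(dst + used, src, n);
    dst[used + n] = '\0';
    return used + n;
}

/* Build a string; data from outside the program should pass trusted = 0. */
static tstring ts_make(const char *text, int trusted)
{
    tstring s;
    bounded_append(s.text, sizeof s.text, 0, text);
    s.trusted = trusted;
    return s;
}

/* The result is trusted only if both inputs are trusted. */
static tstring ts_concat(const tstring *a, const tstring *b)
{
    tstring r;
    size_t  used = bounded_append(r.text, sizeof r.text, 0, a->text);

    bounded_append(r.text, sizeof r.text, used, b->text);
    r.trusted = a->trusted && b->trusted;
    return r;
}

/* Assumed policy: a nonempty, purely alphanumeric string is safe.
 * Marks the string trusted and returns 1 on success, 0 otherwise. */
static int ts_validate_alnum(tstring *s)
{
    const char *p;

    if (s->text[0] == '\0')
        return 0;
    for (p = s->text; *p != '\0'; p++)
        if (!isalnum((unsigned char)*p))
            return 0;
    s->trusted = 1;
    return 1;
}

/* Stand-in for a dangerous operation: refuse untrusted input. */
static int run_with_argument(const tstring *arg)
{
    if (!arg->trusted)
        return -1; /* fail rather than use unverified data */
    /* A real program would build the argument vector and launch the
     * external program here. */
    return 0;
}
```

The useful property is that trust cannot be gained by accident: concatenating a trusted prefix with unvalidated input produces an untrusted result, so the check in run_with_argument( ) fails until the input itself has been validated.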
SafeStr: http://www.zork.org/safestr/