Error Reporting in C

Error reporting in C is accomplished through the use of an old mechanism named errno. Although errno shows its age and has some deficiencies, it generally gets the job done. You may wonder why you need an error-reporting mechanism at all, since most C functions have a handy return value that tells whether the call succeeded or failed. The answer is that a return value can warn you that a function didn't do what you wanted it to, but it may or may not tell you why. To make this more concrete, consider this code snippet:

FILE *fp;
fp = fopen("myfile.dat", "r");
retval = fwrite(&data, sizeof(DataStruct), 1, fp);

Suppose you inspect retval and find it to be zero. From the man page, you see that fwrite() should return the number of items (not bytes or characters) written, so retval should be 1. How many different ways can fwrite() fail? Lots! For starters, the filesystem may be full, or you might not have write permissions on the file. In this case, however, there's a bug in the code that causes fwrite() to fail. Can you find it?^[18] An error-reporting system like errno can provide diagnostic information to help you figure out what happened in cases like this. (The operating system may announce certain errors, as well.)

Using errno

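System and library calls that fail usually set errno, which behaves like a globally defined integer variable. On most GNU/Linux systems errno is declared in /usr/include/errno.h (or in files that header includes), so by including this header file, you don't have to declare extern int errno in your own code. In fact, you should not declare it yourself: on modern systems errno is a macro that refers to a separate value for each thread.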

When a system or library call fails, it sets errno to a value that indicates the type of failure. It's up to you to check the value of errno and take the appropriate action. Consider the following code:

Example Listing 7-2. double-trouble.c

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <math.h>

int main(void)
{
   double trouble = exp(1000.0);
   if (errno) {
      printf("trouble: %f (errno: %d)\n", trouble, errno);
      exit(-1);
   }

   return 0;
}

On our system, exp(1000.0) is larger than what a double can store, so the assignment results in a floating-point overflow. From the output, you see that an errno value of 34 indicates a floating-point overflow error:

$ ./a.out
trouble: inf (errno: 34)

This pretty much illustrates how errno works. By convention, when a library function or system call fails, it sets errno to a value that describes why the call failed. You just saw that the value 34 means the result of exp(1000.0) was not representable by a double, and there are lots of other codes that indicate underflow, permission problems, file not found, and other error conditions. However, before you start using errno in your programs, there are some issues you need to be aware of.

First, code that uses errno may not be completely portable. For example, the ISO C standard only defines a few error codes, and the POSIX standard defines quite a few more. You can see which error codes are defined by which standards in the errno man page. Moreover, the standards don't specify numeric values, like 34, for the error codes. They prescribe symbolic error codes, which are macro constants whose names are prefixed by E and which are defined in the errno header file (or in files included by the errno header). The only thing about their values that is consistent across platforms is that they are nonzero. Thus, you can't assume that a particular value always indicates the same error condition.^[19] You should always use the symbolic names to refer to errno values.
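To make the advice about symbolic names concrete, here is a minimal sketch that tests for ERANGE by name instead of comparing against 34. The helper parse_long() is hypothetical, and using strtol() as the failing call is our assumption for the sake of a short example; it is not taken from the listings in this section.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parses a decimal long from text.
 * Returns 0 on success, or ERANGE when the number does not fit in a long.
 */
int parse_long(const char *text, long *out)
{
    /* LONG_MAX and LONG_MIN are also valid results, so clear errno first
       and use it to tell a real overflow apart from a legitimate value. */
    errno = 0;
    *out = strtol(text, NULL, 10);

    /* Compare against the symbolic name, never a literal such as 34. */
    if ((*out == LONG_MAX || *out == LONG_MIN) && errno == ERANGE)
        return ERANGE;

    return 0;
}
```

Calling parse_long("42", &v) should return 0 and store 42 in v, while a string of digits far too large for a long should return ERANGE, whatever number ERANGE happens to be on your platform.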

In addition to the ISO and POSIX errno values, specific implementations of the C library, like GNU's glibc, can define even more errno values. On GNU/Linux, the errno section of the libc info pages^[20] is the canonical source for all the available errno values on that platform: ISO, POSIX, and glibc. Here are some error code definitions we pulled from /usr/include/asm/errno.h on a GNU/Linux machine:

#define EPIPE        32 /* Broken pipe */
#define EDOM         33 /* Math arg out of domain of func */
#define ERANGE       34 /* Math result not representable */
#define EDEADLK      35 /* Resource deadlock would occur */
#define ENAMETOOLONG 36 /* File name too long */
#define ENOLCK       37 /* No record locks available */
#define ENOSYS       38 /* Function not implemented */
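Because the numbers in a listing like this belong to one platform, code that needs to name an error should switch on the symbols themselves. The following sketch assumes a POSIX system where all seven constants above are defined; the function errno_name() is a hypothetical helper of ours, not part of any library.

```c
#include <errno.h>

/*
 * Hypothetical helper: maps an errno value to its symbolic name for the
 * handful of codes shown above. The case labels use the macros, so the
 * mapping stays correct whatever numeric values the platform assigns.
 */
const char *errno_name(int errnum)
{
    switch (errnum) {
    case EPIPE:        return "EPIPE";
    case EDOM:         return "EDOM";
    case ERANGE:       return "ERANGE";
    case EDEADLK:      return "EDEADLK";
    case ENAMETOOLONG: return "ENAMETOOLONG";
    case ENOLCK:       return "ENOLCK";
    case ENOSYS:       return "ENOSYS";
    default:           return "unknown";
    }
}
```

On the GNU/Linux machine the listing came from, errno_name(34) and errno_name(ERANGE) give the same answer, but only the second form is portable.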

Next, there are some important facts to remember about how errno is used. errno can be set by any library function or system call, whether it fails or succeeds! Because even successful function calls can set errno, you cannot rely on errno to tell you whether an error occurs. You can only rely on it to tell you why an error happened. Therefore, the safest way to use errno is as follows:^[21]

1. Perform the call to the library or system function.
2. Use the function's return value to determine whether or not an error occurred.
3. If an error occurred, use errno to determine why.

In pseudocode:

retval = systemcall();

if (retval indicates an error) {
   examine_errno();
   take_action();
}
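As one possible concrete version of this pattern, the sketch below applies the three steps to fopen(). The helper check_readable() is hypothetical, and we assume a POSIX system, where a failed fopen() sets errno. Note that errno is copied into a local variable before anything else is called, since even fprintf() is allowed to change it.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper following the three steps.
 * Returns 0 if the file could be opened for reading, otherwise the errno
 * value that fopen() left behind.
 */
int check_readable(const char *path)
{
    FILE *fp = fopen(path, "r");    /* step 1: make the call */

    if (fp == NULL) {               /* step 2: the return value signals failure */
        int saved = errno;          /* step 3: examine errno, saving it first */

        if (saved == ENOENT)
            fprintf(stderr, "%s: does not exist\n", path);
        else if (saved == EACCES)
            fprintf(stderr, "%s: permission denied\n", path);
        else
            fprintf(stderr, "%s: failed (errno %d)\n", path, saved);

        return saved;
    }

    fclose(fp);
    return 0;
}
```

The function never looks at errno unless fopen() has already reported a failure through its return value, which is the whole point of the ordering.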

This brings us to man pages. Suppose you're coding and you want to throw in some error checking after a call to ptrace(). Step two says to use ptrace()'s return value to determine whether an error has occurred. If you're like us, you don't have the return values of ptrace() memorized. What can you do? Every man page has a section named "Return Value." You can quickly go to it by typing man followed by the function name and then searching for the string return value.

Although errno has some drawbacks, there is good news as well.

There's extensive work underway in the GNU C library to save errno when a function is entered and then restore it to the original value if the function call succeeds. It looks like glibc tries very hard not to write over errno for successful function calls. However, the world is not GNU (yet), so portable code should not rely on this fact.
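You can apply the same save-and-restore idea in your own helper functions. The sketch below is only an illustration of the idiom, not a description of how glibc implements it; log_note() is a hypothetical function, and because it returns nothing it restores errno unconditionally rather than only on success.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical logging helper that leaves errno exactly as it found it,
 * so a caller can log first and inspect errno afterwards.
 */
void log_note(const char *msg)
{
    int saved = errno;                    /* remember the caller's errno */

    fprintf(stderr, "note: %s\n", msg);   /* may modify errno */

    errno = saved;                        /* put the original value back */
}
```

With a helper like this, a caller can report a failure and still find the original error code in errno afterwards.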

Also, although going through documentation every time you want to see what a particular error code means gets tiresome, there are two functions that make it easier to interpret error codes: perror() and strerror(). They do the same thing, but in different ways. The perror() function takes a string argument and has no return value:

#include <stdio.h>
void perror(const char *s);

The argument of perror() is a user-supplied string. When perror() is called, it prints this string, followed by a colon and space, and then a description of the type of error based on the value of errno. Here's a simple example of how to use perror():

Example Listing 7-3. perror-example.c

#include <stdio.h>

int main(void)
{
   FILE *fp;

   fp = fopen("/foo/bar", "r");

   if (fp == NULL)
      perror("I found an error");

   return 0;
}

If there's no file /foo/bar on your system, the output looks like this:

$ ./a.out
I found an error: No such file or directory

The output of perror() goes to standard error. Remember this if you want to redirect your program's error output to a file.

Another function that helps you translate errno codes into descriptive messages is strerror():

#include <string.h>

char *strerror(int errnum);

This function takes the value of errno as its argument and returns a string that describes the error. Here's an example of how to use strerror():

Example Listing 7-4. strerror-example.c

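#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

int main(void)
{
   /* Descriptor 5 is not open, so close() fails and sets errno. */
   if (close(5) == -1)
      printf("%s\n", strerror(errno));
   return 0;
}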

Here is the output of this program:

$ ./a.out
Bad file descriptor
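Because strerror() hands back a string instead of printing it, you can build the message yourself and send it wherever you like. The following sketch reproduces the "prefix: description" layout of perror() in a caller-supplied buffer. Both function names are hypothetical, and the use of snprintf() is our choice for the example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes "prefix: description" into buf, the same
 * layout perror() prints, but without touching standard error.
 * Returns whatever snprintf() returned.
 */
int format_error(char *buf, size_t size, const char *prefix, int errnum)
{
    return snprintf(buf, size, "%s: %s", prefix, strerror(errnum));
}

/* Convenience wrapper that describes the most recent error in errno. */
int format_last_error(char *buf, size_t size, const char *prefix)
{
    return format_error(buf, size, prefix, errno);
}
```

The exact wording of the description comes from the C library and can vary with the platform and locale, so treat the text as something to show a person, not something to compare against in code.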

^[18] We opened the file in read mode and then tried to write to it.

^[19] For example, some systems differentiate between EWOULDBLOCK and EAGAIN, but GNU/Linux does not.

^[20] In addition to the libc info pages, you can look around in your system header files to examine the errno values. Not only is this a safe and natural thing to do, it's actually an encouraged practice!

^[21] Our use of errno in Example Listing 7-2 was not good practice.