
Chapter 23

File Management

In This Chapter

Reading files from a directory

Checking file types

Working with the directory hierarchy

Changing filenames

Duplicating files

Removing a file

The C language library features many functions that interface directly with the operating system, allowing you to peek, poke, and prod into the very essence of files themselves. You never know when you’ll need to plow through a directory, rename a file, or delete a temporary file that the program created. It’s powerful stuff, but such file management is well within the abilities of your C programs.

Directory Madness

A directory is nothing more than a database of files stored on a device’s mass storage system. Also called a folder, a directory contains a list of files plus any subdirectories. Just as you can manipulate a file, a directory can be opened, read, and then closed. And as with the directory listing you see on a computer screen, you can gather information about the various files, their sizes, types, and more.

Calling up a directory

The C library's opendir() function opens a specific directory so that its contents can be read. It works similarly to the fopen() function. Here's the format:

dhandle = opendir(pathname);

dhandle is a pointer of the DIR type, similar to a file handle being of the FILE type. The pathname is the name of a directory to examine. It can be a full path, or you can use the . (dot) abbreviation for the current directory or .. (dot-dot) for the parent directory.

Once a directory is open, the readdir() function fetches records from its database, similar to the fread() function, although the records describe files stored in the directory. Here's the readdir() function's format:

entry = readdir(dhandle);

entry is a pointer to a dirent structure. After a successful call to readdir(), the structure is filled with information about a file in the directory. Every time readdir() is called, it points to the next file entry, like reading records from a database. When the function returns NULL, the last file in the directory has been read.

Finally, after the program is done messing around, the directory must be closed. This operation is handled by the closedir() function:

closedir(dhandle);

All these directory functions require the dirent.h header file to be included with your source code.

Listing 23-1 illustrates code that reads a single entry from the current directory. The variables that are required are declared in Lines 7 and 8: folder is a DIR pointer, used as the handle to represent the directory that's opened. file is the memory location of a structure that holds information about individual files in the directory.

Listing 23-1: Pluck a File from the Directory

#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>

int main()
{
    DIR *folder;
    struct dirent *file;

    folder=opendir(".");
    if(folder==NULL)
    {
        puts("Unable to read directory");
        exit(1);
    }
    file = readdir(folder);
    printf("Found the file '%s'\n",file->d_name);
    closedir(folder);
    return(0);
}

The directory is opened at Line 10; the single dot is an abbreviation for the current directory. Lines 11 through 15 handle any errors, similar to opening any file. (Refer to Chapter 22.)

The first entry in the directory is read at Line 16, and then Line 17 displays the information. The d_name element in the dirent structure represents the file's name.

Finally, at Line 18, the directory is closed.

Exercise 23-1: Create a new project by using the source code from Listing 23-1. Build and run.

Of course, the first file that’s most likely to be read in a directory is the directory itself, the dot entry. Boring!

Exercise 23-2: Modify the source code shown in Listing 23-1 so that the entire directory is read. A while loop can handle the job. Refer to Listing 22-9 (from Chapter 22) if you find yourself needing inspiration on how to build the loop.

The readdir() function returns NULL after the last file entry has been read from a directory.
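
Here's a minimal sketch of that NULL test at work. It assumes a helper function named count_entries(), which is my own invention and not part of the C library; it counts whatever readdir() hands back, which on most systems includes the dot and dot-dot entries.

```c
#include <stdio.h>
#include <dirent.h>

/* count_entries() is a hypothetical helper, not a library function.
   It returns the number of entries found in a directory, or -1
   when the directory can't be opened. */
int count_entries(const char *path)
{
    DIR *folder;
    struct dirent *file;
    int count = 0;

    folder = opendir(path);
    if(folder == NULL)
        return(-1);

    /* readdir() returns NULL after the last entry has been read */
    while( (file = readdir(folder)) != NULL )
    {
        printf("Found the file '%s'\n", file->d_name);
        count++;
    }

    closedir(folder);
    return(count);
}
```

Calling count_entries(".") from the main() function lists the current directory and returns the number of entries found.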

Gathering more file info

The stat() function reads various and sundry information about a file based on the file's name. Use stat() to determine a file's date, size, type, and other trivia. The function's format looks like this:

stat(filename,stat);

filename is a string value representing the file to examine. stat is the address of a stat structure. After a successful call to the stat() function, the stat structure is filled with information about the file. And I wholly agree that calling both the function and the structure stat leads to an undue amount of consternation.

You need to include the sys/stat.h header file in your code to make the compiler pleased with the stat() function.

Listing 23-2 demonstrates how the stat() function can be incorporated into a directory listing. It starts with the inclusion of the sys/stat.h header file at Line 5. The sys/ part simply tells the compiler in which directory to locate the stat.h file. (sys is a subdirectory of include.)

Listing 23-2: A More Impressive File Listing

#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <time.h>
#include <sys/stat.h>

int main()
{
    DIR *folder;
    struct dirent *file;
    struct stat filestat;

    folder=opendir(".");
    if(folder==NULL)
    {
        puts("Unable to read directory");
        exit(1);
    }
    while(file = readdir(folder))
    {
        stat(file->d_name,&filestat);
        printf("%-14s %5ld %s",
                file->d_name,
                (long)filestat.st_size,
                ctime(&filestat.st_mtime));
    }
    closedir(folder);
    return(0);
}

Line 11 creates a stat structure variable named filestat. That structure is filled at Line 21 for each file found in the directory; the file->d_name element provides the filename, and the address of the filestat structure is provided to the stat() function.

The printf() function starting at Line 22 displays the information revealed by the stat() function: Line 23 displays the file's name; Line 24 pulls the file's size from the filestat structure; and in Line 25, the ctime() function converts the file's modification time, stored in the filestat structure's st_mtime element, into a string. That time value is kept as the number of seconds since the Unix epoch. (See Chapter 21 for more information about time programming in C.)

Oh! And the printf() statement lacks a \n (newline) because the ctime() function's output provides one.

I've typecast the filestat.st_size variable at Line 24 to a long int value. The compiler otherwise balks at the printf() statement, complaining that st_size is of the off_t variable type. The printf() function lacks a conversion character for the off_t type, so I've typecast it to prevent the warning. That's a big assumption on my part, considering that off_t could be another type of variable in the future or even on another system.
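
If the long int assumption makes you nervous, one alternative is to cast to intmax_t, the widest signed integer type, and print it with the %jd conversion; both arrived with the C99 standard. The following is only a sketch: the format_size() function is a made-up helper, and it assumes that your compiler and library support C99.

```c
#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>

/* format_size() is a hypothetical helper. It writes a file size
   into buf as decimal text and returns the value that
   snprintf() returns (the number of characters needed). */
int format_size(off_t size, char *buf, size_t buflen)
{
    /* intmax_t is the widest signed integer type; %jd prints it */
    return( snprintf(buf, buflen, "%jd", (intmax_t)size) );
}
```

The same cast-and-%jd pair can go directly into the printf() statement from Listing 23-2 in place of the (long) typecast and %ld.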

Exercise 23-3: Type the source code from Listing 23-2 into your editor or just modify your solution from Exercise 23-2. Build and run to see a better directory listing.

Separating files from directories

Each file stored in a directory is classified by a file type. For example, some entries in a directory listing are subdirectories. Other entries may be symbolic links or sockets. To determine which file is of which type, use the stat() function. The st_mode element in the stat structure can be examined to determine the file type. That's good news.

The st_mode element is a bit field — various bits in that value are set depending on the various file type attributes applied to a file. But that's not entirely bad news because C features macros that can help you quickly determine a file type.

For example, the S_ISDIR macro returns TRUE when a file's st_mode element indicates a directory, not a regular file. Use the S_ISDIR macro like this:

S_ISDIR(filestat.st_mode)

This condition is evaluated as TRUE for a directory and FALSE otherwise.
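
As a rough sketch, the test can be wrapped in a small function. The is_directory() name is hypothetical; it's not something the C library provides. The sketch also checks the return value from stat(), because the st_mode element holds nothing useful when stat() fails.

```c
#include <sys/stat.h>

/* is_directory() is a hypothetical helper: it returns 1 for a
   directory, 0 for anything else, and -1 when stat() fails. */
int is_directory(const char *path)
{
    struct stat filestat;

    if(stat(path, &filestat) == -1)
        return(-1);

    /* S_ISDIR is nonzero when st_mode describes a directory */
    return( S_ISDIR(filestat.st_mode) ? 1 : 0 );
}
```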

Exercise 23-4: Modify your solution to Exercise 23-3 so that any subdirectories listed are flagged as such. Because directories don't generally have file sizes, specify the text <DIR> in the file size field for the program's output.

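If the current directory lacks subdirectories, change the directory name in Line 13. Be aware that stat() receives only the bare filename from d_name, which it looks up relative to the current directory. When you open some other directory, the program must also change to that directory by using chdir() (covered later in this chapter) or build a full pathname for stat(); otherwise, the stat() calls fail.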

In Windows, use two backslashes when typing a path. For example:

dhandle = opendir("\\Users\\Dan");

Windows uses the backslash as a pathname separator. C uses the backslash as an escape character in a string. To specify a single backslash, you must specify two of them.
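
When the directory you open isn't the current directory, the names that readdir() returns need the directory's path in front of them before stat() can find the files. Here's a sketch of one way to do that, assuming a Unix-style forward slash separator. The count_subdirectories() function and its 1,024-character buffer are my own choices for illustration, not anything dictated by the library.

```c
#include <stdio.h>
#include <dirent.h>
#include <sys/stat.h>

/* count_subdirectories() is a hypothetical helper. It returns how
   many entries in path are directories (the . and .. entries
   count, too), or -1 when path can't be opened. */
int count_subdirectories(const char *path)
{
    DIR *folder;
    struct dirent *file;
    struct stat filestat;
    char fullname[1024];    /* arbitrary size; checked below */
    int n, count = 0;

    folder = opendir(path);
    if(folder == NULL)
        return(-1);

    while( (file = readdir(folder)) != NULL )
    {
        /* join the directory name and the entry name */
        n = snprintf(fullname, sizeof(fullname), "%s/%s",
                path, file->d_name);
        if(n < 0 || n >= (int)sizeof(fullname))
            continue;       /* skip names that don't fit */
        if(stat(fullname, &filestat) == 0 && S_ISDIR(filestat.st_mode))
            count++;
    }

    closedir(folder);
    return(count);
}
```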

Exploring the directory tree

Most storage media feature more than one directory. The main directory is the root, but often subdirectories fill the media. Using C, you can create directories of your own and flit between them like bees upon flowers. The C library sports various functions to sate your directory-diving desires. Here’s a sampling:

`getcwd()`	Retrieve the current working directory
`mkdir()`	Create a new directory
`chdir()`	Change to the directory specified
`rmdir()`	Obliterate the directory specified

getcwd(), chdir(), and rmdir() require the unistd.h header file; the mkdir() function requires sys/stat.h.

Listing 23-3 makes use of three directory functions: getcwd(), mkdir(), and chdir().

Listing 23-3: Make Me a Directory

#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>

int main()
{
    char curdir[255];

    getcwd(curdir,255);
    printf("Current directory is %s\n",curdir);
    mkdir("very_temporary",0755);
    puts("New directory created.");
    chdir("very_temporary");
    getcwd(curdir,255);
    printf("Current directory is %s\n",curdir);
    return(0);
}

Line 7 sets aside space for storing the current directory's pathname. I'm plucking the value 255 out of thin air; it should be large enough. Serious programmers should use a constant defined for their systems. For example, PATH_MAX, which POSIX systems define in the limits.h header file, would be perfect, but it's not available on all systems. You could use the FILENAME_MAX constant (defined in stdio.h), but it sets the size for a filename, not a full pathname. As a compromise, I choose 255.
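
For the curious, here's a sketch of how a program might use PATH_MAX when it's available and fall back on a guess when it's not. The fallback value of 1024 is purely my assumption, and show_current_dir() is a hypothetical function name.

```c
#include <stdio.h>
#include <limits.h>
#include <unistd.h>

/* PATH_MAX isn't guaranteed to be defined, so fall back on a
   guess; 1024 is an assumption, not a documented limit. */
#ifndef PATH_MAX
#define PATH_MAX 1024
#endif

/* show_current_dir() is a hypothetical helper. It returns 0 on
   success or -1 when getcwd() fails. */
int show_current_dir(void)
{
    char curdir[PATH_MAX];

    /* getcwd() returns NULL when the path doesn't fit */
    if(getcwd(curdir, sizeof(curdir)) == NULL)
    {
        perror("getcwd");
        return(-1);
    }
    printf("Current directory is %s\n", curdir);
    return(0);
}
```

Using sizeof(curdir) rather than repeating the number keeps the buffer size and the getcwd() argument from drifting apart.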

The getcwd() function in Line 9 captures the current directory's name and saves it in the curdir array. That directory name — a full pathname — is displayed on Line 10.

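Line 11 creates a new directory, very_temporary. The second argument is the file-creation mode, used on the Mac and Unix systems to set permissions (à la the chmod command). The mode is an octal value, so in C it must be written with a leading zero, as 0755; without the zero, the compiler reads 755 as a decimal number, which sets an entirely different group of permission bits. If you have a Windows system, you need to omit that argument and use the following for Line 11: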

mkdir("very_temporary");

After the directory is created, the chdir() function on Line 13 changes to that directory, followed by the getcwd() function at Line 14 capturing its full pathname.

Exercise 23-5: Copy the source code from Listing 23-3 into your editor. Remember to omit the second argument for mkdir() at Line 11 if you're compiling on Windows. Build and run the program.

The end result of Exercise 23-5 is a new directory, very_temporary, created in whichever directory the program was run. Feel free to remove that directory using your computer operating system's directory-obliteration command.

Both chdir() and mkdir() have a return value, an int. When the value is 0, the function completed its operation successfully. Otherwise, a value of -1 is returned.
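
When one of these functions returns -1, the reason is stored in the errno variable, and the perror() function can display it. The following is a minimal sketch for Mac and Unix systems (Windows uses the one-argument form of mkdir()). The make_directory() wrapper is hypothetical, named only for this example.

```c
#include <stdio.h>
#include <sys/stat.h>

/* make_directory() is a hypothetical wrapper around mkdir().
   It returns 0 on success and -1 on failure. */
int make_directory(const char *name)
{
    if(mkdir(name, 0755) == -1)
    {
        /* perror() prints name followed by the system's reason,
           which it looks up from errno */
        perror(name);
        return(-1);
    }
    return(0);
}
```

Calling the function twice with the same name shows the error path: The second call fails because the directory already exists.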

Exercise 23-6: Modify the source code from Listing 23-3 so that error checking is performed on the chdir() and mkdir() functions. If an error occurs, the functions return the value -1. Based on that, the code should display an appropriate message and terminate the program.

Fun with Files

The C library offers functions for making a new file, writing to that file, and reading data from any file. Bolstering those basic file functions is a suite of file manipulation functions. They allow your programs to rename, copy, and delete files. The functions work on any file, not just those you create, so be careful!

Renaming a file

The rename() function is not only appropriately named but it's also pretty simple to figure out:

x = rename(oldname,newname);

oldname is the name of a file already present; newname is the file's new name. Both values can be immediate values (string literals) or variables. The return value is 0 upon success; -1 otherwise.

The rename() function is prototyped in the stdio.h header file.

The source code shown in Listing 23-4 creates a file named blorfus and then renames that file to wambooli.

Listing 23-4: Creating and Renaming a File

#include <stdio.h>
#include <stdlib.h>

int main()
{
    FILE *test;

    test=fopen("blorfus","w");
    if(!test)
    {
        puts("Unable to create file");
        exit(1);
    }
    fclose(test);
    puts("File created");
    if(rename("blorfus","wambooli") == -1)
    {
        puts("Unable to rename file");
        exit(1);
    }
    puts("File renamed");
    return(0);
}

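The first part of the main() function creates the file blorfus: The fopen() function opens it for writing, the if statement handles any error, and fclose() closes it. The file is empty; nothing is written to it.

The rename() function then renames the file. Its return value is compared with -1 in the same if statement to see whether the operation was successful.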

Exercise 23-7: Create a new program by using the source code shown in Listing 23-4. Build and run.

The renamed file, wambooli, is used in a later section as an example.

Copying a file

The C library features no function that duplicates a file. Instead, you have to craft your own: Write code that reads in a file, one chunk at a time, and then writes that chunk out to a duplicate file. That’s how files are copied.

Listing 23-5 demonstrates how a file can be duplicated, or copied. The two files are specified in Lines 9 and 10. In fact, Line 9 uses the name of the Exercise file, the source code from Listing 23-5. The destination file, which contains the copy, is simply the same filename, but with a bak extension.

Listing 23-5: Duplicate That File

#include <stdio.h>
#include <stdlib.h>

int main()
{
    FILE *original,*copy;
    int c;

    original=fopen("ex2308.c","r");
    copy=fopen("ex2308.bak","w");
    if( !original || !copy)
    {
        puts("File error!");
        exit(1);
    }
    while( (c=fgetc(original)) != EOF)
        fputc(c,copy);
    fclose(original);
    fclose(copy);
    puts("File duplicated");
    return(0);
}

The copying work is done by the while loop at Line 16. One character is read by the fgetc() function, and it's immediately copied to the destination by the fputc() function in Line 17. The loop keeps spinning until the EOF, or end-of-file, is encountered.
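
Copying one character at a time works, but the "one chunk at a time" approach mentioned at the start of this section moves a whole buffer on each trip through the loop. Here's a sketch that uses fread() and fwrite() for that purpose. The copy_file() function and its 4,096-byte buffer are my own assumptions, chosen for illustration; the files are opened in binary mode so that the function can duplicate any kind of file.

```c
#include <stdio.h>

/* copy_file() is a hypothetical helper. It returns 0 when the
   copy succeeds and -1 when anything goes wrong. */
int copy_file(const char *source, const char *dest)
{
    FILE *original, *copy;
    char buffer[4096];      /* chunk size is an arbitrary choice */
    size_t count;
    int result = 0;

    original = fopen(source, "rb");
    if(original == NULL)
        return(-1);
    copy = fopen(dest, "wb");
    if(copy == NULL)
    {
        fclose(original);
        return(-1);
    }

    /* fread() returns the number of bytes it actually read */
    while( (count = fread(buffer, 1, sizeof(buffer), original)) > 0 )
    {
        if(fwrite(buffer, 1, count, copy) != count)
        {
            result = -1;
            break;
        }
    }
    if(ferror(original))
        result = -1;

    fclose(original);
    /* fclose() flushes buffered output, so check it as well */
    if(fclose(copy) == EOF)
        result = -1;
    return(result);
}
```

A call such as copy_file("ex2308.c","ex2308.bak") does the same job as the loop in Listing 23-5.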

Exercise 23-8: Copy the source code from Listing 23-5 into your editor. Save the file as ex2308.c (which is this book's file-naming convention), build, and run. You'll need to use your computer operating system to view the resulting file in a folder window. Or, for extra For Dummies bonus points, you can view the results in a terminal or command prompt window.

Deleting a file

Programs delete files all the time, although the files are mostly temporary anyway. Back in the bad old days, I remember complaining about programs that didn't "clean up their mess." If your code creates temporary files, remember to remove them before the program quits. The way to do that is via the unlink() function.

Yes, the function is named unlink and not delete or erase or whatever operating system command you're otherwise used to. (The standard C library also offers the remove() function, prototyped in stdio.h, which deletes a file as well.) In Unix, the unlink command can be used in the terminal window to zap files, although the rm command is more popular.

The unlink() function requires the presence of the unistd.h header file, which you see at Line 3 in Listing 23-6.

Listing 23-6: File Be Gone!

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
    if(unlink("wambooli") == -1)
    {
        puts("I just can't kill that file");
        exit(1);
    }
    puts("File killed");
    return(0);
}

The file slated for death is the unlink() function's only argument, found in the if statement. It's the wambooli file, created back in Exercise 23-7! So if you don't have that file, go back and work Exercise 23-7. (In Code::Blocks, you'll also need to copy that file into the proper folder for your solution to Exercise 23-9.)
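
On systems that lack the unistd.h header file, the standard C library's remove() function, prototyped in stdio.h, can stand in for unlink(). Here's a small sketch; the delete_file() wrapper is a hypothetical name used only for this example.

```c
#include <stdio.h>

/* delete_file() is a hypothetical wrapper around remove().
   It returns 0 when the file is deleted and -1 otherwise. */
int delete_file(const char *name)
{
    /* remove() returns 0 on success, nonzero on failure */
    if(remove(name) != 0)
    {
        perror(name);
        return(-1);
    }
    return(0);
}
```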

Exercise 23-9: Type the source code from Listing 23-6 into your editor. Build and run.