When Files Go Bad: Dealing with Exceptions

Exceptions that occur during file operations fall into three broad categories:

The usual suspects you might get from any method: incorrect parameters, null references, and so on
I/O-related problems
Security-related problems

The first category can, of course, be dealt with as normal: if such exceptions occur (as we discussed in Chapter 6), there is usually some bug or unexpected usage that you need to deal with.

The other two are slightly more interesting cases. We should expect problems with file I/O. Files and directories are (mostly) system-wide shared resources. This means that anyone can be doing something with them while you are trying to use them. As fast as you’re creating them, some other process might be deleting them. Or writing to them; or locking them so that you can’t touch them; or altering the permissions on them so that you can’t see them anymore. You might be working with files on a network share, in which case different computers may be messing with the files, or you might lose connectivity partway through working with a file.

This “global” nature of files also means that you have to deal with concurrency problems. Consider, for example, the code in Example 11-23, which uses the (almost totally redundant) File.Exists method to determine whether a file exists.

Example 11-23. The questionable File.Exists method

if (File.Exists("SomeFile.txt"))
{
    // Play with the file
}

Is it safe to play with the file in there, on the assumption that it exists?

No.

In another process, even from another machine if the directory is shared, someone could nip in and delete the file or lock it, or do something even more nefarious (like replace it with something else). Or the user might have closed the lid of his laptop just after the method returns, and may well be on a different continent by the time he brings it out of sleep mode, at which point you won’t necessarily have access to the same network shares that seemed to be visible just one line of code ago.

So you have to code extremely defensively, and expect exceptions in your I/O code, even if you checked that everything looked OK before you started your work.
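One way to act on this advice is to skip the check and simply attempt the operation, treating the exception as the real answer to “can I use this file?” The following is a minimal sketch rather than part of our sample application: SafeRead and TryReadAllText are hypothetical names, and we assume that writing a warning to the console is an acceptable response.

```csharp
using System;
using System.IO;

static class SafeRead
{
    // Hypothetical helper, not part of the chapter's sample application.
    // Returns true and the file's text on success; returns false if the
    // file could not be read for an I/O or permissions reason.
    public static bool TryReadAllText(string path, out string contents)
    {
        try
        {
            // No File.Exists check first: attempting the read is the only
            // test that reflects the state of the file when we use it.
            contents = File.ReadAllText(path);
            return true;
        }
        catch (IOException iox)
        {
            // Covers FileNotFoundException and DirectoryNotFoundException,
            // as well as general failures such as sharing violations.
            Console.WriteLine("Warning: " + iox.Message);
        }
        catch (UnauthorizedAccessException uax)
        {
            Console.WriteLine("Warning: " + uax.Message);
        }

        contents = string.Empty;
        return false;
    }
}
```

A caller would write something like if (SafeRead.TryReadAllText("SomeFile.txt", out text)) and play with the text only when the method returns true. Argument errors are deliberately left uncaught, because they belong to the first category of problems: bugs that we want to hear about.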

Unlike most exceptions, though, abandoning the operation is not always the best choice. You often see transient problems: a USB drive that is temporarily unavailable, for example, or a network glitch that temporarily hides a share from us or aborts a file copy operation. (Transient network problems are particularly common after a laptop resumes from suspend: it can take a few seconds to get back on the network, or maybe even minutes if the user is in a hotel and has to sign up for an Internet connection before connecting back to the office VPN. Abandoning the user’s data is not a user-friendly response to this situation.)
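To make the idea of not giving up immediately a little more concrete, here is one possible shape for a retry helper. It is only a sketch under some assumptions: TransientIo and Retry are hypothetical names, we assume that only IOException is worth retrying, and we assume a fixed delay between attempts is good enough. In an interactive application, asking the user whether to try again may well be the better choice.

```csharp
using System;
using System.IO;
using System.Threading;

static class TransientIo
{
    // Hypothetical helper: runs an I/O operation, retrying when it throws
    // IOException. After maxAttempts failures the last exception propagates.
    public static T Retry<T>(Func<T> operation, int maxAttempts, TimeSpan delay)
    {
        if (operation == null)
        {
            throw new ArgumentNullException("operation");
        }
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException("maxAttempts");
        }

        for (int attempt = 1; ; ++attempt)
        {
            try
            {
                return operation();
            }
            catch (IOException)
            {
                if (attempt >= maxAttempts)
                {
                    // Out of attempts: let the caller see the failure.
                    throw;
                }

                // Possibly transient: wait a moment, then go around again.
                Thread.Sleep(delay);
            }
        }
    }
}
```

For example, TransientIo.Retry(() => File.ReadAllText(path), 3, TimeSpan.FromSeconds(2)) would make up to three attempts. Security failures such as UnauthorizedAccessException are not retried here, on the assumption that permissions are unlikely to fix themselves in a couple of seconds.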

When an I/O problem occurs, the framework throws one of several exceptions derived from IOException (or, as we’ve already seen, IOException itself) listed here:

IOException: This is thrown when some general problem with I/O has occurred. This is the base for all of the more specific exception types, but it is sometimes thrown in its own right, with the Message text describing the actual problem. This makes it somewhat less useful for programmatic interpretation; you usually have to allow the user to intervene in some way when you catch one of these.
DirectoryNotFoundException: This is thrown when an attempt is made to access a directory that does not exist. This commonly occurs because of an error in constructing a path (particularly when relative paths are in play), or because some other process has moved or deleted a directory during an operation.
DriveNotFoundException: This is thrown when the root drive in a path is no longer available. This could be because a drive letter has been mapped to a network location which is no longer available, or a removable device has been removed. Or because you typed the wrong drive letter!
FileLoadException: This is a bit of an anomaly in the family of IOExceptions, and we’re including it in this list only because it can cause some confusion. It is thrown by the runtime when an assembly cannot be loaded; as such, it has more to do with assemblies than files and streams.
FileNotFoundException: This is thrown when an attempt is made to access a file that does not exist. As with DirectoryNotFoundException, this is often because there has been some error in constructing a path (absolute or relative), or because something was moved or deleted while the program was running.
PathTooLongException: This is an awkward little exception, and causes a good deal of confusion for developers (which is one reason correct behavior in the face of long paths is a part of Microsoft’s Designed For Windows test suite). It is thrown when a path provided is too long. But what is “too long”? The maximum length for a path in Windows used to be 260 characters (which isn’t very long at all). Recent versions allow paths up to about (but not necessarily exactly) 32,767 characters, but making use of that from .NET is awkward. There’s a detailed discussion of Windows File and Path lengths if you fall foul of the problem in the MSDN documentation at http://msdn.microsoft.com/library/aa365247, and a discussion of the .NET-specific issues at http://go.microsoft.com/fwlink/?LinkID=163666.

If you are doing anything with I/O operations, you will need to think about most, if not all, of these exceptions, deciding where to catch them and what to do when they occur.

Let’s look back at our example again, and see what we want to do with any exceptions that might occur. As a first pass, we could just wrap our main loop in a try/catch block, as Example 11-24 does. Since our application’s only job is to report its findings, we’ll just display a message if we encounter a problem.

Example 11-24. A first attempt at handling I/O exceptions

try
{
    List<FileNameGroup> filesGroupedByName =
        InspectDirectories(recurseIntoSubdirectories, directoriesToSearch);

    DisplayMatches(filesGroupedByName);
    Console.ReadKey();
}
catch (PathTooLongException ptlx)
{
    Console.WriteLine("The specified path was too long");
    Console.WriteLine(ptlx.Message);
}
catch (DirectoryNotFoundException dnfx)
{
    Console.WriteLine("The specified directory was not found");
    Console.WriteLine(dnfx.Message);
}
catch (IOException iox)
{
    Console.WriteLine(iox.Message);
}
catch (UnauthorizedAccessException uax)
{
    Console.WriteLine("You do not have permission to access this directory.");
    Console.WriteLine(uax.Message);
}
catch (ArgumentException ax)
{
    Console.WriteLine("The path provided was not valid.");
    Console.WriteLine(ax.Message);
}
finally
{
    if (testDirectoriesMade)
    {
        CleanupTestDirectories(directoriesToSearch);
    }
}

We’ve decided to provide specialized handling for the PathTooLongException and DirectoryNotFoundException exceptions, as well as generic handling for IOException (which, of course, we have to catch after the exceptions derived from it).
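If you are ever unsure whether a particular exception belongs to the IOException family (and therefore where its catch block has to go), you can ask the type system. This is just an illustrative sketch; ExceptionFamily and IsIoException are hypothetical names of our own.

```csharp
using System;
using System.IO;

static class ExceptionFamily
{
    // Hypothetical helper: true if the given exception type is IOException
    // itself or derives from it, so a catch (IOException) block handles it.
    public static bool IsIoException(Type exceptionType)
    {
        return typeof(IOException).IsAssignableFrom(exceptionType);
    }
}
```

Calling IsIoException with typeof(PathTooLongException) or typeof(DirectoryNotFoundException) returns true, which is why those catch blocks must appear before the one for IOException. Calling it with typeof(UnauthorizedAccessException) returns false, so that exception needs a catch block of its own.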

In addition to those IOException-derived types, we’ve also caught UnauthorizedAccessException. This is a security exception, rather than an I/O exception, and so it derives from a different base (SystemException). It is thrown if the user does not have permission to access the directory concerned.

Let’s see that in operation, by creating an additional test directory and denying ourselves access to it. Example 11-25 shows a function to create a directory where we deny ourselves the ListDirectory permission.

Example 11-25. Denying permission

private static string CreateDeniedDirectory(string parentPath)
{
    string deniedDirectory = Path.GetRandomFileName();
    string fullDeniedPath = Path.Combine(parentPath, deniedDirectory);
    string userName = WindowsIdentity.GetCurrent().Name;
    DirectorySecurity ds = new DirectorySecurity();
    FileSystemAccessRule fsarDeny =
        new FileSystemAccessRule(
            userName,
            FileSystemRights.ListDirectory,
            AccessControlType.Deny);
    ds.AddAccessRule(fsarDeny);

    Directory.CreateDirectory(fullDeniedPath, ds);
    return fullDeniedPath;
}

We can call it from our MakeTestDirectories method, as Example 11-26 shows (along with suitable modifications to the code to accommodate the extra directory).

Example 11-26. Modifying MakeTestDirectories for permissions test

private static string[] MakeTestDirectories()
{
    // ...
    // Let's make three test directories
    // and leave space for a fourth to test access denied behavior
    var directories = new string[4];
    for (int i = 0; i < directories.Length - 1; ++i)
    {
        ... as before ...
    }

    CreateTestFiles(directories.Take(3));

    directories[3] = CreateDeniedDirectory(localApplicationData);

    return directories;
}

But hold on a moment, before you build and run this. If we’ve denied ourselves permission to look at that directory, how are we going to delete it again in our cleanup code? Fortunately, because we own the directory that we created, we can modify the permissions again when we clean up.

Finding and Modifying Permissions

Example 11-27 shows a method which can give us back full control over any directory (providing we have the permission to change the permissions). This code makes some assumptions about the existing permissions, but that’s OK here because we created the directory in the first place.

Example 11-27. Granting access to a directory

private static void AllowAccess(string directory)
{
    DirectorySecurity ds = Directory.GetAccessControl(directory);

    string userName = WindowsIdentity.GetCurrent().Name;

    // Remove the deny rule
    FileSystemAccessRule fsarDeny =
        new FileSystemAccessRule(
            userName,
            FileSystemRights.ListDirectory,
            AccessControlType.Deny);
    ds.RemoveAccessRuleSpecific(fsarDeny);

    // And add an allow rule
    FileSystemAccessRule fsarAllow =
        new FileSystemAccessRule(
            userName,
            FileSystemRights.FullControl,
            AccessControlType.Allow);
    ds.AddAccessRule(fsarAllow);

    Directory.SetAccessControl(directory, ds);
}

Notice how we’re using the GetAccessControl method on Directory to get hold of the directory security information. We then construct a filesystem access rule which matches the deny rule we created earlier, and call RemoveAccessRuleSpecific on the DirectorySecurity information we retrieved. This matches the rule up exactly, and then removes it if it exists (or does nothing if it doesn’t).

Finally, we add an allow rule to the set to give us full control over the directory, and then call the Directory.SetAccessControl method to set those permissions on the directory itself.

Let’s call that method from our cleanup code, compile, and run. (Don’t forget, we’re deleting files and directories, and changing permissions, so take care!)

Here’s some sample output:

C:\Users\mwa\AppData\Local\ufmnho4z.h5p
C:\Users\mwa\AppData\Local\5chw4maf.xyu
C:\Users\mwa\AppData\Local\s1ydovhu.0wk
You do not have permission to access this directory.
Access to the path 'C:\Users\mwa\AppData\Local\byjijkza.3cj\' is denied.

These methods make it relatively easy to manage permissions when you create and manipulate files, but they don’t make it easy to decide what those permissions should be! It is always tempting just to make everything available to anyone—you can get your code compiled and “working” much quicker that way; but only for “not very secure” values of “working,” and that’s something that has to be of concern for every developer.

Warning

Your application could be the one that miscreants decide to exploit to turn your users’ PCs to the dark side.

I warmly recommend that you crank UAC up to the maximum (and put up with the occasional security dialog), run Visual Studio as a nonadministrator (as far as is possible), and think at every stage about the least possible privileges you can grant to your users that will still let them get their work done. Making your app more secure benefits everyone: not just your own users, but everyone who doesn’t receive a spam email or a hack attempt because the bad guys couldn’t exploit your application.

We’ve now handled the exception nicely—but is stopping really the best thing we could have done? Would it not be better to log the fact that we were unable to access particular directories, and carry on? Similarly, if we get a DirectoryNotFoundException or FileNotFoundException, wouldn’t we want to just carry on in this case? The fact that someone has deleted the directory from underneath us shouldn’t matter to us.

If we look again at our sample, it might be better to catch the DirectoryNotFoundException and FileNotFoundException inside the InspectDirectories method to provide a more fine-grained response to errors. Also, if we look at the documentation for FileInfo, we’ll see that it may actually throw a base IOException under some circumstances, so we should catch that here, too. And in all cases, we need to catch the security exceptions.

We’re relying on LINQ to iterate through the files and folders, which means it’s not entirely obvious where to put the exception handling. Example 11-28 shows the code from InspectDirectories that iterates through the folders, to get a list of files. We can’t put exception handling code into the middle of that query.

Example 11-28. Iterating through the directories

var allFilePaths = from directory in directoriesToSearch
                   from file in Directory.GetFiles(directory, "*.*",
                                                   searchOption)
                   select file;

However, we don’t have to. The simplest way to solve this is to put the code that gets the files from a directory into a separate method, so we can add exception handling, as Example 11-29 shows.

Example 11-29. Putting exception handling in a helper method

private static IEnumerable<string> GetDirectoryFiles(
    string directory, SearchOption searchOption)
{
    try
    {
        return Directory.GetFiles(directory, "*.*", searchOption);
    }
    catch (DirectoryNotFoundException dnfx)
    {
        Console.WriteLine("Warning: The specified directory was not found");
        Console.WriteLine(dnfx.Message);
    }
    catch (UnauthorizedAccessException uax)
    {
        Console.WriteLine(
            "Warning: You do not have permission to access this directory.");
        Console.WriteLine(uax.Message);
    }

    return Enumerable.Empty<string>();
}

This method defers to Directory.GetFiles, but in the event of one of the expected errors, it displays a warning, and then just returns an empty collection.

Note

There’s a problem here when we ask GetFiles to search recursively: if it encounters a problem with even just one directory, the whole operation throws, and you’ll end up not looking in any directories. So Example 11-29 makes a difference only when the user passes multiple directories on the command line; it’s not all that useful when using the /sub option. If you wanted to make your error handling more fine-grained still, you could write your own recursive directory search. The GetAllFilesInDirectory example in Chapter 7 shows how to do that.
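As a rough illustration of what such a hand-rolled search might look like, here is a sketch that handles errors one directory at a time. This is not the Chapter 7 GetAllFilesInDirectory example; TolerantSearch and GetFilesSkippingFailures are hypothetical names, and we assume that printing a warning and moving on is the behavior we want.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TolerantSearch
{
    // Hypothetical helper: walks the tree below root one directory at a
    // time, so a failure in one directory costs us only that directory.
    public static List<string> GetFilesSkippingFailures(string root)
    {
        var results = new List<string>();
        var pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string current = pending.Pop();
            try
            {
                // Non-recursive calls: we do the recursion ourselves.
                results.AddRange(Directory.GetFiles(current));
                foreach (string subdirectory in Directory.GetDirectories(current))
                {
                    pending.Push(subdirectory);
                }
            }
            catch (DirectoryNotFoundException dnfx)
            {
                Console.WriteLine("Warning: " + dnfx.Message);
            }
            catch (UnauthorizedAccessException uax)
            {
                Console.WriteLine("Warning: " + uax.Message);
            }
        }

        return results;
    }
}
```

Using an explicit stack rather than a recursive method keeps the try block small and avoids deep call stacks on deep directory trees. The sketch makes no attempt to detect cycles caused by links in the filesystem, which a production version would need to consider.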

If we modify the LINQ query to use GetDirectoryFiles, as shown in Example 11-30, the overall progress will be undisturbed by the error handling.

Example 11-30. Iterating in the face of errors

var allFilePaths = from directory in directoriesToSearch
                   from file in GetDirectoryFiles(directory,
                                                  searchOption)
                   select file;

And we can use a similar technique for the LINQ query that populates the fileNameGroups—it uses FileInfo, and we need to handle exceptions for that. Example 11-31 iterates through a list of paths, and returns details for each file that it was able to access successfully, displaying errors otherwise.

Example 11-31. Handling exceptions from FileInfo

private static IEnumerable<FileDetails> GetDetails(IEnumerable<string> paths)
{
    foreach (string filePath in paths)
    {
        FileDetails details = null;
        try
        {
            FileInfo info = new FileInfo(filePath);
            details = new FileDetails
            {
                FilePath = filePath,
                FileSize = info.Length
            };
        }
        catch (FileNotFoundException fnfx)
        {
            Console.WriteLine("Warning: The specified file was not found");
            Console.WriteLine(fnfx.Message);
        }
        catch (IOException iox)
        {
            Console.Write("Warning: ");
            Console.WriteLine(iox.Message);
        }
        catch (UnauthorizedAccessException uax)
        {
            Console.WriteLine(
                "Warning: You do not have permission to access this file.");
            Console.WriteLine(uax.Message);
        }

        if (details != null)
        {
            yield return details;
        }
    }
}

We can use this from the final LINQ query in InspectDirectories. Example 11-32 shows the modified query.

Example 11-32. Getting details while tolerating errors

var fileNameGroups = from filePath in allFilePaths
                     let fileNameWithoutPath = Path.GetFileName(filePath)
                     group filePath by fileNameWithoutPath into nameGroup
                     select new FileNameGroup
                     {
                         FileNameWithoutPath = nameGroup.Key,
                         FilesWithThisName = GetDetails(nameGroup).ToList()
                     };

Again, this enables the query to process all accessible items, while reporting errors for any problematic files without having to stop completely. If we compile and run again, we see the following output:

C:\Users\mwa\AppData\Local\dcyx0fv1.hv3
C:\Users\mwa\AppData\Local\0nf2wqwr.y3s
C:\Users\mwa\AppData\Local\kfilxte4.exy
Warning: You do not have permission to access this directory.
Access to the path 'C:\Users\mwa\AppData\Local\r2gl4q1a.ycp\' is denied.
SameNameAndContent.txt
----------------------
C:\Users\mwa\AppData\Local\dcyx0fv1.hv3
C:\Users\mwa\AppData\Local\0nf2wqwr.y3s
C:\Users\mwa\AppData\Local\kfilxte4.exy

We’ve dealt cleanly with the directory to which we did not have access, and have continued with the job to a successful conclusion.

Now that we’ve found a few candidate files that may (or may not) be the same, can we actually check to see that they are, in fact, identical, rather than just coincidentally having the same name and length?