2.5. Erasing Files Securely

Problem

You want to erase a file securely, preventing recovery of any data via "undelete" tools or any inspection of the disk for data that has been left behind.

Solution

Write over the data in the file multiple times, varying the data written each time. You should write both random and patterned data for maximum effectiveness.

Discussion

Warning

It is extremely difficult, if not outright impossible, to guarantee that the contents of a file are completely unrecoverable on modern operating systems that offer logging filesystems, virtual memory, and other such features.

Securely deleting files from disk is not as simple as issuing a system call to delete the file from the filesystem. The first problem is that most delete operations do not do anything to the data; they merely delete any underlying metadata that the filesystem uses to associate the file contents with the filename. The storage space where the actual data is stored is then marked free and will be reclaimed whenever the filesystem needs that space.
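
To make the point concrete, here is a minimal sketch, assuming a POSIX system, showing that removing a file's name does nothing to its contents. The helper name data_survives_unlink( ) is hypothetical and exists only for this illustration. It writes some bytes, unlinks the name, and then reads the same bytes back through the descriptor that is still open.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo helper: returns 1 if the bytes written to path can
 * still be read through an open descriptor after the name is unlinked,
 * 0 if they cannot, and -1 on error. */
int data_survives_unlink(const char *path) {
  static const char secret[] = "sensitive data";
  char    readback[sizeof(secret)];
  ssize_t n;
  int     fd, same;

  if ((fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600)) == -1) return -1;
  if (write(fd, secret, sizeof(secret)) != (ssize_t)sizeof(secret) ||
      unlink(path) == -1 ||                /* removes only the directory entry */
      lseek(fd, 0, SEEK_SET) != 0) {
    close(fd);
    unlink(path);                          /* best-effort cleanup */
    return -1;
  }
  n = read(fd, readback, sizeof(readback));
  same = (n == (ssize_t)sizeof(secret) &&
          memcmp(secret, readback, sizeof(secret)) == 0);
  close(fd);
  return same;
}
```

The sketch only demonstrates that the delete call itself leaves the contents alone. Once the last descriptor is closed, the blocks are simply marked free, still holding whatever was last written to them.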

As a result, to truly erase the data you need to overwrite it with nonsense before the filesystem delete operation is performed. Often, this overwriting is implemented by simply zeroing all the bytes in the file. While this will certainly erase the file from the perspective of most conventional utilities, the fact that most data is stored on magnetic media complicates matters.

More sophisticated tools can analyze the actual media and reveal the data that was previously stored on it. This type of data recovery has a limit, however. If the data is sufficiently overwritten on the media, it does become unrecoverable, masked by the new data that has overwritten it. A variety of factors, such as the type of data written and the characteristics of the media, determine the point at which the interesting data becomes unrecoverable.

A technique developed by Peter Gutmann provides an algorithm involving multiple passes of data written to the disk to delete a file securely. The passes involve both specific patterns and random data written to the disk. The paper detailing this technique is available from http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html.
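
As a rough summary, the order of passes used by the implementation in this recipe can be written down as a small table. The type and variable names here (pass_kind, pass_group, wipe_schedule, total_passes) are hypothetical and serve only to illustrate the schedule; consult the paper for the reasoning behind each pattern.

```c
#include <stddef.h>

enum pass_kind { PASS_RANDOM, PASS_SINGLE_BYTE, PASS_THREE_BYTE };

struct pass_group {
  enum pass_kind kind;   /* what is written */
  int            count;  /* number of consecutive passes of that kind */
};

/* Order of passes as performed by the implementation in this recipe. */
static const struct pass_group wipe_schedule[] = {
  { PASS_RANDOM,      4  },
  { PASS_SINGLE_BYTE, 2  },  /* 0x55, then 0xaa */
  { PASS_THREE_BYTE,  3  },  /* 0x92 0x49 0x24 and its two rotations */
  { PASS_SINGLE_BYTE, 16 },  /* 0x00, 0x11, ... 0xff */
  { PASS_THREE_BYTE,  6  },  /* rotations of 0x92 0x49 0x24 and 0x6d 0xb6 0xdb */
  { PASS_RANDOM,      4  }
};

/* Adds up the pass counts of every group in the schedule. */
int total_passes(void) {
  size_t i;
  int    total = 0;

  for (i = 0;  i < sizeof(wipe_schedule) / sizeof(wipe_schedule[0]);  i++)
    total += wipe_schedule[i].count;
  return total;
}
```

Summing the groups gives 35 passes over the file, which is why wiping large files this way is slow.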

Unfortunately, many factors work against securely wiping the contents of a file. Many modern operating systems employ complex filesystems that may cause several copies of any given file to exist in some form at various locations on the media. Other modern operating system features, such as virtual memory, often defeat the goal of securely obliterating any traces of sensitive data.

One of the worst things that can happen is that filesystem caching will turn multiple writes into a single write operation. On some platforms, calling fsync( ) on the file after each pass will generally cause the filesystem to flush the contents of the file to disk. On others, that is not necessarily sufficient. Doing a better job requires knowing about the operating system on which your code is running. For example, you might be able to wait 10 minutes between passes, and ensure that the cached file has been written to disk at least once in that time frame. Below, we provide an implementation of Peter Gutmann's secure file-wiping algorithm, assuming fsync( ) is enough.

Tip

On Windows XP and Windows Server 2003, you can use the cipher command with the /w flag to securely wipe unused portions of NTFS filesystems.

We provide three functions:

spc_fd_wipe( ): Overwrites the contents of a file identified by the specified file descriptor in accordance with Gutmann's algorithm. If an error occurs while performing the wipe operation, the return value is -1; otherwise, a successful operation returns zero.
spc_file_wipe( ): A wrapper around the first function, which uses a FILE object instead of a file descriptor. If an error occurs while performing the wipe operation, the return value is -1; otherwise, a successful operation returns zero.
SpcWipeFile( ): A Windows-specific function that uses the Win32 API for file access. It requires an open file handle as its only argument and returns a boolean indicating success or failure.

Note that for all three functions, the file descriptor, FILE object, or file handle passed as an argument must be open with write access to the file to be wiped; otherwise, the wiping functions will fail. As written, these functions will probably not work very well on media other than disk because they are constantly seeking back to the beginning of the file. Another issue that may arise is filesystem caching: not all of the writes made to the file may actually reach the physical media.

#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
   
#define SPC_WIPE_BUFSIZE 4096
   
static int write_data(int fd, const void *buf, size_t nbytes) {
  size_t  towrite, written = 0;
  ssize_t result;
   
  do {
    if (nbytes - written > SSIZE_MAX) towrite = SSIZE_MAX;
    else towrite = nbytes - written;
    if ((result = write(fd, (const char *)buf + written, towrite)) >= 0)
      written += result;
    else if (errno != EINTR) return 0;
  } while (written < nbytes);
  return 1;
}
   
static int random_pass(int fd, size_t nbytes)
{
  size_t        towrite;
  unsigned char buf[SPC_WIPE_BUFSIZE];
   
  if (lseek(fd, 0, SEEK_SET) != 0) return -1;
  while (nbytes > 0) {
    towrite = (nbytes > sizeof(buf) ? sizeof(buf) : nbytes);
    /* spc_rand( ) fills buf with random bytes; it is not defined in this listing */
    spc_rand(buf, towrite);
    if (!write_data(fd, buf, towrite)) return -1;
    nbytes -= towrite;
  }
  fsync(fd);
  return 0;
}
   
static int pattern_pass(int fd, unsigned char *buf, size_t bufsz, size_t filesz) {
  size_t towrite;
   
  if (!bufsz || lseek(fd, 0, SEEK_SET) != 0) return -1;
  while (filesz > 0) {
    towrite = (filesz > bufsz ? bufsz : filesz);
    if (!write_data(fd, buf, towrite)) return -1;
    filesz -= towrite;
  }
  fsync(fd);
  return 0;
}
   
int spc_fd_wipe(int fd) {
  int           count, i, pass, patternsz;
  struct stat   st;
  unsigned char buf[SPC_WIPE_BUFSIZE], *pattern;
   
  static unsigned char single_pats[16] = {
    0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
    0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff
  };
  static unsigned char triple_pats[6][3] = {
    { 0x92, 0x49, 0x24 }, { 0x49, 0x24, 0x92 }, { 0x24, 0x92, 0x49 },
    { 0x6d, 0xb6, 0xdb }, { 0xb6, 0xdb, 0x6d }, { 0xdb, 0x6d, 0xb6 }
  };
   
  if (fstat(fd, &st) == -1) return -1;
  if (!st.st_size) return 0;
   
  for (pass = 0;  pass < 4;  pass++)
    if (random_pass(fd, st.st_size) == -1) return -1;
   
  memset(buf, single_pats[5], sizeof(buf));
  if (pattern_pass(fd, buf, sizeof(buf), st.st_size) == -1) return -1;
  memset(buf, single_pats[10], sizeof(buf));
  if (pattern_pass(fd, buf, sizeof(buf), st.st_size) == -1) return -1;
   
  patternsz = sizeof(triple_pats[0]);
  for (pass = 0;  pass < 3;  pass++) {
    pattern = triple_pats[pass];
    count   = sizeof(buf) / patternsz;
    for (i = 0;  i < count;  i++)
      memcpy(buf + (i * patternsz), pattern, patternsz);
    if (pattern_pass(fd, buf, patternsz * count, st.st_size) == -1) return -1;
  }
   
  for (pass = 0;  pass < sizeof(single_pats);  pass++) {
    memset(buf, single_pats[pass], sizeof(buf));
    if (pattern_pass(fd, buf, sizeof(buf), st.st_size) == -1) return -1;
  }
   
  for (pass = 0;  pass < sizeof(triple_pats) / patternsz;  pass++) {
    pattern = triple_pats[pass];
    count   = sizeof(buf) / patternsz;
    for (i = 0;  i < count;  i++)
      memcpy(buf + (i * patternsz), pattern, patternsz);
    if (pattern_pass(fd, buf, patternsz * count, st.st_size) == -1) return -1;
  }
   
  for (pass = 0;  pass < 4;  pass++)
    if (random_pass(fd, st.st_size) == -1) return -1;
  return 0;
}
   
int spc_file_wipe(FILE *f) {
  return spc_fd_wipe(fileno(f));
}
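
A typical caller wipes the file and then removes its name. The following is a minimal sketch of that sequence; wipe_then_unlink( ) is a hypothetical helper, not part of the functions described in this recipe. It takes the wiping routine as a function pointer so that the sketch stands on its own, with the expectation that spc_fd_wipe( ) would normally be passed in.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical wrapper: open path with write access, run wipe on the
 * descriptor (for example, spc_fd_wipe), then remove the name.
 * Returns 0 on success, -1 on error. */
int wipe_then_unlink(const char *path, int (*wipe)(int fd)) {
  int fd, rc;

  if ((fd = open(path, O_WRONLY)) == -1) return -1;
  rc = wipe(fd);
  if (close(fd) == -1) rc = -1;
  if (rc == -1) return -1;     /* leave the file in place if wiping failed */
  return unlink(path);
}
```

A call such as wipe_then_unlink("secrets.txt", spc_fd_wipe) would overwrite the file and then delete it. The file is deliberately left in place when the wipe fails so that the caller can retry.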

The Unix implementations should work on Windows systems using the standard C runtime API; however, it is rare that the standard C runtime API is used on Windows. The following code implements SpcWipeFile( ), which is virtually identical to the standard C version except that it uses only Win32 APIs for file access.

#include <windows.h>
#include <wincrypt.h>
   
#define SPC_WIPE_BUFSIZE 4096
   
static BOOL RandomPass(HANDLE hFile, HCRYPTPROV hProvider, DWORD dwFileSize)
{
  BYTE  pbBuffer[SPC_WIPE_BUFSIZE];
  DWORD cbBuffer, cbTotalWritten, cbWritten;
   
  if (SetFilePointer(hFile, 0, 0, FILE_BEGIN) == 0xFFFFFFFF) return FALSE;
  while (dwFileSize > 0) {
    cbBuffer = (dwFileSize > sizeof(pbBuffer) ? sizeof(pbBuffer) : dwFileSize);
    if (!CryptGenRandom(hProvider, cbBuffer, pbBuffer)) return FALSE;
    /* loop until the whole chunk has been written */
    for (cbTotalWritten = 0;  cbTotalWritten < cbBuffer;  cbTotalWritten += cbWritten)
      if (!WriteFile(hFile, pbBuffer + cbTotalWritten, cbBuffer - cbTotalWritten,
                     &cbWritten, 0)) return FALSE;
    dwFileSize -= cbTotalWritten;
  }
  /* flush to disk so that successive passes are not merged in the cache */
  return FlushFileBuffers(hFile);
}
   
static BOOL PatternPass(HANDLE hFile, BYTE *pbBuffer, DWORD cbBuffer, DWORD dwFileSize) {
  DWORD cbTotalWritten, cbWrite, cbWritten;
   
  if (!cbBuffer || SetFilePointer(hFile, 0, 0, FILE_BEGIN) == 0xFFFFFFFF) return FALSE;
  while (dwFileSize > 0) {
    cbWrite = (dwFileSize > cbBuffer ? cbBuffer : dwFileSize);
    /* loop until the whole chunk has been written */
    for (cbTotalWritten = 0;  cbTotalWritten < cbWrite;  cbTotalWritten += cbWritten)
      if (!WriteFile(hFile, pbBuffer + cbTotalWritten, cbWrite - cbTotalWritten,
                     &cbWritten, 0)) return FALSE;
    dwFileSize -= cbTotalWritten;
  }
  /* flush to disk so that successive passes are not merged in the cache */
  return FlushFileBuffers(hFile);
}
   
BOOL SpcWipeFile(HANDLE hFile) {
  BYTE       pbBuffer[SPC_WIPE_BUFSIZE];
  DWORD      dwCount, dwFileSize, dwIndex, dwPass;
  HCRYPTPROV hProvider;
   
  static BYTE  pbSinglePats[16] = {
    0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
    0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff
  };
  static BYTE  pbTriplePats[6][3] = {
    { 0x92, 0x49, 0x24 }, { 0x49, 0x24, 0x92 }, { 0x24, 0x92, 0x49 },
    { 0x6d, 0xb6, 0xdb }, { 0xb6, 0xdb, 0x6d }, { 0xdb, 0x6d, 0xb6 }
  };
  static DWORD cbPattern = sizeof(pbTriplePats[0]);
   
  if ((dwFileSize = GetFileSize(hFile, 0)) == INVALID_FILE_SIZE) return FALSE;
  if (!dwFileSize) return TRUE;
   
  if (!CryptAcquireContext(&hProvider, 0, 0, 0, CRYPT_VERIFYCONTEXT))
    return FALSE;
   
  for (dwPass = 0;  dwPass < 4;  dwPass++)
    if (!RandomPass(hFile, hProvider, dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
   
  memset(pbBuffer, pbSinglePats[5], sizeof(pbBuffer));
  if (!PatternPass(hFile, pbBuffer, sizeof(pbBuffer), dwFileSize)) {
    CryptReleaseContext(hProvider, 0);
    return FALSE;
  }
  memset(pbBuffer, pbSinglePats[10], sizeof(pbBuffer));
  if (!PatternPass(hFile, pbBuffer, sizeof(pbBuffer), dwFileSize)) {
    CryptReleaseContext(hProvider, 0);
    return FALSE;
  }
   
  cbPattern = sizeof(pbTriplePats[0]);
  for (dwPass = 0;  dwPass < 3;  dwPass++) {
    dwCount   = sizeof(pbBuffer) / cbPattern;
    for (dwIndex = 0;  dwIndex < dwCount;  dwIndex++)
      CopyMemory(pbBuffer + (dwIndex * cbPattern), pbTriplePats[dwPass],
                  cbPattern);
    if (!PatternPass(hFile, pbBuffer, cbPattern * dwCount, dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
  }
   
  for (dwPass = 0;  dwPass < sizeof(pbSinglePats);  dwPass++) {
    memset(pbBuffer, pbSinglePats[dwPass], sizeof(pbBuffer));
    if (!PatternPass(hFile, pbBuffer, sizeof(pbBuffer), dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
  }
   
  for (dwPass = 0;  dwPass < sizeof(pbTriplePats) / cbPattern;  dwPass++) {
    dwCount   = sizeof(pbBuffer) / cbPattern;
    for (dwIndex = 0;  dwIndex < dwCount;  dwIndex++)
      CopyMemory(pbBuffer + (dwIndex * cbPattern), pbTriplePats[dwPass],
                  cbPattern);
    if (!PatternPass(hFile, pbBuffer, cbPattern * dwCount, dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
  }
   
  for (dwPass = 0;  dwPass < 4;  dwPass++)
    if (!RandomPass(hFile, hProvider, dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
   
  CryptReleaseContext(hProvider, 0);
  return TRUE;
}