Opening a File

When you want to write a program that opens and reads a file, that program needs to tell Python where that file is. By default, Python assumes that the file you want to read is in the same directory as the program that is doing the reading. If you’re working in IDLE as you read this book, there’s a little setup you should do:

  1. Make a directory, perhaps called file_examples.

  2. In IDLE, select File → New Window and type (or copy and paste) the following:

     First line of text
     Second line of text
     Third line of text
  3. Save this file in your file_examples directory under the name file_example.txt.

  4. In IDLE, select File → New Window and type (or copy and paste) this program:

     file = open('file_example.txt', 'r')
     contents = file.read()
     file.close()
     print(contents)
  5. Save this as file_reader.py in your file_examples directory.

When you run this program, this is what gets printed:

 First line of text
 Second line of text
 Third line of text

It’s important that you save the two files in the same directory, as you’ll see in the next section. Also, typing those same commands into the Python shell will probably fail, because the shell usually starts in a different directory and won’t find file_example.txt there.

Built-in function open opens a file (much like you open a book when you want to read it) and returns an object that knows how to get information from the file. This object also keeps track of how much you’ve read and which part of the file you’re about to read next. The marker that keeps track of the current location in the file is called a file cursor and acts much like a bookmark. The file cursor is initially at the beginning of the file, but as we read or write data it moves to the end of what we just read or wrote.
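To watch the file cursor move, here is a minimal sketch. It makes some assumptions that are not part of the example above: it creates its own copy of file_example.txt in a temporary directory (using module tempfile and the 'w' mode), and it passes a number to read, which asks for that many characters instead of the whole file. It also uses the with statement, which is described in the next section.

```python
import os
import tempfile

# Work in a throwaway directory so the sketch doesn't touch your real files.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'file_example.txt')

    # Create a three-line file like the one from the setup steps.
    with open(path, 'w') as file:
        file.write('First line of text\nSecond line of text\nThird line of text\n')

    file = open(path, 'r')
    first_part = file.read(5)   # read 5 characters; the cursor moves past them
    rest = file.read()          # read from the cursor to the end of the file
    at_end = file.read()        # the cursor is at the end, so nothing is left
    file.close()

print(repr(first_part))   # 'First'
print(repr(at_end))       # ''
```

The second call on read does not start over at the beginning. It picks up where the first call stopped, so rest begins with ' line of text'.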

The first argument in the example call on function open, 'file_example.txt', is the name of the file to open, and the second argument, 'r', tells Python that you want to read the file; this is called the file mode. Other options for the mode include 'w' for writing and 'a' for appending, which you’ll see later in this chapter. If you call open with only the name of the file (omitting the mode), then the default is 'r'.
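As a quick check of the three modes, the following sketch writes a file, appends to it, and then opens it without a mode. The file name mode_example.txt and the temporary directory are assumptions made only for this illustration. The file object’s mode attribute reports which mode was used.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'mode_example.txt')   # hypothetical file name

    with open(path, 'w') as file:   # 'w': create the file (or erase it) and write
        file.write('First line of text\n')

    with open(path, 'a') as file:   # 'a': keep what is there and add to the end
        file.write('Second line of text\n')

    with open(path) as file:        # no mode given, so the file is opened for reading
        default_mode = file.mode
        contents = file.read()

print(default_mode)   # r
print(contents)
```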

The second statement, contents = file.read(), tells Python that you want to read the contents of the entire file into a string, which we assign to a variable called contents.

The third statement, file.close(), releases all resources associated with the open file object.

The last statement prints the string.

When you run the program, you’ll see that newline characters are treated just like every other character; a newline character is just another character in the file.
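One way to see the newline characters is to look at the string with function repr, which shows each newline as \n instead of starting a new line. This is only a sketch: it builds its own three-line file in a temporary directory (without a newline after the last line) so that it can run anywhere.

```python
import os
import tempfile

text = 'First line of text\nSecond line of text\nThird line of text'

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'file_example.txt')
    with open(path, 'w') as file:
        file.write(text)

    file = open(path, 'r')
    contents = file.read()
    file.close()

print(repr(contents))         # the newlines show up as \n inside the string
print(contents.count('\n'))   # 2: one newline between each pair of lines
```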

The with Statement

Here’s a common programming pattern: get access to a resource, do something with the resource, and then tidy up and release the resource. In the previous file example, we gained access to a file by calling function open, then we read the file contents, and then we tidied up by closing the file.

There’s a catch: if an error occurs while we are working with the file, the statement file.close() might never be executed, and the associated resources are never released. Python provides a with statement for situations like this where we always want to tidy up, regardless of whether an error occurs. For this reason, the with statement is frequently used for file access.

Here is the same example using a with statement:

 with open('file_example.txt', 'r') as file:
     contents = file.read()

 print(contents)

The general form of a with statement is as follows:

 with expression as variable:
     block
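To check that a with statement really does tidy up after an error, here is a small sketch. The error is one we cause on purpose (converting the file’s text to an int), and the temporary directory is an assumption made so the code can run on its own. A file object’s closed attribute tells us whether the file has been closed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'file_example.txt')
    with open(path, 'w') as file:
        file.write('First line of text\n')

    try:
        with open(path, 'r') as file:
            contents = file.read()
            number = int(contents)   # raises ValueError: the text is not a number
    except ValueError:
        pass                         # the error happened inside the with block

    closed_after_error = file.closed

print(closed_after_error)   # True
```

Even though the block stopped partway through, the file was closed before the error reached the except clause.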

How Files Are Organized on Your Computer

A file path specifies a location in your computer’s file system. A file path contains the sequence of directories to a file, starting at the root directory at the top of the file system, and optionally includes the name of a file.

Here is an example of the file path for file_example.txt:

/Users/pgries/Desktop/file_examples/file_example.txt

This file path is on a computer running Apple OS X. A file path in Linux would look similar. Both operating systems use a forward slash as the directory separator.

In Microsoft Windows, the path usually begins with a drive letter, such as C:. There is one drive letter per disk partition. Also, Microsoft Windows uses a backslash as the directory separator. (When working with backslashes as directory separators, you might want to review Using Special Characters in Strings.)

Here is a path in Windows:

C:\Users\pgries\Desktop\file_examples\file_example.txt

If you always use forward slashes, your paths will work in Windows as well, so you can write them the same way on every operating system. This is similar in spirit to the way Python’s file-handling operations automatically translate the two kinds of newlines that you learned about in Normalizing Line Endings.
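If you want to take the two example paths apart, module pathlib from the standard library can help. It is not used elsewhere in this section, so treat this as an optional sketch. The classes PurePosixPath and PureWindowsPath only manipulate the text of a path and never touch the file system, which means both work no matter which operating system you are running.

```python
from pathlib import PurePosixPath, PureWindowsPath

unix_path = PurePosixPath('/Users/pgries/Desktop/file_examples/file_example.txt')

# The Windows path is written with forward slashes on purpose.
windows_path = PureWindowsPath('C:/Users/pgries/Desktop/file_examples/file_example.txt')

print(unix_path.parts)      # the root directory, each directory, then the file name
print(unix_path.name)       # file_example.txt
print(windows_path.drive)   # C:
print(windows_path)         # C:\Users\pgries\Desktop\file_examples\file_example.txt
```

Notice that the Windows path prints with backslashes even though it was typed with forward slashes.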

Specifying Which File You Want

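Python keeps track of the current working directory; this is the directory in which it looks for files. When you run a Python program from IDLE, the current working directory is the directory where that program is saved. (If you instead start a program from a command line, the current working directory is the directory you were in when you started it.) For example, perhaps this is the path of the file that you have open in IDLE: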

/home/pgries/Documents/py3book/Book/code/fileproc/program.py

Then this is the current working directory:

/home/pgries/Documents/py3book/Book/code/fileproc

When you call function open, it looks for the specified file in the current working directory.

The default current working directory for the Python shell is operating system dependent. You can find out the current working directory using function getcwd from module os:

 >>> import os
 >>> os.getcwd()
 '/home/pgries'

If you want to open a file in a different directory, you need to say where that file is. You can do that with an absolute path or with a relative path. An absolute path (like all the previous examples) is one that starts at the root of the file system, and a relative path is relative to the current working directory. Alternatively, you can change Python’s current working directory to a different directory using function chdir (short for “change directory”):

 >>> os.chdir('/home/pgries/Documents/py3book')
 >>> os.getcwd()
 '/home/pgries/Documents/py3book'
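Here is a sketch of how chdir affects open. The directory is a temporary one created just for this illustration (an assumption, since your own directories will differ), and the code changes back to the original directory when it is done so that the rest of your session is unaffected.

```python
import os
import tempfile

original_directory = os.getcwd()

with tempfile.TemporaryDirectory() as folder:
    # Put a file in the new directory using its full path.
    with open(os.path.join(folder, 'file_example.txt'), 'w') as file:
        file.write('First line of text\n')

    os.chdir(folder)
    try:
        # Now the bare file name is enough, because open looks in
        # the current working directory.
        with open('file_example.txt', 'r') as file:
            contents = file.read()
        in_new_directory = os.path.samefile(os.getcwd(), folder)
    finally:
        os.chdir(original_directory)   # go back before the directory is removed

print(contents)
print(in_new_directory)   # True
```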

Let’s say that you have a program called reader.py and a directory called data in the same directory as reader.py. Inside data you might have files called data1.txt and data2.txt. This is how you would open data1.txt:

 open('data/data1.txt', 'r')

Here, data/data1.txt is a relative path.
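If you are unsure which file a relative path refers to, functions isabs and abspath from module os.path can tell you. This sketch does not open anything, so it runs even if data/data1.txt does not exist on your computer. The absolute path it prints depends on your own current working directory.

```python
import os

relative_path = 'data/data1.txt'

# abspath combines the current working directory with the relative path.
absolute_path = os.path.abspath(relative_path)

print(os.path.isabs(relative_path))   # False
print(os.path.isabs(absolute_path))   # True
print(absolute_path)                  # starts with your current working directory
```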

To look in the directory above the current working directory, you can use two dots:

 open('../data1.txt', 'r')

You can chain them to go up multiple directories. Here, Python looks for data1.txt three directories above the current working directory and then down into a data directory:[6]

 open('../../../data/data1.txt', 'r')
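To see where the dots lead, we can work the paths out by hand or let Python do it. The sketch below assumes the current working directory is the fileproc directory from the earlier example, and it uses module posixpath so that it produces forward-slash paths on any operating system. The helper function resolve is a name made up for this illustration. Function normpath only rewrites the text of the path; it does not check that the directories exist.

```python
import posixpath

# Assumed current working directory, taken from the earlier example.
current_directory = '/home/pgries/Documents/py3book/Book/code/fileproc'

def resolve(relative_path):
    """Return the absolute path that relative_path refers to when the
    current working directory is current_directory."""
    return posixpath.normpath(posixpath.join(current_directory, relative_path))

up_one = resolve('../data1.txt')
up_three = resolve('../../../data/data1.txt')

print(up_one)     # /home/pgries/Documents/py3book/Book/code/data1.txt
print(up_three)   # /home/pgries/Documents/py3book/data/data1.txt
```

Each pair of dots removes one directory from the end of the current working directory before the rest of the path is added.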