File creation in Python takes a single call to open(): you assign the result to a variable that will represent the file, give it a filename, and tell Python that you want to write to it.
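As a minimal sketch of those steps, assuming Python 3, the snippet below creates a small file and confirms that it exists. The temporary directory and the names workdir, filename, and outfile are illustrative choices, not anything the text prescribes.

```python
import os
import tempfile

# Hypothetical scratch directory so the example leaves your own files alone.
workdir = tempfile.mkdtemp()
filename = os.path.join(workdir, "data.txt")

# The variable outfile represents the file; 'w' asks for write access.
outfile = open(filename, "w")
outfile.write("first line\n")
outfile.close()

# The file now exists on disk.
created = os.path.exists(filename)
print(created)
```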
If you don't expressly tell Python that you want to write to a file, it will be opened in read-only mode. This acts as a safety feature to prevent you from accidentally overwriting files. In addition to the standard w to indicate writing and r for reading, Python supports several other file access modes:
- a: Appends all output to the end of the file; it does not overwrite information currently present. If the indicated file does not exist, it is created.
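- r: Opens a file for input (reading). If the file does not exist, an IOError exception is raised (in Python 3 the specific exception is FileNotFoundError, a subclass of OSError, and IOError survives as an alias for OSError).
- r+: Opens a file for input and output. If the file does not exist, the same exception is raised.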
- w: Opens a file for output (writing). If the file exists, it is overwritten. If the file does not exist, one is created.
- w+: Opens a file for input and output. If the file exists, it is overwritten; otherwise one is created.
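- ab, rb, r+b, wb, w+b: Opens a file for binary, non-textual input or output. (Note: in Python 2 the b flag changes behavior only on Windows, where text mode translates line endings; *nix systems treat both modes the same. In Python 3 the flag matters on every platform: binary mode reads and writes bytes objects, with no decoding and no newline translation.)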
When using standard files, most of the information will be alphanumeric in nature, so the default text mode is usually what you want; the extra binary modes exist for non-textual data such as images or compressed archives. Unless you have a specific need, text mode will be fine for most of your tasks.
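The access modes listed above can be tried out with a short sketch like the following, assuming Python 3. The file names and the variables text, raw, write_refused, and missing_raised are made up for illustration. The code catches FileNotFoundError, which is the Python 3 subclass of OSError raised for a missing file (IOError is an alias for OSError there).

```python
import io
import os
import tempfile

# Hypothetical scratch directory for the demonstration.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "modes.txt")

# 'w' creates the file, or truncates it if it already exists.
f = open(path, "w")
f.write("one\n")
f.close()

# 'a' adds to the end without touching what is already there.
f = open(path, "a")
f.write("two\n")
f.close()

# No mode given: the file is opened read-only, so writing is refused.
f = open(path)
text = f.read()
try:
    f.write("three\n")
    write_refused = False
except io.UnsupportedOperation:
    write_refused = True
f.close()

# 'rb' returns bytes rather than str.
f = open(path, "rb")
raw = f.read()
f.close()

# 'r' on a missing file raises an error instead of creating the file.
try:
    open(os.path.join(workdir, "missing.txt"), "r")
    missing_raised = False
except FileNotFoundError:
    missing_raised = True

print(repr(text), type(raw).__name__, write_refused, missing_raised)
```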
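A typical command to open a file to write to might look like this: open('data.txt', 'w'). An optional third argument can be added for buffering control. A value of 0 turns buffering off, so data is written to the file immediately without being held temporarily in memory. Python 2 accepts open('data.txt', 'w', 0); Python 3 allows unbuffered I/O only in binary mode, so the equivalent call is open('data.txt', 'wb', 0), and the text-mode form raises a ValueError. Unbuffered writes reduce the chance of losing data if the program crashes, at the expense of speed.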
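The effect of the buffering argument can be illustrated with a rough sketch, assuming Python 3, where a buffering value of 0 is accepted only for binary files. The names unbuffered, buffered, and size_before_close are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch directory for the demonstration.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "data.txt")

# Text mode combined with buffering=0 is rejected in Python 3.
try:
    f = open(path, "w", 0)
    f.close()
    text_unbuffered_ok = True
except ValueError:
    text_unbuffered_ok = False

# Binary mode accepts buffering=0: each write goes straight to the OS,
# so the data is visible on disk before the file is closed.
payload = b"written immediately\n"
unbuffered = open(path, "wb", 0)
unbuffered.write(payload)
size_before_close = os.path.getsize(path)
unbuffered.close()

# With default buffering, data may sit in memory until flush() or close().
buffered = open(path, "w")
buffered.write("held in a buffer\n")
buffered.flush()  # push the buffer out without closing the file
buffered.close()

print(text_unbuffered_ok, size_before_close)
```

Calling flush() is a middle ground: the file keeps its buffer for speed, and you decide when the data must reach the operating system.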
Here is a list of common Python file operations:
- output = open('/tmp/spam', 'w'): Create output file ('w' means write)
- input = open('data', 'r'): Create input file ('r' means read, and is the default file operation)
- append = open('file.txt', 'a'): Append more data to the end of the file without overwriting
- S = input.read(): Read entire file into a single string
- S = input.read(n): Read up to n characters (n bytes if the file was opened in binary mode)
- S = input.readline(): Read next line (through end-line marker)
- L = input.readlines(): Read entire file into list of line strings; note that this is different from read() in that readlines() splits the file into separate lines, placed into a list
- output.write(S): Write string S onto file
- output.writelines(L): Write all line strings in list L onto file (no line endings are added, so each string must carry its own)
- output.close(): Manual close
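The operations in the list above can be combined into one round trip. This is a small sketch assuming Python 3 and text mode; the file lives in a temporary directory rather than /tmp/spam, and the name infile stands in for input so that the built-in input() function is not hidden.

```python
import os
import tempfile

# Hypothetical scratch directory for the demonstration.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "spam.txt")

# Write one string, then a list of line strings.
output = open(path, "w")
output.write("alpha\n")
output.writelines(["beta\n", "gamma\n"])  # newlines must already be present
output.close()

# Read the file back piece by piece.
infile = open(path, "r")
first_five = infile.read(5)       # the first five characters
rest_of_line = infile.readline()  # whatever is left of the first line
remaining = infile.readlines()    # every remaining line, as a list
infile.close()

# Read the whole file into a single string.
infile = open(path)
whole = infile.read()
infile.close()

print(repr(first_five), repr(rest_of_line), remaining)
```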
Python has a built-in garbage collector, so in a short script you don't strictly need to close your files manually; once an object is no longer referenced within memory, the object's memory space is automatically reclaimed, and a file object closes its underlying file as part of that cleanup. This applies to all objects in Python, including files.
However, it's recommended to close files manually in large systems; it won't hurt anything, and it's a good habit in case you ever have to work in a language that doesn't have garbage collection. In addition, Python implementations for other platforms, such as Jython or IronPython, do not reclaim objects the moment the last reference disappears, so you may need to close files yourself to free up resources immediately rather than waiting for garbage collection. Also, there are times when system operations have problems and a file is left open accidentally, resulting in a leaked file handle and possibly buffered data that never reaches the disk.
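Two common ways to make sure a file gets closed are sketched below, assuming Python 3. The with statement is not covered in the text above; it is the language's standard context-manager syntax and closes the file when the block ends, even if an error occurs. The file name and variable names are illustrative.

```python
import os
import tempfile

# Hypothetical scratch directory for the demonstration.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "log.txt")

# Explicit close, guarded so that it runs even if write() raises.
output = open(path, "w")
try:
    output.write("entry 1\n")
finally:
    output.close()
closed_after_finally = output.closed

# Context-manager form: the file is closed when the block is left.
with open(path, "a") as output:
    output.write("entry 2\n")
closed_after_with = output.closed

with open(path) as infile:
    contents = infile.read()

print(closed_after_finally, closed_after_with)
```

Either form behaves the same on implementations that do not close files as soon as the last reference goes away.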
The location of the file you are working with can be indicated as either an absolute path (a specific location on a drive) or a relative path (the file location in relation to the current directory); if no path is provided, the current directory is assumed.
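The difference between the two kinds of path can be seen in a short sketch, assuming Python 3. It temporarily switches the current directory to a scratch folder; the names notes.txt and sub are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch directory; remember where we started.
workdir = tempfile.mkdtemp()
original_dir = os.getcwd()
os.chdir(workdir)
try:
    # No path given: the file is created in the current directory.
    with open("notes.txt", "w") as f:
        f.write("hello\n")

    # Relative path: resolved against the current directory.
    os.mkdir("sub")
    relative_path = os.path.join("sub", "notes.txt")
    with open(relative_path, "w") as f:
        f.write("nested\n")

    # Absolute path: names the same file regardless of the current directory.
    absolute_path = os.path.abspath("notes.txt")
    with open(absolute_path) as f:
        via_absolute = f.read()
finally:
    os.chdir(original_dir)

# The absolute path still works after changing directory back.
still_reachable = os.path.exists(absolute_path)
print(os.path.isabs(relative_path), os.path.isabs(absolute_path), still_reachable)
```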