Lists are the least object-oriented of Python's data structures. While lists are, themselves, objects, there is a lot of syntax in Python to make using them as painless as possible. Unlike in many other object-oriented languages, lists in Python are simply available. We don't need to import them and rarely need to call methods on them. We can loop over a list without explicitly requesting an iterator object, and we can construct a list (as with a dictionary) with custom syntax. Further, list comprehensions and generator expressions turn them into a veritable Swiss Army knife of computing functionality.
We won't go into too much detail of the syntax; you've seen it in introductory tutorials across the web and in previous examples in this book. You can't code Python for very long without learning how to use lists! Instead, we'll be covering when lists should be used, and their nature as objects. If you don't know how to create or append to a list, how to retrieve items from a list, or what slice notation is, I direct you to the official Python tutorial, posthaste. It can be found online at http://docs.python.org/3/tutorial/.
In Python, lists should normally be used when we want to store several instances of the same type of object: lists of strings, lists of numbers, or, most often, lists of objects we've defined ourselves. Lists are the natural choice when we want to store items in some kind of order. Often, this is the order in which they were inserted, but they can also be sorted by other criteria.
As we saw in the case study from the previous chapter, lists are also very useful when we need to modify the contents: insert to, or delete from, an arbitrary location of the list, or update a value within the list.
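As a quick reminder of what those modifications look like, here is a minimal sketch. The tasks list and its contents are made up for illustration; they are not taken from the case study.

```python
# Hypothetical data, for illustration only.
tasks = ["write", "test", "ship"]

tasks.insert(1, "review")  # insert at an arbitrary position
tasks[0] = "draft"         # update a value in place
del tasks[2]               # delete by position (removes "test")
tasks.remove("ship")       # delete by value (first match only)

print(tasks)
```

Each of these operations changes the existing list object rather than building a new one.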
Like dictionaries, Python lists use an extremely efficient and well-tuned internal data structure so we can worry about what we're storing, rather than how we're storing it. Many object-oriented languages provide different data structures for queues, stacks, linked lists, and array-based lists. Python does provide specialized versions of some of these structures, for cases where optimizing access to huge sets of data is required. Normally, however, the list data structure can serve all these purposes at once, and the coder has complete control over how they access it.
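To make that concrete, the following sketch uses a plain list as a stack and, for comparison, collections.deque from the standard library as a queue. The variable names are illustrative. A list can also act as a queue via pop(0), but removing from the front of a list takes time proportional to its length, which is the kind of case where a specialized structure earns its keep.

```python
from collections import deque

# A plain list works well as a stack: push and pop at the end.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()        # last in, first out

# deque is one of the specialized structures; it removes
# items from the left end without shifting the rest.
queue = deque(["a", "b"])
queue.append("c")
first = queue.popleft()  # first in, first out

print(top, first)
```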
Don't use lists for collecting different attributes of individual items. We do not want, for example, a list of the properties a particular shape has. Tuples, named tuples, dictionaries, and objects would all be more suitable for this purpose. Programmers in some languages might create a list in which each alternate item is a different type; for example, they might write ['a', 1, 'b', 3] for our letter frequency list. They'd have to use a strange loop that accesses two elements in the list at once, or a modulus operator to determine which position was being accessed.
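Written out in Python, the two access patterns just described look roughly like this. It is a sketch of the awkward style only; the variable names are ours.

```python
flat = ["a", 1, "b", 3]

# Pattern 1: step through the list two elements at a time.
lines = []
for i in range(0, len(flat), 2):
    lines.append(f"{flat[i]}: {flat[i + 1]}")

# Pattern 2: use the modulus operator to decide what each position holds.
letters, counts = [], []
for position, item in enumerate(flat):
    if position % 2 == 0:
        letters.append(item)   # even positions hold letters
    else:
        counts.append(item)    # odd positions hold counts

print(lines, letters, counts)
```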
Don't do this in Python. We can group related items together using a dictionary, as we did in the previous section, or using a list of tuples. Here's a rather convoluted counter-example that demonstrates how we could perform the frequency example using a list. It is much more complicated than the dictionary examples, and illustrates the effect choosing the right (or wrong) data structure can have on the readability of our code. This is demonstrated as follows:
    import string

    CHARACTERS = list(string.ascii_letters) + [" "]

    def letter_frequency(sentence):
        frequencies = [(c, 0) for c in CHARACTERS]
        for letter in sentence:
            index = CHARACTERS.index(letter)
            frequencies[index] = (letter, frequencies[index][1] + 1)
        return frequencies
This code starts with a list of possible characters. The string.ascii_letters attribute provides a string of all the letters, lowercase and uppercase, in order. We convert this to a list, and then use list concatenation (the + operator builds a new list from the contents of two lists) to add one more character, a space. These are the available characters in our frequency list (the index() call would raise a ValueError if the sentence contained a character that wasn't in the list, but an exception handler could solve this).
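A short sketch of how that character list is put together, and of what happens when a character is missing from it. The unknown_found flag is a made-up name used only to record the outcome.

```python
import string

# 26 lowercase letters, then 26 uppercase letters, then a space.
CHARACTERS = list(string.ascii_letters) + [" "]

print(len(CHARACTERS))                 # 53 entries in total
print(CHARACTERS[0], CHARACTERS[26])   # first lowercase, first uppercase

# Looking up a character that is not in the list raises ValueError.
try:
    CHARACTERS.index("!")
    unknown_found = True
except ValueError:
    unknown_found = False
```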
The first line inside the function uses a list comprehension to turn the CHARACTERS list into a list of tuples. List comprehensions are an important, non-object-oriented tool in Python; we'll be covering them in detail in the next chapter.
Then, we loop over each of the characters in the sentence. We first look up the index of the character in the CHARACTERS list, which we know has the same index in our frequencies list, since we just created the second list from the first. We then update that index in the frequencies list by creating a new tuple, discarding the original one. Aside from garbage collection and memory waste concerns, this is rather difficult to read!
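For contrast, here is a minimal sketch of the same count built on a dictionary. It is not necessarily the version from the previous section; the function name letter_frequency_dict is hypothetical, and defaultdict is just one of several ways to write it.

```python
from collections import defaultdict

def letter_frequency_dict(sentence):
    # Missing keys start at 0, so no list of allowed characters is needed.
    frequencies = defaultdict(int)
    for letter in sentence:
        frequencies[letter] += 1
    return dict(frequencies)

print(letter_frequency_dict("aab"))
```

There is no index lookup and no tuple to rebuild on every character, and a character outside the expected alphabet no longer causes an error.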
Like dictionaries, lists are objects too, and they have several methods that can be invoked upon them. Here are some common ones:
- The append(element) method adds an element to the end of the list
- The insert(index, element) method inserts an item at a specific position
- The count(element) method tells us how many times an element appears in the list
- The index(element) method tells us the index of an item in the list, raising a ValueError if it can't find it
- Unlike strings, lists have no find() method; to avoid the exception for missing items, check membership with the in operator first, or catch the ValueError
- The reverse() method does exactly what it says: it turns the list around, in place
- The sort() method has some rather intricate object-oriented behaviors, which we'll cover now
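The methods listed above can be tried in a short session like the following sketch. The numbers list is made-up sample data, and sort() is called here only in its simplest form.

```python
numbers = [3, 1, 2]

numbers.append(1)          # [3, 1, 2, 1]
numbers.insert(0, 5)       # [5, 3, 1, 2, 1]
ones = numbers.count(1)    # how many times 1 appears
where = numbers.index(2)   # position of the first 2

numbers.reverse()          # reversed in place: [1, 2, 1, 3, 5]
numbers.sort()             # sorted in place:   [1, 1, 2, 3, 5]

# index() raises ValueError for a missing item, so test with "in" first
# or catch the exception.
try:
    numbers.index(42)
    missing = False
except ValueError:
    missing = True

print(numbers, ones, where, missing)
```

Note that reverse() and sort() modify the list and return None, so there is nothing useful to assign from them.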