This section will explore how to find the index of the two smallest items in an unsorted list using three quite different algorithms. We’ll go through a top-down design using each approach.
To start, suppose we have data showing the number of humpback whales sighted off the coast of British Columbia over the past ten years:
809 |
834 |
477 |
478 |
307 |
122 |
96 |
102 |
324 |
476 |
The first value, 809, represents the number of sightings ten years ago; the last one, 476, represents the number of sightings last year.
We’ll start with a simpler problem: what is the smallest value during those years? This code tells us just that:
| >>> counts = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> min(counts) |
| 96 |
If we want to know in which year the population bottomed out, we can use list.index to find the index of the smallest value:
| >>> counts = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> low = min(counts) |
| >>> counts.index(low) |
| 6 |
Or, more succinctly:
| >>> counts = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> counts.index(min(counts)) |
| 6 |
Now, what if we want to find the indices of the two smallest values? Lists don’t have a method to do this directly, so we’ll have to design an algorithm ourselves and then translate it to a Python function. Here is the header for a function that does this:
| from typing import List, Tuple |
| |
| def find_two_smallest(L: List[float]) -> Tuple[int, int]: |
| """Return a tuple of the indices of the two smallest values in list L. |
| |
| >>> items = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> find_two_smallest(items) |
| (6, 7) |
| >>> items == [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| True |
| """ |
As you may recall from Designing New Functions: A Recipe, the next step in the function design recipe is to write the function body.
There are at least three distinct algorithms, each of which will be subjected to top-down design. We’ll start by giving a high-level description of each. Each of these descriptions is the first step in doing a top-down design for that approach.
Find, remove, find. Find the index of the minimum, remove it from the list, and find the index of the new minimum item in the list. After we have the second index, we need to put back the value we removed and, if necessary, adjust the second index to account for that removal and reinsertion.
Sort, identify minimums, get indices. Sort the list, get the two smallest numbers, and then find their indices in the original list.
Walk through the list. Examine each value in the list in order, keep track of the two smallest values found so far, and update these values when a new smaller value is found.
The first two algorithms mutate the list, either by removing an item or by sorting the list. It is vital that our algorithms put things back the way we found them, or the people who call our functions are going to be annoyed with us. The last two lines of the docstring checks this for us.
While you are investigating these algorithms in the next few pages, consider this question: Which one is the fastest?
Here is the algorithm again, rewritten with one instruction per line and explicitly discussing the parameter L:
| from typing import List, Tuple |
| |
| def find_two_smallest(L: List[float]) -> Tuple[int, int]: |
| """Return a tuple of the indices of the two smallest values in list L. |
| |
| >>> items = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> find_two_smallest(items) |
| (6, 7) |
| >>> items == [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| True |
| """ |
| # Find the index of the minimum item in L |
| # Remove that item from the list |
| # Find the index of the new minimum item in the list |
| # Put the smallest item back in the list |
| # If necessary, adjust the second index |
| # Return the two indices |
To address the first step, Find the index of the minimum item in L, we skim the output produced by calling help(list) and find that there are no methods that do exactly that. We’ll refine it:
| def find_two_smallest(L): |
| """ (see above) """ |
| |
| # Get the minimum item in L <-- This line is new |
| # Find the index of that minimum item <-- This line is new |
| # Remove that item from the list |
| # Find the index of the new minimum item in the list |
| # Put the smallest item back in the list |
| # If necessary, adjust the second index |
| # Return the two indices |
Those first two statements match Python functions and methods: min does the first, and list.index does the second. (There are other ways; for example, we could have written a loop to do the search.)
We see that list.remove does the third statement, and the refinement of “Find the index of the new minimum item in the list” is also straightforward.
Notice that we’ve left some of our English statements in as comments, which makes it easier to understand the problem that each chunk of code solves:
| def find_two_smallest(L): |
| """ (see above) """ |
| |
| # Find the index of the minimum and remove that item |
| smallest = min(L) |
| min1 = L.index(smallest) |
| L.remove(smallest) |
| |
| # Find the index of the new minimum |
| next_smallest = min(L) |
| min2 = L.index(next_smallest) |
| |
| # Put the smallest item back in the list |
| # If necessary, adjust the second index |
| # Return the two indices |
Since we removed the smallest item, we need to put it back where it was. Because removing a value affects the indices of the following values, we might need to add 1 to min2 if the smallest item came before the second-smallest item:
| def find_two_smallest(L): |
| """ (see above) """ |
| |
| # Find the index of the minimum and remove that item |
| smallest = min(L) |
| min1 = L.index(smallest) |
| L.remove(smallest) |
| |
| # Find the index of the new minimum |
| next_smallest = min(L) |
| min2 = L.index(next_smallest) |
| |
| # Put smallest back into L |
| # Fix min2 in case it was affected by the removal and reinsertion: |
| # If min1 comes before min2, add 1 to min2 |
| # Return the two indices |
That’s enough refinement (finally!) to do it all in Python:
| from typing import List, Tuple |
| |
| def find_two_smallest(L: List[float]) -> Tuple[int, int]: |
| """Return a tuple of the indices of the two smallest values in list L. |
| |
| >>> items = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> find_two_smallest(items) |
| (6, 7) |
| >>> items == [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| True |
| """ |
| |
| # Find the index of the minimum and remove that item |
| smallest = min(L) |
| min1 = L.index(smallest) |
| L.remove(smallest) |
| |
| # Find the index of the new minimum |
| next_smallest = min(L) |
| min2 = L.index(next_smallest) |
| |
| # Put smallest back into L |
| L.insert(min1, smallest) |
| |
| # Fix min2 in case it was affected by the removal and reinsertion: |
| if min1 <= min2: |
| min2 += 1 |
| |
| return (min1, min2) |
That seems like a lot of thought and care, and it is. However, even if you go right to code, you’ll have to think through all those steps. By writing them down first, you have a better chance of getting it right with a minimum amount of work.
Here is the second algorithm rewritten with one instruction per line:
| from typing import List, Tuple |
| |
| def find_two_smallest(L: List[float]) -> Tuple[int, int]: |
| """Return a tuple of the indices of the two smallest values in list L. |
| |
| >>> items = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> find_two_smallest(items) |
| (6, 7) |
| >>> items == [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| True |
| """ |
| |
| # Sort a copy of L |
| # Get the two smallest numbers |
| # Find their indices in the original list L |
| # Return the two indices |
That looks straightforward; we can use built-in function sorted, which returns a copy of the list with the items in order from smallest to largest. We could have used method list.sort to sort L, but that breaks a fundamental rule: never mutate the contents of parameters unless the docstring says to.
| def find_two_smallest(L): |
| """ (see above) """ |
| |
| # Get a sorted copy of the list so that the two smallest items are at the |
| # front |
| temp_list = sorted(L) |
| smallest = temp_list[0] |
| next_smallest = temp_list[1] |
| |
| # Find their indices in the original list L |
| # Return the two indices |
Now we can find the indices and return them the same way we did in find-remove-find:
| from typing import List, Tuple |
| |
| def find_two_smallest(L: List[float]) -> Tuple[int, int]: |
| """Return a tuple of the indices of the two smallest values in list L. |
| |
| >>> items = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> find_two_smallest(items) |
| (6, 7) |
| >>> items == [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| True |
| """ |
| |
| # Get a sorted copy of the list so that the two smallest items are at the |
| # front |
| temp_list = sorted(L) |
| smallest = temp_list[0] |
| next_smallest = temp_list[1] |
| |
| # Find the indices in the original list L |
| min1 = L.index(smallest) |
| min2 = L.index(next_smallest) |
| |
| return (min1, min2) |
Our last algorithm starts the same way as for the first two:
| from typing import List, Tuple |
| |
| def find_two_smallest(L: List[float]) -> Tuple[int, int]: |
| """Return a tuple of the indices of the two smallest values in list L. |
| |
| >>> items = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> find_two_smallest(items) |
| (6, 7) |
| >>> items == [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| True |
| """ |
| |
| # Examine each value in the list in order |
| # Keep track of the indices of the two smallest values found so far |
| # Update the indices when a new smaller value is found |
| # Return the two indices |
We’ll move the second line before the first one because it describes the whole process; it isn’t a single step. Also, when we see phrases like each value, we think of iteration; the third line is part of that iteration, so we’ll indent it:
| def find_two_smallest(L): |
| """ (see above) """ |
| |
| # Keep track of the indices of the two smallest values found so far |
| # Examine each value in the list in order |
| # Update the indices when a new smaller value is found |
| # Return the two indices |
Every loop has three parts: an initialization section to set up the variables we’ll need, a loop condition, and a loop body. Here, the initialization will set up min1 and min2, which will be the indices of the smallest two items encountered so far. A natural choice is to set them to the first two items of the list:
| def find_two_smallest(L): |
| """ (see above) """ |
| |
| # Set min1 and min2 to the indices of the smallest and next-smallest |
| # values at the beginning of L |
| # Examine each value in the list in order |
| # Update the indices when a new smaller value is found |
| # Return the two indices |
We can turn that first line into a couple lines of code; we’ve left our English version in as a comment:
| def find_two_smallest(L): |
| """ (see above) """ |
| |
| # Set min1 and min2 to the indices of the smallest and next-smallest |
| # Values at the beginning of L |
| if L[0] < L[1]: |
| min1, min2 = 0, 1 |
| else: |
| min1, min2 = 1, 0 |
| |
| # Examine each value in the list in order |
| # Update the indices when a new smaller value is found |
| # Return the two indices |
We have a couple of choices now. We can iterate with a for loop over the values, a for loop over the indices, or a while loop over the indices. Since we’re trying to find indices and we want to look at all of the items in the list, we’ll use a for loop over the indices—and we’ll start at index 2 because we’ve examined the first two values already. At the same time, we’ll refine the statement in the body of the loop to mention min1 and min2.
| def find_two_smallest(L): |
| """ (see above) """ |
| |
| # Set min1 and min2 to the indices of the smallest and next-smallest |
| # values at the beginning of L |
| if L[0] < L[1]: |
| min1, min2 = 0, 1 |
| else: |
| min1, min2 = 1, 0 |
| |
| # Examine each value in the list in order |
| for i in range(2, len(values)): |
| # Update min1 and/or min2 when a new smaller value is found |
| # Return the two indices |
Now for the body of the loop. We’ll pick apart “update min1 and/or min2 when a new smaller value is found.” Here are the possibilities:
If L[i] is smaller than both min1 and min2, then we have a new smallest item; so min1 currently holds the second smallest, and min2 currently holds the third smallest. We need to update both of them.
If L[i] is larger than min1 and smaller than min2, we have a new second smallest.
If L[i] is larger than both, we skip it.
| def find_two_smallest(L): |
| """ (see above) """ |
| |
| # Set min1 and min2 to the indices of the smallest and next-smallest |
| # values at the beginning of L |
| if L[0] < L[1]: |
| min1, min2 = 0, 1 |
| else: |
| min1, min2 = 1, 0 |
| |
| # Examine each value in the list in order |
| for i in range(2, len(L)): |
| # |
| # L[i] is smaller than both min1 and min2, in between, or |
| # larger than both: |
| # If L[i] is smaller than min1 and min2, update them both |
| # If L[i] is in between, update min2 |
| # If L[i] is larger than both min1 and min2, skip it |
| |
| return (min1, min2) |
All of those are easily translated to Python; in fact, we don’t even need code for the “larger than both” case:
| from typing import List, Tuple |
| |
| def find_two_smallest(L: List[float]) -> Tuple[int, int]: |
| """Return a tuple of the indices of the two smallest values in list L. |
| |
| >>> items = [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| >>> find_two_smallest(items) |
| (6, 7) |
| >>> items == [809, 834, 477, 478, 307, 122, 96, 102, 324, 476] |
| True |
| """ |
| |
| # Set min1 and min2 to the indices of the smallest and next-smallest |
| # values at the beginning of L |
| if L[0] < L[1]: |
| min1, min2 = 0, 1 |
| else: |
| min1, min2 = 1, 0 |
| |
| # Examine each value in the list in order |
| for i in range(2, len(L)): |
| # L[i] is smaller than both min1 and min2, in between, or |
| # larger than both |
| |
| # New smallest? |
| if L[i] < L[min1]: |
| min2 = min1 |
| min1 = i |
| # New second smallest? |
| elif L[i] < L[min2]: |
| min2 = i |
| |
| return (min1, min2) |