This behavior--that index scans will flip into sequential ones once they access enough of a table--is one of the more surprising aspects to many people. Random disk access is so much slower than sequential scans that it makes sense to do so. But if all your data is cached in RAM (as is the case with the examples shown here), the advantages aren't as obvious. And that may very well always be the case with your data. In that case, you could even consider lowering database query parameters such as random_page_cost to make the optimizer understand your true situation.
This whole area is one of the things to be wary of when executing your own queries that expect to use indexes. You can test them out using some subset of your data, or perhaps using the initial production set. If you then deploy onto a system with more rows, or the production system expands its data size, these plans will change. What was once an indexed query might turn into an unindexed full sequential scan. What can be even more frustrating for developers is the opposite: a query coded against a trivial subset of the data might always use a sequential scan, just because the table is small and the selectivity of the index low. Run the same query against the production dataset, where neither is true anymore, and it could instead run using an index.
Whether an index will be used or not is a combination of both the query and the underlying table it's executing against, determined based on the statistics about the table. There is a mechanism that exists in PostgreSQL to manipulate the planning. In that mechanism, any scan can be disabled or enabled; and disabling and enabling of the scan affects the planning.
The following is the list of configurations that affect the query planning:
- enable_bitmapscan
- enable_hashagg
- enable_hashjoin
- enable_indexscan
- enable_indexonlyscan
- enable_material
- enable_mergejoin
- enable_nestloop
- enable_seqscan
- enable_sort
- enable_tidscan
Complete details of each can be found at PostgreSQL's official documentation:
http://www.postgresql.org/docs/current/static/runtime-config-query.html
In the case of the index scan, it can be forced by disabling sequential scans (enable_seq_sacns=false) and disabling nest-loops (enable_nestloop=false). If the index is still not used, the only reason behind this is the cost or the condition, which is not fulfilled for that index