Basic query optimization

In order to start optimizing this query, we must go through what I have called previously initial considerations of query optimization. For the sake of this example, let's say that this table will be the object of READ queries more than WRITE queries because the data will remain fairly static once written to the table. Also, it is important to note that creating an index on the first_name column of the actor table will make the index prone to generating equivocity due to non-unique values in this column. Moreover, let's say that scalability is important because we intend on having many users querying this table every hour and that the table size should remain manageable on a long-term period.

This being given and known, we will therefore proceed to create an index on the first_name column of the actor table:

 # MariaDB > CREATE INDEX idx_first_name ON actor(first_name); 

Once done, MariaDB confirms the creation of the index:

Confirmation that the index was created

Now that the index has been created, we obtain this result when asking the database engine to EXPLAIN its execution plan:

The execution plan is now optimized

The type column's value is now ref, possible_keys is idx_first_name, key is idx_first_name, ref is const, rows is 1 and Extra is Using index condition. As we can see, the engine has now identified our newly created index as a possible key to use and then proceeds to use it. It uses the constant value given in our query to perform the lookup in the index and considers only one row when accessing the table. All of this is great but, as we expected in our initial considerations, the index is composed of non-unique values. The possible equivocity amongst values of the table column might lead to a degenerated index over time, hence the access type of ref and the extra information indicating that the engine is Using index condition, which means that the WHERE clause is pushed down to the table engine for optimization at the index level. In this example, with the admitted initial considerations, this is, in the absolute sense, the best query optimization that we can do, as it is impossible to get unique values in the first_name column of the actor table. But, in fact, there is a possible optimization depending on the domain use case. If we only wish to use the actor's first name, then we could further optimize the Using index condition in the Extra column by only selecting the appropriate column, thus allowing the database engine to only access the index:

 # MariaDB > EXPLAIN SELECT first_name FROM actor WHERE first_name = 'AL'; 

The database engine then confirms that it is only using the index in the Extra column:

The 'Extra' column now contains the information "Using where; Using index"

And, how does all of this translate into overall performance? Let's run a few benchmark tests in order to measure the effects of our changes.

First, we will run a benchmark test without the index. On the container's CLI, run the following command:

# mysqlslap --user=root --host=localhost --concurrency=1000 --number-of-queries=10000 --create-schema=sakila --query="SELECT * FROM actor WHERE first_name = 'AL';" --delimiter=";" --verbose --iterations=2 --debug-info;

Here are the results without the index:

The results of the benchmark test WITHOUT the use of the index

And, the results with the index:

The results of the benchmark test WITH the use of the index

And, finally, let's run the same command while only selecting the appropriate column, thus limiting the lookup to the index only:

# mysqlslap --user=root --host=localhost --concurrency=1000 --number-of-queries=10000 --create-schema=sakila --query="SELECT first_name FROM actor WHERE first_name = 'AL';" --delimiter=";" --verbose --iterations=2 --debug-info; 

Here are the results of this last benchmark test:

The results of the benchmark test WITH the use of the index ONLY

The benchmark test results clearly show that our query optimization did indeed satisfy our initial scalability assumption, especially if we were to see the table grow in size and our database become more popular with a growing number of users over time.