What is multithreading?

Thread is short for a thread of execution. A programmer can split his or her work into threads that run simultaneously and share the same memory context. Unless your code depends on third-party resources, multithreading will not speed it up on a single-core processor, and will even add some overhead for thread management. Multithreading will benefit from a multiprocessor or multicore machines where each thread can be executed on a separate CPU core, thus making the program run faster. This is a general rule that should hold true for most programming languages. In Python, the performance benefit from multithreading on multicore CPUs has some limits, which we will discuss later. For the sake of simplicity, let's assume for now that this statement is also true for Python.

The fact that the same context is shared among threads means you must protect data from uncontrolled concurrent accesses. If two intertwined threads update the same data without any protection, there might be a situation where subtle timing variation in thread execution can alter the final result in an unexpected way. To better understand this problem, imagine that there are two threads that increment the value of a shared variable in a non-atomic sequence of steps, for example:

counter_value = shared_counter
shared_counter = counter_value + 1

Now, let's assume that the shared_counter variable has the initial value of 0. Now, imagine that two threads process the same code in parallel, as follows:

Thread 1

Thread 2

counter_value = shared_counter      # counter_value = 0
shared_counter = counter_value + 1  # shared_counter = 0 + 1

counter_value = shared_counter      # counter_value = 0
shared_counter = counter_value + 1  # shared_counter = 0 + 1

Depending on the exact timing and how processor context will be switched, it is possible that the result of running two such threads will be either 1 or 2. Such a situation is called a race hazard or race condition, and is one of the most hated culprits of hard to debug software bugs.

Lock mechanisms help in protecting data, and thread programming has always been a matter of making sure that the resources are accessed by threads in a safe way. But unwary usage of locks can introduce a set of new issues on its own. The worst problem occurs when, due to a wrong code design, two threads lock a resource and try to obtain a lock on the other resource that the other thread has locked before. They will wait for each other forever. This situation is called a deadlock, and is similarly hard to debug. Reentrant locks help a bit in this by making sure a thread doesn't get locked by attempting to lock a resource twice.

Nevertheless, when threads are used for isolated needs with tools that were built for them, they might increase the speed of the program.

Multithreading is usually supported at the system kernel level. When the machine has a single processor with a single core, the system uses a time slicing mechanism. Here, the CPU switches from one thread to another so fast that there is an illusion of threads running simultaneously. This is done at the processing level as well. Parallelism without multiple processing units is obviously virtual, and there is no performance gain from running multiple threads on such hardware. Anyway, sometimes, it is still useful to implement code with threads, even if it has to execute on a single core, and we will look at a possible use case later.

Everything changes when your execution environment has multiple processors or multiple CPU cores for its disposition. Even if time slicing is used, processes and threads are distributed among CPUs, providing the ability to run your program faster.

Let's take a look at how Python deals with threads.