Suppose that we picture the logical address space of a process as the vertical axis on a graph. As time goes by we keep moving to the right at intervals, making a mark everywhere the instruction counter has been in that interval. We would end up with a scatter of marks tracing the path of execution. We could instead imagine that we unwound a thread and laid it on the graph instead of marking the space with a pencil. This analogy gave rise to the phrase “thread of execution.” Now suppose that we stopped this process and saved all the data that represented the CPU state in a table (we might call it a thread control block, or TCB). Then further suppose we started the process over from the beginning. Again we let it run for a time, then stopped it and saved the CPU state in another TCB.
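The idea of a TCB can be pictured as a small record holding just enough CPU state to resume a thread later. A minimal sketch in Python follows; the field names are illustrative, not taken from any particular operating system:

```python
from dataclasses import dataclass, field

@dataclass
class TCB:
    """Hypothetical thread control block: just enough state to resume a thread."""
    thread_id: int
    instruction_pointer: int = 0   # where this thread should resume executing
    stack_pointer: int = 0         # top of this thread's private stack
    registers: dict = field(default_factory=dict)  # saved general-purpose registers

# "Saving the CPU state" then amounts to filling in one of these records,
# and "restoring" it means copying the fields back into the CPU.
saved = TCB(thread_id=1, instruction_pointer=0x4004F0, stack_pointer=0x7FFE0000)
print(saved.thread_id)
```

Note how little is in the record; this is exactly why switching between threads is cheap compared with switching between processes.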
We could now go back and restore the saved CPU state of the first thread and resume its execution. How would this be different from running multiple processes? There are several ways that using multiple threads can be better than using multiple processes. For one thing, we only have one copy of the program and data in memory, so we have more memory to use for other things.
For another thing, there is much smaller overhead in switching between threads than in switching between processes, since we are only saving the CPU state. Generally, this means saving only a few registers (including the instruction pointer) and the stack pointer. On some computers, this can be done with a single instruction. We are not saving accounting data, OS scheduling information, or execution statistics.
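The point about a single shared copy of the data can be seen directly with Python's `threading` module. In this sketch, every thread reads and writes the same list; with separate processes, each would have its own copy and would need interprocess communication to share results:

```python
import threading

shared_results = []  # one copy of the data, visible to every thread

def worker(name):
    # Each thread appends to the same list -- no copying, no message passing.
    shared_results.append(name)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_results))  # → [0, 1, 2, 3]
```

The actual switching cost is an OS-level matter that Python cannot show, but the shared address space is visible here: all four workers wrote into the one list.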
Actually, when we start a second thread we don’t really start it at the beginning of the process. Recall that we just said that the various threads share a single copy of the program code and the data that the process owns. Normally, one of the first things a process does is to initialize data tables. Since the first thread has already done this setup we don’t want the second thread to redo it.
More to the point, the startup of a second thread is not something that the OS does on its own. It is initiated by the running process in order to let the system do more work on behalf of the process without incurring that heavy overhead of a full process context switch.
For this reason, a thread is sometimes called a lightweight process. As a process runs, it will reach a point where some additional work can be done in parallel with the work the main thread is doing, so the process (parent thread) will start a child thread to do that extra work. Picture two threads in a single process on such a graph: the first thread drawn as a solid line. At some point, it calls the OS to start a second thread, drawn as a dotted line. Eventually, the OS switches back to the first thread.
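The parent/child pattern just described can be sketched as follows. The key detail from above is that the child does not rerun the program from the beginning: it starts at the function the parent names when it creates the thread, so the setup done by the first thread is not redone. (The function and dictionary names are illustrative.)

```python
import threading

results = {}

def child_work():
    # The child does NOT restart the process from the top; it begins here,
    # at the entry point the parent named, with setup already done.
    results["child"] = "extra work done"

def main():
    results["setup"] = "tables initialized"      # done once, by the first thread
    child = threading.Thread(target=child_work)  # parent starts the child thread
    child.start()
    # ... the parent continues its own work in parallel here ...
    child.join()                                 # wait for the child to finish

main()
print(results)
```

After `main()` returns, both the one-time setup and the child's extra work are present in the single shared `results` dictionary.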
One example of how threads work can be seen in a word processing program. As this is being written, a check shows that the word processor has 18 threads running. Some are fairly obvious, but it is hard to come up with 18:
- foreground keystroke handling
- display updating
- spelling checker
- grammar checker
- “smart tag” recognition
- periodic file save
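The periodic file save in the list above is a good candidate for a background thread: it wakes at fixed intervals while the foreground thread keeps handling keystrokes. A sketch, with the actual disk write stubbed out and the interval shortened for demonstration:

```python
import threading
import time

def autosave(stop_event, interval, saves):
    # Background thread: wake up every `interval` seconds until told to stop.
    # Event.wait returns False on timeout, True once stop_event is set.
    while not stop_event.wait(interval):
        saves.append(time.monotonic())  # stand-in for writing the document to disk

stop = threading.Event()
saves = []
saver = threading.Thread(target=autosave, args=(stop, 0.05, saves), daemon=True)
saver.start()

time.sleep(0.2)   # the foreground thread would be handling keystrokes here
stop.set()        # tell the autosave thread to exit cleanly
saver.join()
print(len(saves))
```

Using `Event.wait` for the delay, rather than `time.sleep`, lets the foreground thread stop the saver immediately instead of waiting out a full interval.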
Another example is commonly seen in server applications such as a Web server. One thread will wait for incoming HTTP requests and start a new thread to handle each one. The handler thread will perform (at least) these steps:
- Parse the incoming request
- Look up the requested file
- Read the page
- Format it for output
- Request the transmission of the page
This sequence keeps the handling of each request in a separate thread of execution and makes the program logic much simpler than having a single process keep track of the state of hundreds or thousands of individual requests.
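The thread-per-request pattern above can be sketched without real sockets; here the parsing, file lookup, and transmission steps are stubbed out, and the request strings and function names are illustrative:

```python
import threading

def handle_request(raw_request, responses, lock):
    # Each request runs in its own thread, so the logic is a simple
    # straight-line sequence rather than a state machine tracking
    # hundreds of requests at once.
    path = raw_request.split()[1]                  # parse the incoming request
    page = f"<html>contents of {path}</html>"      # look up and read the page (stubbed)
    response = f"HTTP/1.0 200 OK\r\n\r\n{page}"    # format it for output
    with lock:
        responses.append(response)                 # request transmission (stubbed)

responses, lock = [], threading.Lock()
requests = ["GET /index.html HTTP/1.0", "GET /about.html HTTP/1.0"]
threads = [threading.Thread(target=handle_request, args=(r, responses, lock))
           for r in requests]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(responses))  # → 2, one response per request
```

Each handler is plain sequential code; the concurrency between requests comes entirely from running one handler per thread.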
User-level threads versus kernel-level threads
Historically, the use of multiple processes came before the idea of threads. When programmers realized that switching between processes used so many resources and slowed things down so much, they began to develop the concept of threads. However, the operating systems of the time did not have thread support built into them, so the original development of threads was done as a set of library subroutines.
Of course, this meant that the entire thread package ran in user mode and the OS was unaware that an application was trying to keep multiple activities running in parallel. Accordingly, if any of the threads in a process made a system call that would block for some reason, the entire application, including all the threads of that application, would be blocked at the same time. Such a thread package is referred to as a user-level thread package because it runs entirely in user mode. Designing programs that utilize such user thread libraries must, therefore, be done very carefully so that one thread does not put the entire process to sleep.
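The flavor of a user-level thread package can be mimicked with Python generators: the "scheduler" below is ordinary user-mode code, and the kernel knows nothing about the threads it is juggling. The sketch also shows the weakness just described: when one thread makes a blocking call, the whole process, scheduler and all, stops. (This is an analogy, not how any particular thread library is implemented.)

```python
import time
from collections import deque

def user_thread(name, log):
    for step in range(2):
        log.append((name, step))
        yield              # cooperative yield back to the user-mode scheduler

def blocking_thread(log):
    log.append(("blocker", "about to block"))
    time.sleep(0.1)        # a blocking call: the kernel suspends the whole
    yield                  # process, so no other user-level thread runs meanwhile

def run(threads):
    # Round-robin user-mode "scheduler": to the OS, all of this is one thread.
    ready = deque(threads)
    while ready:
        t = ready.popleft()
        try:
            next(t)        # let this user-level thread run until it yields
            ready.append(t)
        except StopIteration:
            pass           # thread finished; drop it from the ready queue

log = []
run([user_thread("A", log), blocking_thread(log), user_thread("B", log)])
print(log)
```

While `time.sleep` is executing, threads A and B cannot make progress even though they are ready to run; this is exactly why programs built on user-level thread libraries must avoid letting any one thread block the process.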