Computer Science 101 In-Depth — Operating Systems Part 1: Processes and Scheduling
This is me writing on topics that I think make up Computer Science 101. I hope it can be helpful to people. So without much ado, let’s get started on Operating Systems.
What Is An Operating System?
An Operating System plays two roles:
(1) a resource allocator: it manages all computer resources, deciding between conflicting requests so that resources are used efficiently and fairly.
(2) a control program: it controls the execution of programs to prevent errors and improper use of the computer.
Generally, an operating system also has two main components: a kernel, which is the core of the OS and has control over everything in the system, and a shell (e.g. CMD), which is the interface used to access the services provided by the OS.
For a longer, visual introduction to operating systems, you can check out this CrashCourse video here. Otherwise, let’s get to processes and scheduling.
Processes and scheduling
What is a process?
People regurgitate that a process “is simply a program in execution”. A process is an instance of a program being run. A visual example of processes can be seen in macOS’s Activity Monitor, for instance:
If you open your own, you can see that a ‘software program’ can actually have multiple processes. A process in general has an image of the executable machine code associated with a program, some memory, descriptors of resources allocated to the process, security attributes such as the process owner and permissions, and processor state.
As a process executes, it changes states. To ensure maximal productivity, the OS needs to understand the state of all of these processes and be able to intelligently create new ones, switch between them, pause them, suspend them etc.
A program is a passive entity stored on disk (as an executable file). A process is active.
A program becomes a process when its executable file is loaded into memory. The execution of a program occurs through the shell (whether that be a GUI, e.g. a mouse click, or otherwise).
The word ‘memory’ is key here.
A process owns resources.
The stack is used for local variables and the heap is used for dynamic memory allocation. For a more detailed explanation of what the stack and heap mean, Jeff Hill writes:
- a stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.
- a heap is memory set aside for dynamic allocation. Unlike the stack, there’s no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.
Each thread gets a stack, while there’s typically only one heap for the application (although it isn’t uncommon to have multiple heaps for different types of allocation).
A stack is managed and optimised quite closely by the CPU. What is key to note, however, is that when a function exits, all of its variables are popped off the stack (and hence lost forever). The heap is not as tightly managed by the CPU. It is a free-floating region of memory (and is larger), so it is useful if you need to allocate a large block of memory (e.g. a large array, or a big struct) and you need to keep that variable around a long time (like a global). But you must manage the memory yourself, i.e. you’re in charge of allocating and freeing variables. If you don’t free() your memory (deallocating it when you don’t need it any more), the program will have a memory leak: memory on the heap will stay set aside and won't be available to other processes.
Why does this matter?
For an operating system to do its job properly and maximise productivity, it maintains a Process Control Block (PCB) for each process. The PCB lets the OS manage processes effectively: it can save a process’s current state in its PCB, switch to another process, and reload the saved state later on.
The PCB holds the information associated with each process:
- Process state: running, waiting, etc.
- Program counter: location of the next instruction to execute
- CPU registers: contents of all process-centric registers
- CPU scheduling information: priorities, scheduling queue pointers
- Memory-management information: memory allocated to the process
- Accounting information: CPU used, clock time elapsed since start, time limits
- I/O status information: I/O devices allocated to process, list of open files
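To make the PCB concrete, here is a toy sketch in Python. The field names are illustrative, chosen for this example only; a real PCB (such as Linux’s task_struct) holds far more, and a real context switch happens in the kernel, not in user code:

```python
# A toy Process Control Block. Field names are illustrative, not taken
# from any real kernel; this only mirrors the categories listed above.
from dataclasses import dataclass, field

@dataclass
class PCB:
    pid: int
    state: str = "new"          # process state: new, ready, running, waiting...
    program_counter: int = 0    # location of the next instruction
    registers: dict = field(default_factory=dict)   # saved CPU registers
    priority: int = 0           # CPU scheduling information
    memory_limits: tuple = (0, 0)                   # memory-management info
    cpu_time_used: float = 0.0  # accounting information
    open_files: list = field(default_factory=list)  # I/O status information

def context_switch(old: PCB, new: PCB, cpu_registers: dict) -> dict:
    """Conceptual context switch: save the old context, load the new one."""
    old.registers = dict(cpu_registers)  # save state of the old process
    old.state = "ready"
    new.state = "running"
    return dict(new.registers)           # reload saved state of the new process

p1 = PCB(pid=1, state="running", registers={"eax": 42})
p2 = PCB(pid=2, state="ready", registers={"eax": 7})
regs = context_switch(p1, p2, {"eax": 42})
print(p1.state, p2.state, regs["eax"])  # ready running 7
```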
How does the operating system switch between different processes?
The operating system changes which process is running through context switching. When the CPU switches to another process, the system must save the state of the old process and load the saved state of the new process via a context switch. The context of a process is represented in its PCB.
A context switch takes some time, however. This context-switch time is overhead: the system does no useful work while switching. The more complex the operating system and the PCB, the longer the context switch. [The time also depends on hardware support; some hardware provides multiple sets of registers per CPU, meaning that multiple contexts can be loaded at once.]
Context switching has overheads. Why context switch?
Context switching allows for the illusion of several applications running concurrently even on a single CPU. It allows for multiple processes to be running on a machine. You can run several apps without needing to close them e.g. running Chrome at the same time as Spotify.
Context switching allows for more work to be done: I/O bound processes can run at the same time as CPU bound ones. Most processes only require short CPU bursts anyway, so switching between several is a good solution.
Process Scheduling
The process scheduler selects among available processes for the next execution on the CPU. It maintains scheduling queues of processes: a job queue (all processes in the system), a ready queue (all processes residing in main memory, ready and waiting to execute), and device queues (processes waiting for an I/O device).
Scheduling can be broken down into two separate stages, long and short-term.
- Long-term scheduler (job scheduler): selects which processes should be brought into the ready queue; it is invoked infrequently (seconds, minutes) meaning it may be slow; and determines how many processes can run at once (controls the degree of multiprogramming).
- Short-term scheduler (CPU scheduler): selects which process should be executed next and allocates CPU. Sometimes the only scheduler in a system. The short-term scheduler is invoked frequently (milliseconds) so must be fast.
Processes can be described as either:
- I/O-bound process — spends more time doing I/O than computations, many short CPU bursts
- CPU-bound process — spends more time doing computations; few very long CPU bursts
Every process starts in the ready queue when added by the long-term scheduler, executes on the CPU for some time, then moves to other queues as appropriate. The long-term scheduler strives for a good mix of processes.
Maximum CPU utilisation is obtained with multiprogramming. Process execution consists of a cycle of CPU execution and I/O wait: a CPU burst followed by an I/O burst, repeated. The CPU burst distribution is the main concern.
The short-term scheduler selects from among the processes in ready queue, and allocates the CPU to one of them. The queue may be ordered in various ways.
CPU scheduling decisions may take place when a process:
1. Switches from running to waiting state
2. Switches from running to ready state
3. Switches from waiting to ready
4. Terminates
Scheduling under 1 and 4 is nonpreemptive.
All other scheduling is preemptive, which raises issues: consider access to shared data, preemption while in kernel mode, and interrupts occurring during crucial OS activities.
CPU Scheduling Algorithms (Short term scheduler)
Efficient CPU scheduling algorithms are essential for a good utilisation of the CPU and responsiveness of the operating system.
Different algorithms are better for different types of processes depending on duration (CPU bound/IO bound).
Criteria for Scheduling Algorithm Performance
- CPU Utilisation: the percentage of the time the CPU is executing a process.
- Waiting time: the total time a process spends waiting in the ready queue.
- Turnaround time: the time between arrival of a process in the ready queue and it completing execution.
- Response time: the time between arrival of a process and the production of its first response (a process may not need to complete in order to respond).
- Throughput: the number of processes completed per unit time (clearly affected by process duration).
We can also ask for average (mean) waiting/turnaround/response time or total (sum) waiting/turnaround/response time, or maximum/minimum waiting/turnaround/response time for a set of processes.
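As a sketch, these criteria can be computed for a completed schedule. The helper below assumes non-preemptive execution, all times in the same units, and treats response time as the time until a process first gets the CPU; the process timings are made up for illustration:

```python
# Compute per-process scheduling metrics from a completed schedule.
# arrival/burst/start/finish are parallel lists of times (made-up values).

def metrics(arrival, burst, start, finish):
    waiting    = [f - a - b for a, b, f in zip(arrival, burst, finish)]
    turnaround = [f - a for a, f in zip(arrival, finish)]
    response   = [s - a for a, s in zip(arrival, start)]
    n = len(arrival)
    return {
        "avg_waiting":    sum(waiting) / n,
        "avg_turnaround": sum(turnaround) / n,
        "avg_response":   sum(response) / n,
    }

# Three processes run back to back in arrival order:
# P1 arrives at 0, burst 5 -> runs 0..5
# P2 arrives at 1, burst 3 -> runs 5..8
# P3 arrives at 2, burst 1 -> runs 8..9
m = metrics(arrival=[0, 1, 2], burst=[5, 3, 1], start=[0, 5, 8], finish=[5, 8, 9])
print(m)  # avg_waiting and avg_response 10/3, avg_turnaround 19/3
```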
Scheduling Algorithms
First Come First Serve (FCFS)
Processes are scheduled in the order in which they arrive: the first process to arrive runs first.
The advantages of the FCFS algorithm:
+ simple
+ minimal context switching: each process runs to completion, so there is only one switch per process
The negatives of the FCFS algorithm:
- average waiting time is typically poor.
- average waiting time is highly variable: it depends on the order in which processes arrive.
- CPU Bound processes can hog the CPU (imagine a system with one CPU bound process and many I/O bound processes)
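A quick sketch of why arrival order matters under FCFS. The burst lengths below are illustrative (one long CPU-bound job plus two short ones), assuming all processes arrive at t=0:

```python
# FCFS average waiting time depends heavily on arrival order.
# Burst lengths are made up: one long job (24) and two short jobs (3 each).

def fcfs_avg_wait(bursts):
    """Average waiting time when processes (all arriving at t=0)
    run to completion in the given order."""
    wait, elapsed = 0, 0
    for b in bursts:
        wait += elapsed      # each process waits for all earlier bursts
        elapsed += b
    return wait / len(bursts)

print(fcfs_avg_wait([24, 3, 3]))  # long job first -> (0 + 24 + 27) / 3 = 17.0
print(fcfs_avg_wait([3, 3, 24]))  # long job last  -> (0 + 3 + 6) / 3   = 3.0
```

The same three jobs give an average wait of 17.0 or 3.0 depending purely on arrival order, which is the variability (and the CPU-hogging risk) described above.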
Shortest Job First (SJF)
Associate with each process the length of its next CPU burst. Use these lengths to schedule the process with the shortest time. SJF is optimal — gives minimum average waiting time for a given set of processes.
The difficulty is knowing the length of the next CPU request… perhaps ask the user?
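A sketch of the SJF effect, with made-up burst lengths and all processes available at t=0. The second half shows the usual textbook answer to the estimation problem: predict the next burst as an exponential average of previous bursts (the alpha value here is an arbitrary choice):

```python
# SJF: running the shortest bursts first minimises average waiting time.
# Burst lengths are illustrative.

def avg_wait(bursts):
    wait = elapsed = 0
    for b in bursts:
        wait += elapsed      # each process waits for all earlier bursts
        elapsed += b
    return wait / len(bursts)

arrival_order = [6, 8, 7, 3]
sjf_order = sorted(arrival_order)                    # shortest job first
print(avg_wait(arrival_order), avg_wait(sjf_order))  # 10.25 7.0

# Estimating the next CPU burst by exponential averaging of past bursts:
def predict_next_burst(last_actual, last_predicted, alpha=0.5):
    return alpha * last_actual + (1 - alpha) * last_predicted

print(predict_next_burst(6, 10))  # 8.0
```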
Shortest Remaining Time First (SRTF)
The shortest remaining time first algorithm is a pre-emptive shortest job first algorithm.
The advantages of the SJF (and SRTF) approach:
+ Short processes do not have to wait for long processes to complete before executing (especially if pre-emptive scheduling is used). That is, CPU Bound processes cannot hog the CPU.
+ Maximises throughput (for this imagine a queue where processes continually arrive).
+ Average waiting time is smaller (for the same set of processes) since no process waits for the longest to complete.
The disadvantages of the SJF (and SRTF) approach:
− Risk of starvation for long processes; or alternatively long waiting times.
− Multiple context switches for each process if pre-emptive.
− Can we reliably estimate process duration before execution?
− Overheads in inserting processes in a sorted queue.
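A minimal SRTF simulator as a sketch, assuming unit time steps and made-up arrival/burst values. At every step it runs the arrived process with the least remaining time, so a newly arrived short job preempts a long one:

```python
# SRTF (preemptive SJF) simulation in unit time steps.
# Arrival times and burst lengths are illustrative.

def srtf_finish_times(arrival, burst):
    n = len(arrival)
    remaining = list(burst)
    finish = [None] * n
    t = 0
    while any(r > 0 for r in remaining):
        ready = [i for i in range(n) if arrival[i] <= t and remaining[i] > 0]
        if not ready:
            t += 1                                   # CPU idle: no one has arrived
            continue
        i = min(ready, key=lambda j: remaining[j])   # shortest remaining time wins
        remaining[i] -= 1
        t += 1
        if remaining[i] == 0:
            finish[i] = t
    return finish

# P1 arrives at 0 (burst 8); P2 arrives at 1 (burst 4) and preempts P1.
print(srtf_finish_times(arrival=[0, 1], burst=[8, 4]))  # [12, 5]
```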
Priority Scheduling
A priority number (an integer) is associated with each process; by convention, a lower number means a higher priority. Schedule the process with the highest priority first.
Priority scheduling can be pre-emptive or non-pre-emptive.
SRTF is a special case of priority scheduling where the priority is equal to the remaining execution time of the process.
- A problem with this is starvation — low priority processes may never execute
- A solution is ageing — as time progresses, increase the priority of waiting processes
The advantages of priority scheduling are:
+ Short waiting times for high priority processes (e.g. interactive user programs).
+ Users (or admin) have some control over the scheduler.
+ Deadlines can be met by giving processes high priority.
The disadvantages of priority scheduling are:
− Risk of starvation for low priority processes; or alternatively long waiting times. [although starvation risk can be alleviated by using an ageing scheme: gradually increase the priority of a process over time]
− Multiple context switches for each process if preemptive.
− Overheads in inserting processes in a sorted queue.
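A sketch of non-preemptive priority scheduling with ageing, using made-up priority numbers (lower number = higher priority). With this fixed set of processes ageing does not change the order, but in a system where fresh high-priority processes keep arriving, the ageing step is what stops a low-priority process from starving:

```python
# Priority scheduling with ageing: pick the lowest priority number,
# then raise (decrement) every waiting process's number.
# Priority values are illustrative.

def completion_order(priorities, age_rate=1):
    prio = list(priorities)
    remaining = set(range(len(prio)))
    order = []
    while remaining:
        i = min(remaining, key=lambda j: prio[j])  # lowest number wins
        order.append(i)
        remaining.discard(i)
        for j in remaining:
            prio[j] -= age_rate                    # ageing: raise priority
    return order

# P0 has the worst priority (9) but ageing guarantees it eventually runs.
print(completion_order([9, 1, 2]))  # [1, 2, 0]
```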
Round Robin Scheduling (RR)
The round robin scheduling algorithm features a time quantum q: the length of time each process is allowed to run for (usually 10–100 ms). When a process has run for this length of time it is preempted, and another process from the ready queue is scheduled.
Performance varies according to q:
- If q is large, round robin tends towards First Come First Served;
- If q is small, context switching overheads become significant.
- A heuristic for efficiency: a quantum that is longer than 80% of CPU bursts should be chosen.
The advantages of a round robin algorithm:
+ No Starvation.
+ Each process must wait no more than (n-1) * q time units before beginning execution, where n is the number of processes in the ready queue.
+ Fair sharing of CPU between processes.
The disadvantages of a round robin algorithm:
− Poor average turnaround time: processes are repeatedly interrupted, so jobs take longer to complete than under SJF
− Waiting time depends on number of processes, rather than priority.
− Particularly large number of context switches for each process so large overhead;
− Hard to meet deadlines because waiting times are generally high.
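To see the effect of q, here is a small round-robin sketch with illustrative burst lengths, assuming all processes arrive at t=0. With a small quantum the short jobs finish early; with a huge quantum the schedule degenerates into FCFS:

```python
# Round robin simulation: each process runs for at most q time units,
# then goes to the back of the ready queue. Burst lengths are made up.
from collections import deque

def rr_finish_times(bursts, q):
    remaining = list(bursts)
    queue = deque(range(len(bursts)))
    finish = [0] * len(bursts)
    t = 0
    while queue:
        i = queue.popleft()
        run = min(q, remaining[i])
        t += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)      # preempted: back of the ready queue
        else:
            finish[i] = t
    return finish

print(rr_finish_times([24, 3, 3], q=4))    # [30, 7, 10]: short jobs finish fast
print(rr_finish_times([24, 3, 3], q=100))  # [24, 27, 30]: large q acts like FCFS
```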
Multi Level Queue Scheduling
The idea is to partition the ready queue into several queues: each process is assigned to a specific queue, and each queue has a different scheduling algorithm. For example, the ready queue can be split into a foreground (interactive) queue and a background (batch) queue. A process stays permanently in its given queue.
Each queue has its own scheduling algorithm:
foreground — RR
background — FCFS
There also needs to be a scheduling algorithm to choose between the queues. This is typically fixed-priority pre-emptive scheduling: the foreground queue may have absolute priority, with background processes only running when the foreground queue is empty. This creates a possibility of starvation for the background queue.
An alternative is to time slice between the queues: each queue gets a certain share of CPU time, which it schedules amongst its own processes, e.g. 80% to foreground (RR) and 20% to background (FCFS).
If processes can be divided into groups, we can use multi-level queue scheduling: e.g. (1) a finite number of priorities, with one queue for each priority level; or (2) a queue for foreground processes (e.g. a web browser) and one for background processes (e.g. a disk defragmenter).
Different response time needs => different scheduling needs.
Multi-Level Feedback Queue Scheduling
Like multi-level queue scheduling, but processes can move between queues; this is one way to implement ageing. In addition to the queues and their scheduling algorithms, we now need:
— a mechanism for deciding when to promote processes
— a mechanism for deciding when to demote processes
— a mechanism for deciding which queue a process will join initially.
Let’s go through an example:
Here there are three queues:
Q0: RR with time quantum 8 milliseconds/time units
Q1: RR time quantum 16 milliseconds/time units
Q2: FCFS
All new jobs enter the top queue (RR, q = 8). If a job is not complete after 8 milliseconds of execution, it moves to the next queue (RR, q = 16); if it is still not complete after a further 16 milliseconds of execution, it moves to the final queue, which operates FCFS.
Priority scheduling can be introduced: create n queues, one for each priority level, where higher-priority queues get more CPU time. Each process enters the queue corresponding to its priority. After a process has been in the system for some time, its priority is increased so that it moves to a higher-priority queue.
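The three-queue example above can be sketched for a single job (no contention from other processes, which is a simplifying assumption): up to 8 units in Q0, then up to 16 more in Q1, with any remainder running to completion under FCFS in Q2:

```python
# Multi-level feedback queue path for one job: Q0 is RR q=8, Q1 is RR q=16,
# Q2 is FCFS. A job is demoted whenever it uses up its whole quantum.

def mlfq_path(burst, quanta=(8, 16)):
    """Return how much CPU time the job receives at each queue level."""
    path = []
    for q in quanta:
        used = min(q, burst)
        path.append(used)
        burst -= used
        if burst == 0:
            return path          # finished within this level's quantum
    path.append(burst)           # remainder runs to completion in FCFS Q2
    return path

print(mlfq_path(5))   # [5]          finishes within the Q0 quantum
print(mlfq_path(20))  # [8, 12]      demoted once, finishes in Q1
print(mlfq_path(40))  # [8, 16, 16]  demoted twice, ends in the FCFS queue
```

Short interactive jobs stay in the high-priority queues and get quick service, while long CPU-bound jobs sink to the FCFS queue, which is exactly the behaviour the feedback mechanism is designed to produce.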
Multi-Processor Scheduling
An OS could have multiple CPUs that share the processing load. The problem becomes complex if the CPUs have different capabilities (so only certain jobs can run on certain CPUs).
There are two variants of multi-processor scheduling:
Asymmetric multiprocessing (one processor controls all scheduling)
In asymmetric multiprocessing, one processor makes all scheduling decisions and handles I/O processing and other system activities. The other processors execute user code only. The need for data sharing is reduced because only one processor handles all system activities.
Symmetric multi-processing (each processor schedules itself)
All major modern OS support symmetric multi-processing. In symmetric multi-processing, each processor does its own scheduling.
There may be a single ready queue, or a queue for each processor. Whichever is used, each processor selects a process to execute from the ready queue.
For efficient CPU use (high utilisation), load balancing is required so that work is evenly distributed. There are two approaches:
• Push migration: a specific process regularly checks the load on each processor and redistributes waiting processes.
• Pull migration: an idle processor pulls a waiting job from the queue of a busy processor.
Multiprocessing uses multiple (physical) CPUs to run processes in parallel.
Hyperthreading (aka Symmetric Multithreading) allows the same thing but using multiple logical processors.
Logical processors share the hardware of the CPU (caches, buses etc.) but are responsible for their own interrupt handling. They make it appear as if there are two CPUs instead of one: the core can do floating-point computations for one process whilst doing arithmetic/logical computations for another.