Operating Systems: A Brief Overview
This is part 1/3
Read part 2 here and part 3 here.
Understanding Computers
A computer is any device that takes input, processes it and produces an output. Anything that performs these tasks in order, from an abacus to a MacBook, can be called a computer.
Bus: Internal communication in a machine is performed using a Bus. It is the main pathway between a CPU and the main memory of the computer. A System Bus carries data to and from the input/output devices.
Universal Serial Bus or USB is an industry standard specification for cables, connectors and protocols for connection, communication and/or power supply between computers, peripherals and other computers.
CPU: The Central Processing Unit, also known as the brain of the computer, is generally a small silicon chip on which millions or billions of transistors work in harmony to process data. The CPU only understands machine language. It operates in a fetch-decode-execute cycle, where each instruction is fetched, decoded and executed one by one. The CPU, although very small, has its own memory in the form of registers, which have the lowest latency of any memory in the machine. Apart from registers, the CPU also holds some amount of cache memory for faster data retrieval.
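The fetch, decode and execute steps can be sketched with a toy machine. This is only an illustration: the three opcodes (`LOAD`, `ADD`, `HALT`), the `run` function and the register names are made up for this example and do not correspond to any real instruction set.

```python
# Toy CPU: a hypothetical 3-instruction machine, not a real instruction set.
LOAD, ADD, HALT = 0, 1, 2  # made-up opcodes


def run(program):
    """Run a list of (opcode, operand) pairs and return the accumulator."""
    registers = {"pc": 0, "acc": 0}  # program counter and accumulator
    while True:
        # Fetch: read the instruction the program counter points at.
        instruction = program[registers["pc"]]
        registers["pc"] += 1
        # Decode: split the instruction into its opcode and operand.
        opcode, operand = instruction
        # Execute: carry out the operation.
        if opcode == LOAD:
            registers["acc"] = operand
        elif opcode == ADD:
            registers["acc"] += operand
        elif opcode == HALT:
            return registers["acc"]
        else:
            raise ValueError(f"unknown opcode {opcode}")


program = [(LOAD, 2), (ADD, 3), (HALT, 0)]
result = run(program)
print(result)  # 5
```

A real CPU does the same three steps in hardware, with the program stored in memory as binary machine code rather than Python tuples.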
Memory
There are 3 major characteristics of memory devices used in computing:
- Capacity
- Speed
- Cost
Registers, being the smallest in terms of capacity, are the fastest in terms of speed. Main memory and external hard drives hold far more data but are comparatively slow in data retrieval. Although speed and capacity are relative, this is the general pattern across the industry. The cost of a memory device is a trade-off between capacity and speed: faster memory costs more per byte, so a system has only a little of it, while slower memory is cheap enough to provide in bulk.
RAM: Any location in Random Access Memory can be accessed in the same amount of time (constant time). RAM is addressed in bytes, so each byte has its own address. A major disadvantage of RAM is that it needs a constant supply of power, i.e. every time it is disconnected from a power source, it loses all the data stored.
Secondary Storage Devices
Unlike RAM, secondary storage devices do not need a constant supply of power to retain their data.
HDD: A Hard Disk Drive is also called a spinning drive because of its spinning disks. It has multiple disks (platters) that store data magnetically in concentric tracks. Data is located and retrieved using three coordinates: which platter surface, which track and which sector along that track. The working of an HDD is somewhat similar to that of a gramophone. An HDD can hold terabytes (TB) of data, although HDDs are slower than SSDs.
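The idea of locating data by three coordinates matches the classic cylinder-head-sector (CHS) addressing scheme, which can be converted into a single flat block number. The sketch below uses the standard CHS to logical block address formula; the function name `chs_to_lba` and the drive geometry (16 heads, 63 sectors per track) are assumptions chosen for the example, and modern drives hide their real geometry behind flat block addresses.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Convert a cylinder/head/sector triple to a flat logical block address.

    Cylinders and heads are numbered from 0, sectors from 1.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


# Hypothetical geometry: 16 heads per cylinder, 63 sectors per track.
lba = chs_to_lba(cylinder=2, head=3, sector=4,
                 heads_per_cylinder=16, sectors_per_track=63)
print(lba)  # (2 * 16 + 3) * 63 + 3 = 2208
```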
SSD: Solid State Drives are a faster but more complex kind of secondary storage device. The data on an SSD is stored electrically on chips, which makes SSDs more expensive but comparatively faster than HDDs.
Operating System
An operating system, or OS, can be described as a program that controls the execution of application programs and acts as an interface between applications and the computer hardware.
In layman’s terms, the main purpose of an OS is to run programs, and the efficiency of an OS is judged by the number of programs being processed, the speed of processing and how well the OS handles errors and interrupts. Let’s dive deeper and understand the terms surrounding the OS.
Kernel
The kernel is the core component of any OS. It is mainly responsible for managing system resources. The kernel also supports applications by managing memory and scheduling processes, amongst other things.
Batch Multi-programming
In batch multiprogramming, when one program finishes, the next scheduled program is run on the processor. This was popular on mainframes in earlier days but is less common today.
Multi Programming
Multiprogramming refers to the situation where all the programs that need to be executed are loaded into the main memory, and the OS acts as a resource allocator for CPU time, memory, etc. The most popular type of multiprogramming in today’s environment is time-sharing.
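One common way to implement time-sharing is round-robin scheduling, where every loaded program gets a short slice of CPU time in turn. The following is a minimal simulation of that idea; the function name `round_robin`, the program names and the time values are all hypothetical, and a real scheduler also deals with priorities, I/O waits and much more.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Return the order in which programs finish under round-robin sharing.

    burst_times maps a program name to the CPU time it still needs.
    """
    ready = deque(burst_times.items())  # every program is loaded and waiting
    finished = []
    while ready:
        name, remaining = ready.popleft()
        remaining -= min(quantum, remaining)  # run for at most one time slice
        if remaining > 0:
            ready.append((name, remaining))  # not done: back of the queue
        else:
            finished.append(name)
    return finished


order = round_robin({"editor": 3, "compiler": 7, "player": 2}, quantum=2)
print(order)  # ['player', 'editor', 'compiler']
```

Short programs finish early even though a long one was queued ahead of them, which is what makes the system feel responsive to every user.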
To understand multiprogramming, let us first understand processes in brief.
Processes
A program in its running state is often termed a process. It is a way to keep track of multiple instances of a program. A process is loaded into the main memory, is scheduled and has access to files and network connections, among other resources.
A process includes code, data and context, which are stored in a sequential memory space. It is created by the operating system to track the state of the running program and the resources assigned to it.
5 state process model
The diagram of the 5 state process model is self-explanatory. If you wish to learn the model in depth, there are many resources online, including this.
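As a rough sketch, the five states are commonly named New, Ready, Running, Blocked and Exit, and a process may only move along certain arrows between them. The code below encodes the main transitions as a lookup table; the names `State`, `ALLOWED` and `move` are made up for this example, and some textbook versions of the model include a few extra arrows (for instance, terminating a process that is not running).

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"
    EXIT = "exit"


# Main transitions of the five-state model (a simplified subset).
ALLOWED = {
    State.NEW: {State.READY},                                 # admitted
    State.READY: {State.RUNNING},                             # dispatched
    State.RUNNING: {State.READY, State.BLOCKED, State.EXIT},  # timeout, wait, done
    State.BLOCKED: {State.READY},                             # awaited event occurred
    State.EXIT: set(),
}


def move(current, target):
    """Return target if the transition is allowed, otherwise raise."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


state = State.NEW
for target in (State.READY, State.RUNNING, State.BLOCKED,
               State.READY, State.RUNNING, State.EXIT):
    state = move(state, target)
print(state.name)  # EXIT
```

Note that a blocked process cannot go straight back to running: it has to become ready first and wait to be dispatched again.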
Suspension
A process is said to be suspended when it is completely removed from the main memory. The process is stored on the secondary storage device for future use, with its state saved exactly as it was. This frees the main memory and lets other high-priority processes run first. Suspension and the related state changes are controlled by the medium-term scheduler. An important thing to note here is that the suspended process is not aware of the suspension and is resumed later without any knowledge of it. Suspension of a process may be performed for the following reasons:
1. Debugging
2. Long Term Delay
3. Freeing main memory
The introduction of suspension allows us to move from the 5 state process model to the 7 state process model, which adds two states: Ready/Suspended and Blocked/Suspended. The new model looks as follows:
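In code form, the model can be approximated as a table of (state, event) pairs. This is a simplified sketch: the event names (`admit`, `dispatch`, `suspend`, `activate` and so on) and the helper functions are labels invented for this example, and fuller versions of the model have a few more transitions than are listed here.

```python
# Transition table for a seven-state model, keyed by (state, event).
TRANSITIONS = {
    ("new", "admit"): "ready",
    ("ready", "dispatch"): "running",
    ("running", "timeout"): "ready",
    ("running", "wait"): "blocked",
    ("running", "exit"): "exit",
    ("blocked", "event"): "ready",
    # Suspension moves a process out of main memory...
    ("ready", "suspend"): "ready/suspended",
    ("blocked", "suspend"): "blocked/suspended",
    # ...where the event it waits for can still occur...
    ("blocked/suspended", "event"): "ready/suspended",
    # ...and activation brings it back into main memory.
    ("ready/suspended", "activate"): "ready",
    ("blocked/suspended", "activate"): "blocked",
}


def step(state, event):
    """Apply one event to a state, raising on an illegal combination."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state!r}") from None


def replay(events, state="new"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = step(state, event)
    return state


final = replay(["admit", "dispatch", "wait", "suspend", "event", "activate"])
print(final)  # ready
```

The example process blocks, is swapped out while blocked, has its event arrive while it is on disk, and is then brought back into memory ready to run.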
Process Control Block (P.C.B)
A process control block includes all the information required to run and control a given process. It takes care of the different aspects of a running program as follows:
- Memory: uses a memory table, which is useful for paging and segmentation.
- Input/Output: uses an I/O table to make sure that I/O devices are accessible when needed.
- File: uses a file table to check file permissions and the accessibility of the required files.
- Call Stack: uses a primary table; all active functions are tracked using the process image.
A process control block or PCB has the following data stored:
- Process ID: a numeric tracking ID, traditionally 16 bits wide (e.g. the process ID for init is 1).
- Parent Process ID: used to keep track of the parent process, how old a process is and whom to notify when the process is complete. Process IDs are freed and reused later.
- User ID: keeps track of the resource permissions given to the process / user.
- Registers: saved here to preserve state during suspension of a process.
- Stack Pointers
- Scheduling
- Linkage — information related to linked processes
- Inter Process Communication (I.P.C.)
- Resources
- Memory
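To make the list concrete, here is a rough sketch of a PCB as a Python data class. The class name, field names and the sample values are assumptions for illustration only; a real kernel stores this in its own internal structure with many more fields. The only real API calls used are `os.getpid()` and `os.getppid()`, which return the IDs of the current process and its parent.

```python
import os
from dataclasses import dataclass, field


@dataclass
class ProcessControlBlock:
    """Illustrative PCB; field names are hypothetical, not a real kernel layout."""
    pid: int
    parent_pid: int
    user_id: int
    registers: dict = field(default_factory=dict)   # saved CPU registers
    stack_pointer: int = 0
    priority: int = 0                               # scheduling information
    open_files: list = field(default_factory=list)  # resources held

    def save_context(self, registers, stack_pointer):
        """Store a copy of the CPU state so the process can be resumed later."""
        self.registers = dict(registers)
        self.stack_pointer = stack_pointer


# Describe the current Python process; user_id 1000 is a placeholder value.
pcb = ProcessControlBlock(pid=os.getpid(), parent_pid=os.getppid(), user_id=1000)
pcb.save_context({"pc": 0x40, "acc": 7}, stack_pointer=0x7FF0)
print(pcb.pid, pcb.parent_pid, pcb.registers)
```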
Kernel Mode vs User Mode
Kernel Mode: When a process is running in kernel mode, it can execute any CPU instruction and access any part of the system, including hardware and all of memory.
User Mode: Code running in User Mode cannot directly access system hardware. Nor can it run privileged CPU instructions, since it doesn’t have the right permissions. Code running in User Mode cannot access memory outside its own address space.
To differentiate between processes running in Kernel Mode and User Mode, the PCB also stores a Program Status Word.
PSW (Program Status Word): a register of status information that includes a bit indicating whether the process is running in User mode or Kernel mode.
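Reading and changing a mode flag comes down to simple bit operations. The sketch below assumes a made-up layout in which bit 0 of the status word is the mode bit and bit 1 is an unrelated flag; real processors each define their own layout, and only privileged code is allowed to change the mode.

```python
KERNEL_MODE_BIT = 0b0001         # hypothetical layout: bit 0 set = kernel mode
INTERRUPTS_ENABLED_BIT = 0b0010  # hypothetical unrelated flag in the same word


def is_kernel_mode(psw):
    """Return True if the mode bit is set in the status word."""
    return bool(psw & KERNEL_MODE_BIT)


def enter_kernel_mode(psw):
    """Set the mode bit, leaving all other flags untouched."""
    return psw | KERNEL_MODE_BIT


def enter_user_mode(psw):
    """Clear the mode bit, leaving all other flags untouched."""
    return psw & ~KERNEL_MODE_BIT


psw = INTERRUPTS_ENABLED_BIT   # starts in user mode with interrupts enabled
print(is_kernel_mode(psw))     # False
psw = enter_kernel_mode(psw)
print(is_kernel_mode(psw))     # True
psw = enter_user_mode(psw)
print(bin(psw))                # 0b10
```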
This was part 1 of my 3 part series on Operating Systems. Part 2 will cover topics like context switching, threads, mutual exclusion, semaphores and deadlocks. Part 3 will dive deep into memory management and talk about partitioning strategies, the buddy system, paging, virtual memory and replacement strategies. Other topics that you can look into:
Lookup problem, Replacement policies. Get more resources here.