Boot Process
Linux Boot Process
Understanding the boot process helps you see how the hardware and software work together, and gives you the information you need to start troubleshooting a boot problem.
Main Stages (short version)
(sysVinit)
BIOS -- Basic Input/Output System; executes MBR.
- System starts BIOS located on flash memory.
MBR -- Master Boot Record; executes GRUB.
- MBR contains the GRUB bootloader.
GRUB -- Grand Unified Bootloader; executes the Kernel.
- GRUB goes through two stages.
- After the two stages, the kernel and initrd are loaded into memory.
- Config files:
- GRUB 1 (old vers) /etc/grub.conf -> /boot/grub/grub.conf
- GRUB 2 (New Grub) /etc/grub2.cfg -> /boot/grub2/grub.cfg
Kernel -- The kernel; executes /sbin/init (SysVinit) or systemd (for newer, systemd systems).
- Kernel starts first process (pid 1) init or systemd.
- (pid 0 is the scheduler. There are two tasks with specially distinguished process IDs: swapper or sched has process ID 0, is responsible for paging, and is actually part of the kernel rather than a normal user-mode process; process ID 1 is the init process, the first user-space process.)
(SysVinit) Init -- Init; executes runlevel programs
- Looks in the /etc/inittab file for runlevels.
- RunLevel - Runlevel programs are executed from /etc/rc.d/rc*.d/
(SystemD) Systemd -- inittab is no longer used when using systemd. Systemd is a system and service manager for Linux operating systems. When run as first process on boot (as PID 1), it acts as init system that brings up and maintains userspace services. Separate instances are started for logged-in users to start their services.
- systemd provides a compatibility layer that maps runlevels to targets, and associated binaries like runleve
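The runlevel-to-target mapping can be written down as a small lookup table. This is only a sketch: the target names are the standard systemd ones, while the function name `target_for_runlevel` is made up for illustration.

```python
# Runlevel-to-target aliases provided by systemd's SysV compatibility layer.
RUNLEVEL_TO_TARGET = {
    0: "poweroff.target",
    1: "rescue.target",
    2: "multi-user.target",
    3: "multi-user.target",
    4: "multi-user.target",
    5: "graphical.target",
    6: "reboot.target",
}


def target_for_runlevel(runlevel: int) -> str:
    """Return the systemd target that stands in for a SysVinit runlevel."""
    try:
        return RUNLEVEL_TO_TARGET[runlevel]
    except KeyError:
        raise ValueError(f"unknown runlevel: {runlevel}") from None


print(target_for_runlevel(3))  # multi-user.target
print(target_for_runlevel(5))  # graphical.target
```

Runlevels 2, 3 and 4 all land on the same target, so the old distinction between them does not carry over to systemd.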
The Boot Process in More Detail
Step 1: Power Supply and SMPS
Switching Mode Power Supply (SMPS)
SMPS converts AC (alternating current) to DC (direct current) because the computer internals work in DC. It maintains the required voltage level so that the computer can work properly.
Primary objective is to provide the correct voltage level to the motherboard and other components.
How it works: it checks the voltage level it's providing to the motherboard. If the signal is good, it sends a ‘POWER GOOD’ signal to the mobo timer. The mobo timer then stops sending the ‘reset’ signal to the CPU, and the computer can boot.
Step 2: Bootstrapping
Bootstrapping
In general, this term refers to a self-starting process that is supposed to proceed without external input.
Once the reset signal is released, the CPU starts executing at a fixed address in ROM (read-only memory). On x86-based systems this address is always the same, ‘FFFF:0000h’, in the last region of the ROM. It contains one instruction: a ‘JUMP’ to another memory address location.
This ‘JUMP’ command tells the computer where the BIOS program is located in the ROM.
- The BIOS (Basic Input/Output System) is stored on a ROM chip on the mobo and contains code that tells the CPU how to interact with and control the other components in the computer. In modern systems, the BIOS contents are stored on a flash memory chip so that they can be rewritten. ‘Read-only’ here really means non-volatile: the contents of the memory stay when the power is cut off.
The computer knows how to bring itself up when you press the start button because of these instructions that are fed to the BIOS program.
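To see why that start address holds nothing but a jump, it helps to turn the segment:offset pair into a physical address. The sketch below assumes classic 8086 real-mode addressing and the conventional reset vector FFFF:0000h; later x86 CPUs start near the top of the 4 GiB address space instead, and the function name is made up for illustration.

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair into a 20-bit physical address."""
    return ((segment << 4) + offset) & 0xFFFFF


reset_vector = real_mode_physical(0xFFFF, 0x0000)
print(hex(reset_vector))        # 0xffff0
print(0x100000 - reset_vector)  # 16
```

Only 16 bytes remain between the reset vector and the 1 MiB boundary, which is room for a jump instruction and little else.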
Step 3: The Role of BIOS in the Boot Process
The most important jobs of the BIOS during the boot process are running POST (Power On Self Test) and executing the MBR bootloader.
- POST is a series of tests conducted by the BIOS which confirms the proper functioning of different hardware components attached to the computer.
POST checks and confirms the integrity of things such as:
- Timer ICs (chips used in a variety of timer, pulse generation, and oscillator applications)
- DMA controllers (Direct memory access is a method that allows an input/output (I/O) device to send or receive data directly to or from the main memory, bypassing the CPU to speed up memory operations.)
- CPU
- Video ROM
- Mobo
- Keyboard
- Printer Port
- Hard drives etc.
Warm start: a reset of a running machine. The BIOS will NOT conduct a full POST check.
Cold start: power has just been applied. The BIOS will conduct a full POST check.
CMOS vs BIOS, don’t get it twisted, two different/separate things (duh)
- CMOS (complementary metal-oxide-semiconductor) is a small, battery-backed RAM chip on the mobo that stores the time, date, and critical system information. Unlike regular RAM, CMOS RAM does not lose its contents when the computer is turned off, because the CMOS battery keeps it powered.
- Removing a CMOS battery will make the CMOS forget all the configs you have saved previously.
- This is why you can unlock a computer that is protected by a CMOS password, simply by removing the CMOS battery.
- Removing the CMOS battery will also make the OS show you the wrong time (system time consistency is maintained in CMOS settings).
- So when you modify the “BIOS settings” you are actually modifying the CMOS settings. The CMOS settings are where you modify the boot order etc.
- The BIOS itself cannot be altered by the USER; changing it requires a flash (duh).
- So, the BIOS is what you use to alter CMOS settings because the BIOS tells the CPU how to interact with the other components.
Once the POST check is completed successfully, BIOS will look at the CMOS settings for the boot order (boot loader program).
- Boot order is nothing but a user-defined order which tells the BIOS where to look for the OS.
- The BIOS checks the first entry in the list to see whether an OS can be loaded from there. If it does not find a bootable disk there, it checks the next entry in the list, and so on.
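The boot order walk is a plain first-match search. A minimal sketch, with made-up device names and a made-up `bootable` table standing in for whatever the firmware actually probes:

```python
from typing import Dict, List, Optional


def pick_boot_device(boot_order: List[str], bootable: Dict[str, bool]) -> Optional[str]:
    """Return the first device in boot_order that reports itself bootable."""
    for device in boot_order:
        if bootable.get(device, False):
            return device
    return None  # nothing bootable was found on any listed device


order = ["cdrom", "usb", "hdd"]  # hypothetical boot order from CMOS
state = {"cdrom": False, "usb": False, "hdd": True}
print(pick_boot_device(order, state))  # hdd
```

Changing the order in the CMOS settings only changes which device gets checked first; a device that is not bootable is always skipped.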
Once the boot loader program is detected and loaded into the memory, BIOS gives the control to it. This is when it executes the MBR boot loader.
Step 4: MBR and GRUB
BIOS is programmed to look at the boot sector, or the MBR (Master Boot Record)
- This is the first sector of the hard disk, which contains the program that will help the system to load the OS.
- Typically on /dev/sda or /dev/hda.
- As soon as BIOS finds a valid MBR, it loads the entire content of the MBR into RAM, and further execution is done by the MBR code.
- The first sector of the hard disk is 512 bytes. Because that is so little space, the OS stores only the first stage of its boot loader program here. The first 446 bytes hold the boot loader program (first stage of GRUB; about 440 bytes of code on disks that also keep a disk signature there), the next 64 bytes hold the partition table, and the last 2 bytes hold the magic number.
- The MBR contains the following things:
- First stage GRUB/boot loader - primary job is to load the second stage boot loader (stage 2 GRUB).
- Partition table information (and error messages)
- Magic Number
- 2 bytes, serves as a method of verification/validation for the MBR. The last two bytes of the sector are 55 AA, which reads as AA55 when taken as a little-endian 16-bit value. A different magic number indicates a corrupted or invalid MBR.
(In simple terms, MBR loads and executes the GRUB boot loader.)
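The MBR layout can be checked by hand with a few lines of parsing. This sketch assumes the classic layout (446 bytes of boot code area, four 16-byte partition entries, 2-byte signature) and runs on a synthetic sector built in memory; `parse_mbr` and the demo values are made up for illustration.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446       # boot code area in the classic layout
PARTITION_ENTRY_SIZE = 16  # four entries follow the boot code
SIGNATURE_OFFSET = 510     # last two bytes of the sector


def parse_mbr(sector: bytes) -> dict:
    """Split a 512-byte MBR into its partition entries and signature check."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    (signature,) = struct.unpack_from("<H", sector, SIGNATURE_OFFSET)
    partitions = []
    for i in range(4):
        start = BOOT_CODE_SIZE + i * PARTITION_ENTRY_SIZE
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), LBA, size
        status, ptype, first_lba, sectors = struct.unpack_from("<B3xB3xII", sector, start)
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append({"bootable": status == 0x80, "type": ptype,
                               "first_lba": first_lba, "sectors": sectors})
    return {"valid": signature == 0xAA55, "partitions": partitions}


# Synthetic sector: one bootable Linux (type 0x83) partition starting at sector 63.
demo = bytearray(SECTOR_SIZE)
struct.pack_into("<B3xB3xII", demo, BOOT_CODE_SIZE, 0x80, 0x83, 63, 204800)
demo[510:512] = b"\x55\xaa"
info = parse_mbr(bytes(demo))
print(info["valid"], info["partitions"][0]["first_lba"])  # True 63
```

On a real machine the same function could be fed the first 512 bytes read from the disk device, which normally requires root.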
GRUB = Grand Unified Bootloader
If you have multiple kernel images installed on your system, you can choose which one to be executed.
GRUB displays a splash screen and waits a few seconds; if you don’t enter anything, it loads the default kernel image specified in the GRUB configuration file.
GRUB understands filesystems, so it can find the kernel and its configuration file by path name. The older Linux loader LILO did not understand filesystems.
LILO is a Linux boot loader which is too big to fit into a single sector of 512 bytes.
- Has two parts:
- Installer module places the runtime module on MBR.
- Runtime module has the info about all operating systems installed. When it is executed, it selects the OS to load and transfers control to the kernel.
- LILO does not understand filesystems; it treats the boot images to be loaded as raw disk offsets.
GRUB pretty much has three steps.
- GRUB stage 1
- The first sector (sector 0, MBR) contains GRUB stage 1.
- GRUB stage 1 will load GRUB stage 1.5 to the RAM, and will pass the control to it.
- GRUB stage 1.5 (MBR GAP)
- Normally partitions will start from sector number 63 (so sectors 1 - 62 are free).
- This free space between the MBR and the beginning of the partitions (sectors 1 - 62) is called the MBR GAP. This is where GRUB stage 1.5 resides.
- Contains the drivers for reading the file systems. This is how GRUB can find the kernel files by location, name, and partition.
- GRUB stage 1.5 loads the file system drivers. Once they are loaded, GRUB can access the /boot/grub/grub.conf file, which contains the kernel path, name, partition, initrd path, etc.
- The following is sample grub.conf of CentOS notice it contains kernel and initrd image:
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-194.el5PAE)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-194.el5PAE ro root=LABEL=/
        initrd /boot/initrd-2.6.18-194.el5PAE.img
- Now this is where you are presented with the TUI (terminal user interface), where you can select your operating system kernel and press enter to boot it.
- GRUB stage 2
- loads the kernel and other initrd image files
- Initrd - The initial RAM disk is an initial root file system that is mounted prior to when the real root file system is available/mounted. The initrd is bound to the kernel and loaded as part of the kernel boot procedure.
(In simple terms GRUB just loads and executes kernel and initrd images)
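How GRUB gets from grub.conf to a kernel and initrd path can be illustrated with a small parser. This is a rough sketch of the idea, not GRUB's own parser: it only handles the handful of directives in the CentOS sample, and `parse_grub_legacy` is a made-up name.

```python
import re

SAMPLE_GRUB_CONF = """\
#boot=/dev/sda
default=0
timeout=5
hiddenmenu
title CentOS (2.6.18-194.el5PAE)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-194.el5PAE ro root=LABEL=/
        initrd /boot/initrd-2.6.18-194.el5PAE.img
"""


def parse_grub_legacy(text: str) -> dict:
    """Collect the default index and the per-title entries of a GRUB 1 config."""
    config = {"default": 0, "entries": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # A keyword is separated from its value by '=' or whitespace.
        match = re.match(r"(\w+)(?:\s*[=\s]\s*(.*))?$", line)
        if match is None:
            continue
        key, value = match.group(1), match.group(2) or ""
        if key == "title":
            config["entries"].append({"title": value})
        elif key in ("root", "kernel", "initrd") and config["entries"]:
            config["entries"][-1][key] = value
        elif key == "default" and value.isdigit():
            config["default"] = int(value)
    return config


cfg = parse_grub_legacy(SAMPLE_GRUB_CONF)
entry = cfg["entries"][cfg["default"]]
print(entry["kernel"].split()[0])  # /boot/vmlinuz-2.6.18-194.el5PAE
print(entry["initrd"])             # /boot/initrd-2.6.18-194.el5PAE.img
```

The `default` value is an index into the list of `title` entries, which is why default=0 selects the first kernel listed.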
Step 5: Loading the Kernel Image
The Linux kernel is responsible for process management, memory management, users, inter-process communication, etc., to maintain a good environment for programs to run.
In this step, the kernel mounts the root file system.
The kernel is a compressed image file. The location of this compressed kernel image is specified in the GRUB 2 configuration file. It's basically an executable bzImage file.
The GRUB 2 configuration file is /boot/grub2/grub.cfg, generated during Linux installation and regenerated when a new kernel is installed.
Initrd (initial RAM disk) provides most of the drivers and tools, along with a small root file system, so that the kernel image file itself can stay small.
The initrd image file and kernel image file can be found in the /boot directory. (gzip files)
The initrd image contains folders similar to the Linux directory structure (there is /etc, /lib, /sbin, etc.). This is the small root file system that the kernel loads as a temporary root file system before the real root file system is mounted.
Loading and unloading of kernel modules is done with the help of programs like insmod, and rmmod present in the initrd image.
Now the kernel is loaded into the memory.
The kernel conducts a lot of hardware specific operations, checks the processor family and architecture, and the first user space program it executes is /sbin/init.
Step 5A. SysVinit
/sbin/init is the first program executed by the kernel. It has the PID of 1. (*key step, ps -ef | grep ‘init’)
As soon as the kernel executes the init process (/sbin/init), init looks at the /etc/inittab configuration file to find the default run level.
- Execute ‘grep initdefault /etc/inittab’ on your system to see your default run level.
- Note: inittab is no longer used when using systemd.
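What that grep is looking for can be spelled out by parsing the file. Each inittab entry has the form id:runlevels:action:process, and the default run level is the entry whose action is initdefault. The sketch below works on a sample string rather than the real file; the sample lines and the function name are illustrative only.

```python
from typing import Optional

SAMPLE_INITTAB = """\
# Default runlevel.
id:3:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
l3:3:wait:/etc/rc.d/rc 3
"""


def default_runlevel(inittab_text: str) -> Optional[int]:
    """Return the run level named by the initdefault entry, if there is one."""
    for line in inittab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":", 3)  # id:runlevels:action:process
        if len(fields) >= 3 and fields[2] == "initdefault" and fields[1].isdigit():
            return int(fields[1])
    return None


print(default_runlevel(SAMPLE_INITTAB))  # 3
```

On a systemd machine there is no initdefault entry to find, so the function returns None.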
There are different run levels in Linux:
Run-Level | Usage
----------------------
0 | System Halt/Shut Down
1 | Single User Mode
2 | Multiuser Mode Without Networking
3 | Full Multiuser Mode
4 | Unused
5 | GUI/X11
6 | Reboot
Typically you would set the default run level to either 3 or 5. 0 or 6 if you want to get in trouble.
We can set which runlevel the operating system runs in by defining it in the /etc/inittab file.
The /etc/inittab file contains the default run level: id:3:initdefault: means run level 3 is the default. Once this is identified, init runs all the appropriate programs (the run level specific programs). These live in per-runlevel directories such as rc0.d, rc1.d, rc2.d, etc. in /etc/rc.d.
- These rc.d folders contain run level specific programs that will be executed depending upon the default run level you have in your /etc/inittab config file.
- These rc.d folders in /etc/rc.d contain files that begin with either S or K, followed by a number.
- The number after S/K is the sequence in which these will be executed.
- Files beginning with S will be executed during the start up process.
- Files beginning with K are run to stop (kill) services, for example during the shutdown process.
- When the Linux system is booting up, you might see various services getting started like “Starting sendmail…. OK”; those are the run level programs.
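The S/K naming scheme boils down to a sort. The sketch below assumes the usual behaviour of a SysV rc script, which first calls the K scripts with "stop" and then the S scripts with "start", each group in name order; the directory listing is made up.

```python
from typing import List, Tuple


def runlevel_actions(names: List[str]) -> List[Tuple[str, str]]:
    """Order rc*.d entries the way a SysV rc script would walk them."""
    kills = sorted(n for n in names if n.startswith("K"))
    starts = sorted(n for n in names if n.startswith("S"))
    # K scripts are called with "stop" first, then S scripts with "start".
    return [(n, "stop") for n in kills] + [(n, "start") for n in starts]


rc3 = ["S80sendmail", "S10network", "K05atd", "S55sshd"]  # hypothetical rc3.d listing
for name, action in runlevel_actions(rc3):
    print(f"{name} {action}")
# K05atd stop
# S10network start
# S55sshd start
# S80sendmail start
```

The two-digit number is what lets a plain name sort bring the network up before sshd and sendmail.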
Step 5B. Systemd
- Under construction
Once init has started all programs in your desired run level directory, you will get the login screen.
Troubleshooting EC2 instances failing to boot up or failing instance status checks when launched
Things to check first if an EC2 instance is failing instance status checks because it does not boot properly:
- Confirm if dhclient is installed.
- Check if the network configuration file is present, and what it contains.
$ ls -al /etc/sysconfig/network-scripts/ifcfg-*
$ cat /etc/sysconfig/network-scripts/ifcfg-*
- Test by launching it as a T2, an instance type that needs only the Xen driver.
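When reading the ifcfg output, the lines that matter most are usually the ones that decide whether the interface comes up at boot and asks for an address over DHCP. The sketch below assumes the primary interface is expected to use DHCP (which is why dhclient is on the checklist) and parses a made-up sample rather than a real file; `parse_ifcfg` and `dhcp_ready` are illustrative names.

```python
import shlex

SAMPLE_IFCFG = """\
# hypothetical /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT="yes"
"""


def parse_ifcfg(text: str) -> dict:
    """Parse the shell-style KEY=value lines of an ifcfg file into a dict."""
    settings = {}
    for token in shlex.split(text, comments=True):  # drops comments, strips quotes
        key, sep, value = token.partition("=")
        if sep:
            settings[key] = value
    return settings


def dhcp_ready(settings: dict) -> bool:
    """True if the interface is brought up at boot and configured for DHCP."""
    return (settings.get("BOOTPROTO", "").lower() == "dhcp"
            and settings.get("ONBOOT", "").lower() == "yes")


cfg = parse_ifcfg(SAMPLE_IFCFG)
print(cfg["DEVICE"], dhcp_ready(cfg))  # eth0 True
```

A file with ONBOOT=no or a static BOOTPROTO would make `dhcp_ready` return False, which is a reasonable first thing to rule out before looking at drivers.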