Investigations into significantly reducing size of /boot/initrd.img files

(Originally created in 2019; refreshed and refined in 2021 and 2023)

The default action of Debian/Ubuntu initramfs-tools is to include almost all the available kernel modules and supporting firmware files in the initramfs file-system, regardless of whether those modules or firmware are actually used by the system. This is only really required for installer or portable system images that aren't tied to a specific PC.

Typically this results in the /boot/initrd.img-* files being larger than 70MB. Each file has to be read into memory by the bootloader along with the kernel (/boot/vmlinuz-*), which can cause a significant delay during start-up: first for the read, and then for the kernel to uncompress it into memory. The time taken to build the initrd.img also causes relatively long delays when installing linux-image-* package upgrades.

It is possible to reduce the file size to around 25MB or less, with no loss of functionality, using additional patches I've developed for initramfs-tools and the Linux kernel.

This has an added benefit of not causing out of space errors when installing new kernels when there is a separate /boot/ file-system that has a limited size - often less than 512MB.

Benefits of reduction

Here is an overview:

Statistics (initrd.img with kernel v6.2.4):
MODULES=  FIRMWARE_LOADED  size (bytes)  vs most  vs dep  firmwares  build-time
most      false                77117694                         634       14.49
most      true                 60302859     -22%                  8       11.99
dep       false                42489938     -45%                606        6.84
dep       true                 25704125     -66%    -40%          8        6.35
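
The percentage columns follow directly from the sizes. As a check on the arithmetic, here is a minimal sketch that recomputes them to one decimal place. The helper name pct_change is hypothetical and is not part of initramfs-tools.

```shell
#!/bin/sh
# pct_change BASE NEW
# Print the percentage change from BASE to NEW, to one decimal place.
# (Hypothetical helper, for illustration only.)
pct_change() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f%%\n", (new - base) * 100 / base }'
}

# Sizes taken from the statistics table above.
pct_change 77117694 60302859   # MODULES=most, FIRMWARE_LOADED=true, vs most
pct_change 77117694 42489938   # MODULES=dep,  FIRMWARE_LOADED=false, vs most
pct_change 77117694 25704125   # MODULES=dep,  FIRMWARE_LOADED=true, vs most
pct_change 42489938 25704125   # MODULES=dep,  FIRMWARE_LOADED=true, vs dep
```

The table shows the same figures as whole numbers.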

How to install and use

I include here recent builds of the initramfs-tools packages, which can be installed using dpkg -i. As the kernel patch is not currently in mainline (I'm currently working on getting it accepted), I also have recent kernel builds that include the patch. These are in the parent directory. For each version of the kernel there are 3 packages to install: linux-image, linux-headers and linux-libc-dev.

ASCIInema demo

This is the original 2019 recording when I was experimenting with ideas and used PRUNE= which is now FIRMWARE_LOADED=.

Existing initramfs-tools configuration options

The list of files to be included is controlled by /etc/initramfs-tools/initramfs.conf and the parameter:

MODULES=most

By changing this to:

MODULES=dep

only modules required by the system will be included. This can typically result in a ~40% reduction in size.
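
As a rough sketch of how the change could be applied and tested: the function below rewrites the MODULES= line in a configuration file, after which the image is rebuilt with update-initramfs. The function name set_modules_policy is hypothetical. It assumes the file already contains an uncommented MODULES= line, and changes nothing if it does not.

```shell
#!/bin/sh
# set_modules_policy VALUE [CONF]
# Rewrite the MODULES= line in CONF (default: the initramfs-tools config).
# Hypothetical helper; assumes an uncommented MODULES= line already exists.
set_modules_policy() {
    value=$1
    conf=${2:-/etc/initramfs-tools/initramfs.conf}
    tmp=$(mktemp) || return 1
    # Write via a temporary file, then copy back so the original file's
    # ownership and permissions are preserved.
    sed "s/^MODULES=.*/MODULES=$value/" "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Typical use, as root (shown as comments so nothing is changed by accident):
#   set_modules_policy dep
#   update-initramfs -u -k "$(uname -r)"
#   ls -l "/boot/initrd.img-$(uname -r)"
```

It is worth keeping a known-good kernel and initrd.img installed while experimenting, so there is something to boot if a required module turns out to be missing.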

Additional option

Adding a further option:

MODULES=loaded

can reduce the number of included modules still more, and has some advantages over MODULES=dep since it takes the list of currently loaded kernel modules - which may include modules manually loaded by the system operator for specific purposes. Of course this could also be provided for with an entry in the static list /etc/initramfs-tools/modules.

However, this will not provide much gain over MODULES=dep, so it is debatable whether it is worth using.
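
To illustrate what "the list of currently loaded kernel modules" amounts to, here is a minimal sketch that extracts module names from /proc/modules (the first field of each line). The function name loaded_modules is hypothetical, and this is not necessarily how the patched initramfs-tools gathers the list.

```shell
#!/bin/sh
# loaded_modules [FILE]
# Print the names of loaded kernel modules, one per line, sorted.
# FILE defaults to /proc/modules; the name is the first field on each line.
loaded_modules() {
    awk '{ print $1 }' "${1:-/proc/modules}" | sort -u
}

# Example: see what the running system has loaded.
#   loaded_modules
```

The same names, one per line, are what the static list /etc/initramfs-tools/modules expects.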

Firmware

Many firmware files (from /lib/firmware/) are included although they don't match the system's hardware. This is because kernel modules that manage many different devices declare all the firmware files they may need, and all those firmware files are included in the initramfs even when they do not match the installed hardware.

Including only the firmware files required by the hardware can reduce the size by ~20%. Combined with MODULES=dep the reduction is a massive 66% compared with the default, and 40% compared with MODULES=dep alone.

The Linux kernel does not currently provide a way to identify which firmware files have been loaded, even though it has a dedicated firmware_loader facility that modules use. By adding functionality to this code it is possible for userspace to obtain a list of the loaded firmware files and use that to control which files are included in the initramfs.

I've investigated three approaches to adding interfaces to the Linux firmware loader. For now I've put aside the first two as being too complicated and invasive:

  1. Add a sysfs interface under /sys/firmware/ of the form /sys/firmware/module/<module-name>/firmware_loaded
  2. Add a procfs interface as /proc/firmware_loaded with a single text file list of the form "<module-name> <firmware-file>"
  3. Write to the kernel log using pr_info() and dev_info() with the text "Firmware loaded: <firmware_file>"

The current experiment uses option (3) and, despite initial reservations, it has proved to be reliable, simple, and minimally invasive since 2019. The kernel patch is included here as linux-firmware-loaded.patch.

This adds the control variable FIRMWARE_LOADED to /etc/initramfs-tools/initramfs.conf which, when set to true, will search the kernel log for messages of the form "Firmware loaded: path/to/firmware.file" and only allow inclusion of those files. For GPU drivers such as amdgpu, which lists over 500 firmware files that would usually be included, this will typically reduce that to fewer than 10.
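
The log search can be sketched as a simple filter. This is only an illustration under stated assumptions, not the code used in the patched initramfs-tools: the function name firmware_from_log is hypothetical, and it assumes the firmware path is the final text on the line, directly after "Firmware loaded: ".

```shell
#!/bin/sh
# firmware_from_log
# Read kernel log text on stdin and print the unique firmware paths named in
# "Firmware loaded: <path>" messages. Assumes the path ends the log line.
firmware_from_log() {
    sed -n 's/.*Firmware loaded: //p' | sort -u
}

# Example, on a kernel carrying the patch:
#   dmesg | firmware_from_log
#   journalctl -k -b | firmware_from_log
```

One consequence of reading the log is that only firmware requested since the current boot is seen, and on a long-running system early messages may have dropped out of the kernel ring buffer.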

Additionally, when this is enabled, a static list in /etc/initramfs-tools/firmware is also used to force inclusion of specific firmware files (if they are installed on the host). This lets the system administrator ensure these files are always included regardless of what the current kernel has reported as loaded.
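
Combining the two sources could look something like the sketch below. The file format is an assumption made for illustration (one path per line, relative to the firmware directory, with "#" comment lines), as is the function name firmware_allowlist. The actual implementation may differ.

```shell
#!/bin/sh
# firmware_allowlist LOADED STATIC [FWDIR]
# Print the union of:
#   - every path listed in LOADED (firmware the kernel reported loading)
#   - every path in STATIC that exists under FWDIR (default /lib/firmware)
# Assumed STATIC format: one relative path per line, '#' starts a comment.
firmware_allowlist() {
    loaded=$1
    static=$2
    fwdir=${3:-/lib/firmware}
    {
        cat "$loaded"
        # Skip comments and blank lines, then keep only installed files.
        grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$static" |
        while IFS= read -r fw; do
            if [ -e "$fwdir/$fw" ]; then
                printf '%s\n' "$fw"
            fi
        done
    } | sort -u
}
```

Static entries that are not installed are silently dropped here, matching the "if they are installed on the host" condition above.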

Plymouth package

There are still many modules and firmware files included that are not required - typically pulled in by other packages that add hook scripts. One such is the hook script of the plymouth package (the boot-time splash-screen/input handler), which forces all DRM modules for every GPU to be included. Removing this shotgun approach reduces the size considerably.
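
One way to see the effect is to count the DRM modules in an existing image, using the file listing printed by lsinitramfs (part of initramfs-tools). This is a minimal sketch: the function name count_drm_modules is hypothetical, and it assumes modules live under kernel/drivers/gpu/drm/ and end in .ko, optionally compressed as .gz, .xz or .zst.

```shell
#!/bin/sh
# count_drm_modules
# Read an initramfs file listing on stdin and print how many kernel modules
# under kernel/drivers/gpu/drm/ it contains.
count_drm_modules() {
    grep -Ec '/kernel/drivers/gpu/drm/.*\.ko(\.(gz|xz|zst))?$'
}

# Example:
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | count_drm_modules
```

Running it before and after changing the hook gives a quick measure of how many modules the hook was responsible for.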

Removing modules (and therefore firmware files) not required during initramfs operation

There is scope to identify and omit kernel modules that are not required to mount the real root file-system or interact with the user (for accepting a LUKS passphrase, or in an emergency), which are the only tasks the initramfs is supposed to perform.

This would entail adding a script that captures the list of loaded modules just before the /init script switches to the real root file-system, and using that list when the initramfs is built.
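
A possible shape for such a script is sketched below as an initramfs-tools boot script, which would be placed in /etc/initramfs-tools/scripts/init-bottom/ (scripts there run last, just before control passes to the real root). This is an untested idea rather than part of the current patches. The output path /run/initramfs/modules-at-switch-root and the MODULES_SRC and MODULES_OUT overrides are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch of an init-bottom boot script that records the loaded modules.

# capture_modules SRC OUT
# Copy the module names (first field of each line in SRC) to OUT.
capture_modules() {
    src=$1
    out=$2
    mkdir -p "${out%/*}" || return 1
    while read -r name _rest; do
        printf '%s\n' "$name"
    done < "$src" > "$out"
}

# Standard initramfs-tools boot script preamble: report prerequisites.
PREREQ=""
case "${1:-}" in
    prereqs)
        echo "$PREREQ"
        exit 0
        ;;
esac

# /run is carried over to the real root, so the list survives the switch.
# Both paths are assumptions and can be overridden for experiments.
capture_modules "${MODULES_SRC:-/proc/modules}" \
    "${MODULES_OUT:-/run/initramfs/modules-at-switch-root}" 2>/dev/null || true
```

The saved list could then be read on the running system the next time the initramfs is built, in the same way as the loaded-firmware list.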