Improving Linux Boot Times
Making computers boot fast is cool. How fast can we get a Raspberry Pi Zero W responding to pings over wifi?
Methodology
I’ve written a script that runs ping every 0.1 seconds and counts streaks of timeouts.
I’ll leave that running in the background while I run ssh pi "echo b > /proc/sysrq-trigger" to trigger an immediate reboot.
This isn’t necessarily a perfect measurement of cold boot time, but I think it should give us a good basis for comparison.
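The script itself isn't shown here, so the following is only a minimal sketch of the idea, not the original. The function names are made up for illustration, and it assumes a system `ping` that accepts `-c 1`; the per-probe timeout is enforced from Python rather than with a ping flag, since those flags differ between platforms.

```python
import subprocess
import time


def ping_once(host, timeout_s=0.1):
    """Return True if one echo request to `host` is answered within timeout_s."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def longest_timeout_streak(results):
    """Length of the longest run of False (timed-out) entries in `results`."""
    longest = current = 0
    for ok in results:
        current = 0 if ok else current + 1
        longest = max(longest, current)
    return longest


def monitor(host, interval_s=0.1):
    """Ping forever; report each outage's length once the host answers again."""
    streak = 0
    while True:
        started = time.monotonic()
        if ping_once(host, timeout_s=interval_s):
            if streak:
                print(f"{streak} timeouts, about {streak * interval_s:.1f} s")
            streak = 0
        else:
            streak += 1
        # Sleep off whatever is left of this probe's interval.
        time.sleep(max(0.0, interval_s - (time.monotonic() - started)))
```

Calling `monitor("192.168.1.113")` before triggering the reboot would print one line per outage; a streak of 147 timeouts at a 0.1 s interval corresponds to roughly 14.7 s.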
Establishing a baseline
Building an image
Starting from Buildroot 2025.08.01 raspberrypi0w_defconfig, let’s make some modifications to enable wifi and ssh.
We’ll also enable mdev, a minimal, standalone alternative to systemd’s udev userspace device manager.
mdev will dynamically detect hardware and load appropriate drivers.
After building this image, you’ll need to add configuration files for dropbear, wpa_supplicant, and /etc/network/interfaces. I assigned a static IP since it will be quicker than acquiring a DHCP lease, and set up hooks to launch wpa_supplicant.
auto wlan0
iface wlan0 inet static
pre-up wpa_supplicant -D nl80211 -i wlan0 -c /etc/wpa_supplicant.conf -B
post-down killall -q wpa_supplicant
address 192.168.1.113
netmask 255.255.255.0
Let’s also do something crazy and enable the bootloader UART by patching the bootcode.bin binary on the Pi’s boot partition. I did this by hand on my laptop with:
LC_CTYPE=C sed -i '' 's/BOOT_UART=0/BOOT_UART=1/' bootcode.bin
First impressions
Booting this mostly-stock image yields 147 timeouts, suggesting a time-to-ping of 14.7 s.
Poking around on the image a bit:
# mount /dev/mmcblk0p1 /mnt
# ls -lh /mnt
total 10M
-rwxr-xr-x 1 root root 31.1K Nov 7 2025 bcm2708-rpi-zero-w.dtb
-rwxr-xr-x 1 root root 51.2K Nov 7 2025 bootcode.bin
-rwxr-xr-x 1 root root 65 Nov 7 2025 cmdline.txt
-rwxr-xr-x 1 root root 810 Nov 7 2025 config.txt
-rwxr-xr-x 1 root root 7.2K Nov 7 2025 fixup.dat
drwxr-xr-x 2 root root 30.0K Nov 7 2025 overlays
-rwxr-xr-x 1 root root 2.9M Nov 7 2025 start.elf
-rwxr-xr-x 1 root root 7.1M Nov 7 2025 zImage
# du -hs /lib/modules
21.8M /lib/modules
# lsmod
Module Size Used by
ipv6 512000 14
brcmfmac_wcc 12288 0
brcmfmac 327680 1 brcmfmac_wcc
brcmutil 16384 1 brcmfmac
cfg80211 880640 1 brcmfmac
raspberrypi_hwmon 12288 0
raspberrypi_gpiomem 12288 0
fixed 12288 0
hci_uart 45056 0
btbcm 20480 1 hci_uart
bluetooth 561152 3 hci_uart,btbcm
ecdh_generic 12288 1 bluetooth
ecc 40960 1 ecdh_generic
rfkill 28672 3 bluetooth,cfg80211
libaes 12288 1 bluetooth
uio_pdrv_genirq 12288 0
uio 20480 1 uio_pdrv_genirq
# dmesg | grep Run
[ 4.396567] Run /sbin/init as init process
# dmesg | grep brcmfmac
[ 7.408735] brcmfmac: F1 signature read @0x18000000=0x1541a9a6
[ 7.414928] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43430-sdio for chip BCM43430/1
[ 7.429487] usbcore: registered new interface driver brcmfmac
[ 7.790636] brcmfmac: brcmf_c_process_txcap_blob: no txcap_blob available (err=-2)
[ 7.814734] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM43430/1 wl0: Jul 19 2021 03:24:18 version 7.45.98 (TOB) (56df937 CY) FWID 01-8e14b897
[ 8.263505] brcmfmac: brcmf_cfg80211_set_power_mgmt: power save enabled
To summarize our results:
| PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|
| 4.397 s | 7.409 s | 7.815 s | 8.263 s | 14.7 s |
We will consider the kernel to have booted when it launches PID 1. We are also interested in various messages from brcmfmac, such as when the module has been loaded, when firmware has been loaded, and when power management is enabled. I believe this power management message corresponds to the wpa_supplicant hook in our interfaces file.
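Pulling these timestamps out of dmesg by hand gets tedious across many runs. As a rough sketch, a small helper could find the first line containing a marker string and return its timestamp. The helper name and the choice of marker strings are assumptions made for illustration, picked to match the log lines shown above.

```python
import re

# Matches the "[    4.396567]" prefix that dmesg puts on each line.
_STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

# A few lines copied from the dmesg output above.
SAMPLE = """\
[ 4.396567] Run /sbin/init as init process
[ 7.408735] brcmfmac: F1 signature read @0x18000000=0x1541a9a6
[ 7.814734] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM43430/1 wl0: Jul 19 2021 03:24:18 version 7.45.98 (TOB) (56df937 CY) FWID 01-8e14b897
[ 8.263505] brcmfmac: brcmf_cfg80211_set_power_mgmt: power save enabled
"""


def first_timestamp(dmesg_text, needle):
    """Timestamp (seconds) of the first dmesg line containing `needle`, or None."""
    for line in dmesg_text.splitlines():
        if needle in line:
            match = _STAMP.match(line)
            if match:
                return float(match.group(1))
    return None


# Marker strings (assumed) for the four kernel-side milestones in the table.
MARKERS = {
    "PID 1": "Run /sbin/init",
    "fmac initial": "F1 signature",
    "fmac fw": "Firmware:",
    "fmac final": "power save enabled",
}

milestones = {name: first_timestamp(SAMPLE, needle) for name, needle in MARKERS.items()}
```

Feeding it the full output of `dmesg` instead of `SAMPLE` would give one row of the table per boot.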
Considering the bootloader
If we hook an FTDI serial adapter up to the Pi’s UART, we can see bootloader messages. Here are the complete logs:
Raspberry Pi Bootcode
Read File: config.txt, 810
Read File: start.elf, 2990176 (bytes)
Read File: fixup.dat, 7324 (bytes)
MESS:00:00:00.907776:0: boot-part: 0 fs-type: 0
MESS:00:00:00.910603:0: boot-part: 0 fs-type: 3
MESS:00:00:00.920520:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:00.924899:0: brfs: File read: 810 bytes
MESS:00:00:01.059301:0: HDMI0:EDID error reading EDID block 0 attempt 0
MESS:00:00:01.162749:0: HDMI0:EDID error reading EDID block 0 attempt 1
MESS:00:00:01.266189:0: HDMI0:EDID error reading EDID block 0 attempt 2
MESS:00:00:01.369634:0: HDMI0:EDID error reading EDID block 0 attempt 3
MESS:00:00:01.473072:0: HDMI0:EDID error reading EDID block 0 attempt 4
MESS:00:00:01.576516:0: HDMI0:EDID error reading EDID block 0 attempt 5
MESS:00:00:01.679956:0: HDMI0:EDID error reading EDID block 0 attempt 6
MESS:00:00:01.783400:0: HDMI0:EDID error reading EDID block 0 attempt 7
MESS:00:00:01.886838:0: HDMI0:EDID error reading EDID block 0 attempt 8
MESS:00:00:01.990281:0: HDMI0:EDID error reading EDID block 0 attempt 9
MESS:00:00:01.996199:0: HDMI0:EDID giving up on reading EDID block 0
MESS:00:00:02.002138:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:02.007140:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:02.200431:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:02.206258:0: *** Restart logging
MESS:00:00:02.210134:0: brfs: File read: 810 bytes
MESS:00:00:02.312358:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 0
MESS:00:00:02.416330:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 1
MESS:00:00:02.520292:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 2
MESS:00:00:02.624262:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 3
MESS:00:00:02.728227:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 4
MESS:00:00:02.832197:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 5
MESS:00:00:02.936160:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 6
MESS:00:00:03.040129:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 7
MESS:00:00:03.144094:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 8
MESS:00:00:03.248064:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 9
MESS:00:00:03.254505:0: hdmi: HDMI0:EDID giving up on reading EDID block 0
MESS:00:00:03.357690:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 0
MESS:00:00:03.461660:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 1
MESS:00:00:03.565624:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 2
MESS:00:00:03.669593:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 3
MESS:00:00:03.773559:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 4
MESS:00:00:03.877529:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 5
MESS:00:00:03.981491:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 6
MESS:00:00:04.085460:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 7
MESS:00:00:04.189427:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 8
MESS:00:00:04.293397:0: hdmi: HDMI0:EDID error reading EDID block 0 attempt 9
MESS:00:00:04.299837:0: hdmi: HDMI0:EDID giving up on reading EDID block 0
MESS:00:00:04.305452:0: hdmi: HDMI:hdmi_get_state is deprecated, use hdmi_get_display_state instead
MESS:00:00:04.314197:0: HDMI0: hdmi_pixel_encoding: 162000000
MESS:00:00:04.319926:0: vec: vec_middleware_power_on: vec_base: 0x7e806000 rev-id 0x00002708 @ vec: 0x7e806100 @ 0x00000420 enc: 0x7e806060 @ 0x00000220 cgmsae: 0x7e80605c @ 0x00000000
MESS:00:00:04.346460:0: dtb_file 'bcm2708-rpi-zero-w.dtb'
MESS:00:00:04.353620:0: brfs: File read: /mfs/sd/bcm2708-rpi-zero-w.dtb
MESS:00:00:04.358536:0: Loaded 'bcm2708-rpi-zero-w.dtb' to 0x100 size 0x7c5d
MESS:00:00:04.379484:0: brfs: File read: 31837 bytes
MESS:00:00:04.384216:0: brfs: File read: /mfs/sd/overlays/overlay_map.dtb
MESS:00:00:04.419217:0: brfs: File read: 5947 bytes
MESS:00:00:04.422984:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:04.428808:0: brfs: File read: 810 bytes
MESS:00:00:04.445825:0: brfs: File read: /mfs/sd/overlays/miniuart-bt.dtbo
MESS:00:00:04.474365:0: Loaded overlay 'miniuart-bt'
MESS:00:00:04.528478:0: brfs: File read: 1566 bytes
MESS:00:00:04.532223:0: brfs: File read: /mfs/sd/cmdline.txt
MESS:00:00:04.537052:0: Read command line from file 'cmdline.txt':
MESS:00:00:04.542939:0: 'root=/dev/mmcblk0p2 rootwait console=tty1 console=ttyAMA0,115200'
MESS:00:00:04.566340:0: gpioman: gpioman_get_pin_num: pin EMMC_ENABLE not defined
MESS:00:00:04.636643:0: brfs: File read: 65 bytes
MESS:00:00:05.066760:0: brfs: File read: /mfs/sd/zImage
MESS:00:00:05.070270:0: Loaded 'zImage' to 0x8000 size 0x71a2f0
MESS:00:00:05.075922:0: Device tree loaded to 0x19be8000 (size 0x7f7e)
MESS:00:00:05.083212:0: uart: Set PL011 baud rate to 103448.300000 Hz
MESS:00:00:05.089849:0: uart: Baud rate change done...
MESS:00:00:05.093263:0: uart: Baud rate change done...
MESS:00:00:05.098734:0: gpioman: gpioman_get_pin_num: pin SDCARD_CONTROL_POWER not defined
MESS:00:00:05.106145:0: Watchdog stopped
MESS:00:00:05.109762:0: arm_loader: Starting ARM with 412MB
Let’s break this down. We see the system
- emitting a ton of HDMI EDID errors
- reading config.txt, apparently several times
- loading the device tree files into RAM
- reading the command line file
- loading the kernel to RAM
- loading the assembled device tree into the spot the kernel will presumably be looking for it
- enabling the uart
- and finally starting the kernel
To summarize our key measurements:
| HDMI final | dtb assembled | dtb + kernel loaded | final message |
|---|---|---|---|
| 4.314 s | 4.528 s | 5.076 s | 5.110 s |
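The bootloader timestamps can be extracted the same way. The sketch below assumes the `MESS:` prefix is laid out as hours, minutes, and seconds with a fractional part, which is how the log above reads; the function names are hypothetical.

```python
import re

# Assumed layout: MESS:HH:MM:SS.ffffff:<n>: message
_MESS = re.compile(r"^MESS:(\d+):(\d+):(\d+\.\d+):")


def mess_seconds(line):
    """Seconds since bootloader start for one MESS line, or None if it is not one."""
    match = _MESS.match(line)
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def first_mess_time(log_text, needle):
    """Timestamp of the first MESS line containing `needle`, or None."""
    for line in log_text.splitlines():
        if needle in line:
            stamp = mess_seconds(line)
            if stamp is not None:
                return stamp
    return None
```

For example, searching a captured log for `"arm_loader: Starting ARM"` would yield the "final message" column.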
Piecing it all together
Let’s measure bootloader logs, kernel logs, and time-to-ping.
From the bootloader we get:
| HDMI final | dtb assembled | dtb + kernel loaded | final message |
|---|---|---|---|
| 4.253 s | 4.477 s | 5.031 s | 5.058 s |
| 4.263 s | 4.477 s | 5.032 s | 5.059 s |
| 4.263 s | 4.477 s | 5.032 s | 5.059 s |
Averaging the three runs:
| HDMI final | dtb assembled | dtb + kernel loaded | final message |
|---|---|---|---|
| 4.260 s | 4.477 s | 5.032 s | 5.059 s |
It’s interesting to note that using our sysrq-trigger based method, we shave about 50 ms off the time to final bootloader message.
From Linux we see:
| PID 1 | fmac initial | fmac fw | fmac final |
|---|---|---|---|
| 4.453 s | 7.474 s | 7.882 s | 8.324 s |
| 4.454 s | 7.549 s | 7.948 s | 8.402 s |
| 4.454 s | 7.441 s | 7.860 s | 8.274 s |
Averaging the three runs:
| PID 1 | fmac initial | fmac fw | fmac final |
|---|---|---|---|
| 4.454 s | 7.488 s | 7.897 s | 8.333 s |
And attempting to reconcile our results:
| Ping | fmac + bootloader | mystery time |
|---|---|---|
| 15.0 s | 13.382 s | 1.6 s |
| 14.0 s | 13.461 s | 0.5 s |
| 13.9 s | 13.333 s | 0.6 s |
Averaging the three runs:
| Ping | fmac + bootloader | mystery time |
|---|---|---|
| 14.3 s | 13.392 s | 0.9 s |
Our measurements are accounting for approximately 94% of the time-to-ping. Since we don’t have continuity of measurement between the bootloader and the kernel, it’s hard to say whether there’s an unmeasured delay there. Another source of mystery time may be what happens after our final message from brcmfmac. It seems natural that in a noisy 802.11 environment like my apartment, there would be some variation in the time to establish an RF link.
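The reconciliation is simple arithmetic, sketched here with the per-run numbers from the tables above. It rests on the same assumption as the tables: the kernel clock starts at roughly the moment of the final bootloader message, so the two timelines can simply be added.

```python
# (time-to-ping, final bootloader message, final brcmfmac message), in seconds
runs = [
    (15.0, 5.058, 8.324),
    (14.0, 5.059, 8.402),
    (13.9, 5.059, 8.274),
]


def reconcile(ping, bootloader, fmac_final):
    """Return (explained time, unexplained 'mystery' time) for one run."""
    explained = bootloader + fmac_final
    return explained, ping - explained


explained_total = sum(reconcile(*run)[0] for run in runs)
ping_total = sum(run[0] for run in runs)

# Fraction of the measured time-to-ping that the log timestamps account for.
coverage = explained_total / ping_total

for run in runs:
    explained, mystery = reconcile(*run)
    print(f"ping {run[0]:.1f} s, explained {explained:.3f} s, mystery {mystery:.1f} s")
print(f"coverage: {coverage:.1%}")
```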
I’m pretty happy with the amount of time-to-ping we’ve found an explanation for. I think this gives us a good starting point to optimize.
Optimizing
Bootloader optimizations
I had planned on stripping the kernel first, but the fact that our bootloader is spending three seconds doing nothing but emitting EDID error messages is too horrifying to ignore.
Looking at the Raspberry Pi documentation, I see that we may be able to avoid these EDID issues with hdmi_ignore_hotplug=1.
Our logs now start out much more reasonable:
Raspberry Pi Bootcode
Read File: config.txt, 696
Read File: start.elf, 2990176 (bytes)
Read File: fixup.dat, 7324 (bytes)
MESS:00:00:00.856936:0: boot-part: 0 fs-type: 0
MESS:00:00:00.859763:0: boot-part: 0 fs-type: 3
MESS:00:00:00.869888:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:00.874225:0: brfs: File read: 696 bytes
MESS:00:00:00.898114:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:00.902523:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:01.112375:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:01.118202:0: *** Restart logging
MESS:00:00:01.122078:0: brfs: File read: 696 bytes
MESS:00:00:01.126750:0: hdmi: HDMI:hdmi_get_state is deprecated, use hdmi_get_display_state instead
MESS:00:00:01.135356:0: HDMI0: hdmi_pixel_encoding: 162000000
Rebooting from software, I’m measuring:
| HDMI final | dtb assembled | dtb + kernel loaded | final message |
|---|---|---|---|
| 1.135 s | 1.349 s | 2.495 s | 2.528 s |
| 1.135 s | 1.349 s | 2.492 s | 2.526 s |
| 1.135 s | 1.350 s | 2.494 s | 2.528 s |
Averaging the three runs:
| HDMI final | dtb assembled | dtb + kernel loaded | final message |
|---|---|---|---|
| 1.135 s | 1.349 s | 2.494 s | 2.527 s |
Browsing the internet, I also see people talking about boot_delay with a default value of 1.
I tested setting boot_delay=0 explicitly and didn’t measure any meaningful differences in previously reported values.
I’ve seen various rumours around the web about disabling splash screens and HAT detection.
Casually testing these options didn’t affect the time to final bootloader message at all.
In fact, now that we’ve eliminated those EDID errors, the remaining fruit is not low hanging. There’s only 800 milliseconds between the final HDMI message and the final bootloader message. Only 200 milliseconds are spent reading dtb files and overlays, and that’s with logging enabled. Once we disable serial logging in both the bootloader and the kernel, we won’t need to initialize the uart or load the corresponding overlay. The long pole in the tent is loading the kernel, so let’s start to address that.
Comparing our results to our baseline:
| HDMI final | % | dtb assembled | % | dtb + kernel loaded | % | final message | % |
|---|---|---|---|---|---|---|---|
| 4.258 s | 100 | 4.477 s | 100 | 5.031 s | 100 | 5.058 s | 100 |
| 1.135 s | 26.7 | 1.349 s | 30.1 | 2.494 s | 49.6 | 2.527 s | 50.0 |
If you look at the deltas between stages, it’s clear we haven’t improved anything other than time to final HDMI message. However, that’s enough to cut time to final bootloader message in half.
Kernel optimizations
Stripping out support for most kernel features gets us down to a 3.1M compressed kernel image.
| bootloader | PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|---|
| 2.283 s | 1.232 s | 2.640 s | 3.060 s | 3.420 s | 6.5 s |
| 2.285 s | 1.221 s | 2.620 s | 3.020 s | 3.399 s | 7.3 s |
| 2.287 s | 1.217 s | 2.612 s | 2.999 s | 3.410 s | 6.5 s |
Averaging the three runs:
| bootloader | PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|---|
| 2.285 s | 1.223 s | 2.624 s | 3.026 s | 3.410 s | 6.8 s |
Let’s look at the effect of kernel size on various measurements:
| size | % | bootloader | % | PID 1 | % |
|---|---|---|---|---|---|
| 7.1M | 100 | 2.527 s | 100 | 4.454 s | 100 |
| 3.1M | 43.6 | 2.285 s | 90.4 | 1.223 s | 27.5 |
It’s interesting to see that cutting the kernel size has had a superlinear effect on time to PID 1. We cut the size by 56%, and time to PID 1 by 72.5%.
Our bootloader gains look less impressive because we’re including time the bootloader spent on things other than loading the kernel. We said previously that loading the kernel into RAM was taking us around 500 ms, and we’ve cut that about in half.
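For reference, the percentages in the comparison table can be reproduced as below. This is only a sketch of the arithmetic; the sizes are the rounded 7.1M and 3.1M figures, so the size ratio may differ in the last digit from one computed on exact byte counts.

```python
def percent_of(new, old):
    """`new` expressed as a percentage of `old`."""
    return 100.0 * new / old


size_pct = percent_of(3.1, 7.1)            # kernel image size
bootloader_pct = percent_of(2.285, 2.527)  # time to final bootloader message
pid1_pct = percent_of(1.223, 4.454)        # time to PID 1

# Reductions quoted in the text.
size_cut = 100.0 - size_pct
pid1_cut = 100.0 - pid1_pct

# Absolute bootloader time saved by loading the smaller kernel, in seconds.
bootloader_saved = 2.527 - 2.285

print(f"size cut {size_cut:.0f}%, PID 1 cut {pid1_cut:.1f}%, "
      f"bootloader saved {bootloader_saved * 1000:.0f} ms")
```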
We’re currently loading our wifi drivers as kernel modules. This seems to be necessary for the brcmfmac driver, since it is loading firmware stored on the rootfs. However, we can compile the rfkill and cfg80211 drivers directly into the kernel.
My line of reasoning is that we’re going to need to load rfkill and cfg80211 regardless, and loading them earlier in the boot process can only help us. Now we only need to load one module once we reach userspace, as opposed to three. This will come at the cost of increasing our kernel’s size, but it’s the kind of thing we need to measure to determine whether or not it’s beneficial.
Compiling these drivers into the kernel increased our kernel’s size to 3.3M. Let’s see how that affects our measurements:
| bootloader | PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|---|
| 2.293 s | 1.257 s | 2.449 s | 2.860 s | 3.218 s | 6.2 s |
| 2.296 s | 1.211 s | 2.407 s | 2.806 s | 3.163 s | 6.3 s |
| 2.297 s | 1.232 s | 2.430 s | 2.815 s | 3.169 s | 6.5 s |
Averaging the three runs:
| bootloader | PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|---|
| 2.295 s | 1.233 s | 2.429 s | 2.827 s | 3.183 s | 6.3 s |
Our bootloader is taking a bit longer to load the kernel. The spread on those numbers is tight enough that it seems like more than noise. This makes sense. 0.2M may not seem like a lot, but that’s a 6% increase in size.
We’re averaging a bit longer to reach PID 1, but we’re hitting times below and above our previous iteration so this measurement may just have some spread to it. Meanwhile we do seem to have achieved the objective of getting brcmfmac loaded faster, and this is reflected in our time-to-ping.
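One crude way to make the "more than noise" judgment concrete is to check whether the min-to-max ranges of two sets of runs overlap. This is a sketch with hypothetical helper names, using the per-run numbers from the two tables above. With only three runs each it is a sanity check, not a statistical test.

```python
def spread(samples):
    """Smallest and largest value in a set of runs."""
    return min(samples), max(samples)


def ranges_overlap(a, b):
    """True if the min..max ranges of the two sample sets intersect."""
    a_lo, a_hi = spread(a)
    b_lo, b_hi = spread(b)
    return a_lo <= b_hi and b_lo <= a_hi


# Before and after building rfkill and cfg80211 into the kernel (seconds).
bootloader_before = [2.283, 2.285, 2.287]
bootloader_after = [2.293, 2.296, 2.297]
pid1_before = [1.232, 1.221, 1.217]
pid1_after = [1.257, 1.211, 1.232]

print("bootloader ranges overlap:", ranges_overlap(bootloader_before, bootloader_after))
print("PID 1 ranges overlap:", ranges_overlap(pid1_before, pid1_after))
```

Non-overlapping ranges suggest a real shift; overlapping ones mean the difference could plausibly be run-to-run variation.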
It would be beneficial to come back to this with a first-principles approach and really weigh tradeoffs for features against the time they may add to the boot process. There would also be benefits to more testing, since kernel options can have serious performance implications. It may be that some of the options we stripped out were “worth” more than they saved.
Disabling the console
Now we’ll reverse our earlier sed command to disable UART logging in the bootloader.
We’ll also set the kernel command line to console=null, and add uart_enable=0 to config.txt.
After doing this, we don’t have any way to measure the bootloader performance.
| PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|
| 0.509 s | 1.685 s | 2.049 s | 2.419 s | 4.6 s |
| 0.496 s | 1.670 s | 2.039 s | 2.381 s | 6.6 s |
| 0.559 s | 1.732 s | 2.139 s | 2.443 s | 5.5 s |
Averaging the three runs:
| PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|
| 0.521 s | 1.696 s | 2.056 s | 2.414 s | 5.6 s |
We can see that the time between the start of kernel boot and PID 1 has been cut more than in half. However, since that accounts for the entirety of our 700 ms faster time-to-ping, we’re unable to attribute any speedup to disabling the bootloader UART. This would be an interesting spot to dig in further. If we really are still spending over two seconds in the bootloader, that’s a very significant part of our time-to-ping.
Userspace optimizations
The first question to ask about userspace is, why is it taking over 1 second from PID 1 launching, to brcmfmac beginning to load? Let’s consider how sysvinit style init systems work. The init process executes instructions from /etc/inittab, which in turn runs /etc/init.d/rcS. That script runs other scripts in /etc/init.d with the “start” argument.
Let’s look at /etc/inittab:
# Startup the system
::sysinit:/bin/mount -t proc proc /proc
::sysinit:/bin/mount -o remount,rw /
::sysinit:/bin/mkdir -p /dev/pts /dev/shm
::sysinit:/bin/mount -a
::sysinit:/bin/mkdir -p /run/lock/subsys
::sysinit:/sbin/swapon -a
null::sysinit:/bin/ln -sf /proc/self/fd /dev/fd
null::sysinit:/bin/ln -sf /proc/self/fd/0 /dev/stdin
null::sysinit:/bin/ln -sf /proc/self/fd/1 /dev/stdout
null::sysinit:/bin/ln -sf /proc/self/fd/2 /dev/stderr
::sysinit:/bin/hostname -F /etc/hostname
# now run any rc scripts
::sysinit:/etc/init.d/rcS
# Put a getty on the serial port
console::respawn:/sbin/getty -L console 0 vt100 # GENERIC_SERIAL
tty1::respawn:/sbin/getty -L tty1 0 vt100 # HDMI console
# Stuff to do for the 3-finger salute
#::ctrlaltdel:/sbin/reboot
# Stuff to do before rebooting
::shutdown:/etc/init.d/rcK
::shutdown:/sbin/swapoff -a
::shutdown:/bin/umount -a -r
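To see at a glance what runs before the rc scripts, the sysinit entries can be listed in order. The sketch below assumes the BusyBox inittab layout of `id:runlevels:action:process` and that init runs sysinit entries one after another in file order; the function name is made up for illustration.

```python
def sysinit_commands(inittab_text):
    """Return the process field of every sysinit entry, in file order."""
    commands = []
    for raw in inittab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split into at most four fields so colons inside the command survive.
        fields = line.split(":", 3)
        if len(fields) != 4:
            continue
        _ident, _runlevels, action, process = fields
        if action == "sysinit":
            commands.append(process)
    return commands


# A few entries copied from the inittab above.
SAMPLE = """\
# Startup the system
::sysinit:/bin/mount -t proc proc /proc
null::sysinit:/bin/ln -sf /proc/self/fd /dev/fd
# now run any rc scripts
::sysinit:/etc/init.d/rcS
::shutdown:/sbin/swapoff -a
"""

for position, command in enumerate(sysinit_commands(SAMPLE), start=1):
    print(position, command)
```

Anything listed ahead of `/etc/init.d/rcS` is work that delays the rc scripts, which makes the top of the list the natural place for work we want to start as early as possible.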
We’re doing a lot of work in userspace before we even begin running rc scripts. That makes sense. Our rc scripts generally assume /proc is available, /etc/fstab has been processed, and the system has a hostname. But we can get a jump start on loading brcmfmac by putting it at the top of the list here:
::sysinit:/sbin/modprobe brcmfmac_wcc
| PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|
| 0.563 s | 0.818 s | 1.197 s | 2.491 s | 5.7 s |
| 0.507 s | 0.762 s | 1.103 s | 2.469 s | 5.6 s |
| 0.509 s | 0.764 s | 1.104 s | 2.459 s | 5.6 s |
Averaging the three runs:
| PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|
| 0.526 s | 0.781 s | 1.123 s | 2.473 s | 5.6 s |
This sped up loading the brcmfmac driver and firmware, but didn’t do much to change the time of our final brcmfmac message or time-to-ping.
That’s because the final message is happening when /etc/init.d/S40networking runs ifup, and we haven’t done anything to make that happen sooner.
Let’s remove everything but S40networking and S50dropbear from /etc/init.d. We can also edit our inittab to only do the absolute minimum required for our interface scripts to work.
::sysinit:/sbin/modprobe brcmfmac_wcc
::sysinit:/bin/mkdir -p /dev/pts
::sysinit:/bin/mount -t proc proc /proc
::sysinit:/bin/mount -a
::sysinit:/etc/init.d/rcS
with /etc/fstab:
# <file system> <mount pt> <type> <options> <dump> <pass>
devpts /dev/pts devpts defaults,gid=5,mode=620,ptmxmode=0666 0 0
tmpfs /run tmpfs mode=0755,nosuid,nodev 0 0
sysfs /sys sysfs defaults 0 0
| PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|
| 0.409 s | 0.659 s | 0.951 s | 1.154 s | 4.2 s |
| 0.409 s | 0.659 s | 0.981 s | 1.145 s | 3.3 s |
| 0.409 s | 0.659 s | 0.949 s | 1.152 s | 5.0 s |
Averaging the three runs:
| PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|
| 0.409 s | 0.659 s | 0.961 s | 1.150 s | 4.2 s |
I have no idea how this made us reach PID 1 20% faster. If I ever figure it out, I’ll update the blog post.
It does look like we managed to get ifup happening sooner as we were hoping.
In the real world we probably would not want to remove all our other init scripts to make this happen, but we can imagine ways to do early initialization of the most critical services, and then come back to set up a typical userspace.
Further improvements
I have a few more ideas for improving boot time.
We aren’t getting timestamps from the bootloader after we turned off logging, but when we disabled logging we saw little evidence of substantial speed ups. It seems probable to me that we are still spending two seconds in the bootloader, so that’s ripe for optimization.
I’d like to experiment with an initramfs and/or squashfs, and see if either of those offer meaningful performance improvements over our ext4 rootfs.
I started to play around with overclocking the Pi, but that exposed some race conditions where the init system ran ifup before wlan0 had been created.
I found some interesting solutions based on /proc/sys/kernel/hotplug, using mdev, or just using a shell script.
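The simplest of these approaches is to poll until the interface exists before calling ifup. Here is a rough sketch of that idea, written in Python purely for illustration (on the target a small shell loop would do the same job). It relies on the kernel exposing each registered interface as a directory under /sys/class/net; the function name and parameters are hypothetical.

```python
import os
import time


def wait_for_interface(name, timeout_s=5.0, poll_s=0.01, sysfs_root="/sys/class/net"):
    """Poll until the network interface `name` appears; False if it never does."""
    path = os.path.join(sysfs_root, name)
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A caller would run `wait_for_interface("wlan0")` and only invoke ifup once it returns True. Polling costs up to one poll interval of latency, which is the tradeoff against an event-driven hook such as mdev.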
I’d also like to experiment with building the brcmfmac firmware, and in turn the driver, directly into the kernel. That could speed up load times and/or eliminate the need to handle race conditions in userspace.
Summary
Here’s a summary of our results:
| Name | bootldr | PID 1 | fmac initial | fmac fw | fmac final | Ping |
|---|---|---|---|---|---|---|
| initial | 5.059 s | 4.397 s | 7.409 s | 7.815 s | 8.263 s | 14.7 s |
| boot hdmi | 2.527 s | | | | | |
| strip kernel | 2.295 s | 1.233 s | 2.429 s | 2.827 s | 3.183 s | 6.3 s |
| no console | | 0.521 s | 1.696 s | 2.056 s | 2.414 s | 5.6 s |
| fmac early | | 0.526 s | 0.781 s | 1.123 s | 2.473 s | 5.6 s |
| strip init | | 0.409 s | 0.659 s | 0.961 s | 1.150 s | 4.2 s |