Better Code

Swimming in Someone Else's Pool

Bootable, Mountable VM Images

The Problem

There’s a classic pain point that anyone building disc images for virtual machines comes across. It’s easy to make a filesystem image in a file, and you can work with it easily by mounting it as a loopback filesystem. But to get KVM to boot it you’ve got to copy the kernel and initrd out, because GRUB can’t make a filesystem image bootable.

If you want a totally self-contained bootable image, you’ve got to mess with kpartx and losetup to make a disc image that includes partition information, just to keep the bootloader happy. Working with that is a pain, because a partitioned image can’t be loopback-mounted directly.

I’ve stumbled across a way to have the best of both worlds: a single file that’s mountable as an ext4 loopback mount, but that’s also bootable.

The key to the trick is the layout of ext{2,3,4} filesystems. Unless you pass a mkfs-time parameter, the first data structure the filesystem puts on the block device (the superblock) isn’t at the very start. It’s 1024 bytes in. The first 1024 bytes of an ext4-formatted filesystem are therefore available to us for a bootloader. I presume that’s the intent behind leaving the gap, but I can’t find anything in the ext2 docs to confirm it. Perhaps someone more knowledgeable can let me know.
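
The layout is easy to check from the shell. What follows is a minimal sketch rather than part of the script: the function names are made up for illustration, and it assumes two details of the on-disc format, namely that the superblock starts at byte 1024 and that its magic number 0xEF53 sits 56 bytes into it (stored little-endian, so the bytes read 53 ef).

```shell
#!/bin/bash
# Hypothetical helpers for poking at an ext2/3/4 image file.

# Print the two bytes at offset 1080 (1024 + 56) as lowercase hex.
ext_magic() {
  od -A n -t x1 -j 1080 -N 2 "$1" | tr -d ' \n'
}

# Succeed if the file carries the ext magic number 0xEF53.
is_ext_image() {
  [ "$(ext_magic "$1")" = "53ef" ]
}

# Count the non-zero bytes in the first 1024 bytes of the file.
boot_area_used_bytes() {
  head -c 1024 "$1" | tr -d '\0' | wc -c | tr -d ' '
}
```

On a freshly formatted image I’d expect is_ext_image to succeed and boot_area_used_bytes to print 0; once a bootloader has been written into the gap, the count should be non-zero.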

Now, 1024 bytes might not sound like much, and it really isn’t. GRUB 2 needs a comparatively huge amount of blank space before the first partition to live in. What we need is a bootloader that’s going to live in the filesystem itself, with a teeny shim launcher that we can blit over the first sector of our disc image. I did start writing one, but the modern boot loader specs are just a little more than I found I wanted to tackle.

Fortunately, a little googling dropped extlinux into my lap. It works in precisely the way I’ve described: there’s a little shim MBR that it supplies, which loads just enough code into memory to navigate the ext4 structures and load the rest of its code.

The Script

I’ve knocked together a script which uses debootstrap and extlinux to make a bootable Debian Wheezy disc image from scratch. It’s shown below in full, then I’ll walk through the sections.

#!/bin/bash

set -e

MOUNTPOINT=${1:-"mountpoint"}
DISC_IMG=${2:-"wheezy.img"}
ROOT_PASSWD=${ROOT_PASSWD:-"foobar"}
FS_CACHE=wheezy_cache


if [ ! -f "${FS_CACHE}.tar" ]; then
  mkdir -p ${FS_CACHE}
  sudo debootstrap wheezy ${FS_CACHE} http://mirror.bytemark.co.uk/debian
  sudo tar cf "${FS_CACHE}.tar" -C ${FS_CACHE} .
fi


truncate -s4G ${DISC_IMG}
/sbin/mkfs.ext4 -F ${DISC_IMG}
mkdir -p ${MOUNTPOINT}
sudo mount -o loop ${DISC_IMG} ${MOUNTPOINT}
sudo tar xf "${FS_CACHE}.tar" -C ${MOUNTPOINT}


in-chroot ${MOUNTPOINT} apt-get update
in-chroot ${MOUNTPOINT} apt-get install -yy --force-yes \
  linux-image-amd64 \
  extlinux

in-chroot ${MOUNTPOINT} extlinux --install /boot

KERNEL=$(cd ${MOUNTPOINT}/boot && ls vmlinuz* | tail -n1)
INITRD=$(cd ${MOUNTPOINT}/boot && ls initrd* | tail -n1)
sudo tee ${MOUNTPOINT}/boot/extlinux.conf > /dev/null <<EXTLINUXCONF
default linux
prompt 0
timeout 0

label linux
kernel /boot/${KERNEL}
append initrd=/boot/${INITRD} root=/dev/sda clocksource=kvm-clock serial=tty0 console=ttyS0,115200n8
EXTLINUXCONF

sudo tee -a ${MOUNTPOINT}/etc/inittab > /dev/null <<ETCINITTAB
T0:23:respawn:/sbin/getty -L ttyS0 115200 vt102
ETCINITTAB

dd if=${MOUNTPOINT}/usr/lib/extlinux/mbr.bin of=${DISC_IMG} conv=notrunc

echo "root:${ROOT_PASSWD}" | in-chroot ${MOUNTPOINT} chpasswd

sudo umount ${MOUNTPOINT}

1. Preamble

#!/bin/bash

set -e

MOUNTPOINT=${1:-"mountpoint"}
DISC_IMG=${2:-"wheezy.img"}
ROOT_PASSWD=${ROOT_PASSWD:-"foobar"}
FS_CACHE=wheezy_cache

This is reading command-line parameters and setting up defaults. You call it passing a mountpoint directory name for it to create, and the filename for the filesystem. You can also set the root password for the system with the ROOT_PASSWD environment variable. So far, so obvious.

2. Build the debootstrap

if [ ! -f "${FS_CACHE}.tar" ]; then
  mkdir -p ${FS_CACHE}
  sudo debootstrap wheezy ${FS_CACHE} http://mirror.bytemark.co.uk/debian
  sudo tar cf "${FS_CACHE}.tar" -C ${FS_CACHE} .
fi

This bootstraps the Debian filesystem into a cache directory, then bundles it up into a tarfile. I did this for two reasons: firstly, debootstrap is slow. It’s got to pull down a whole bunch of packages, extract, and configure them. Several minutes’ work. Secondly, caching is kinder on the server we’re pulling from.

3. Construct the disc image file

truncate -s4G ${DISC_IMG}
/sbin/mkfs.ext4 -F ${DISC_IMG}
mkdir -p ${MOUNTPOINT}
sudo mount -o loop ${DISC_IMG} ${MOUNTPOINT}
sudo tar xf "${FS_CACHE}.tar" -C ${MOUNTPOINT}

truncate makes a sparse file of the correct size. Then we format it, mount it, and copy our debootstrapped filesystem into place.
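
To see what “sparse” means in practice, you can compare the file’s length with the space it actually occupies. This is a small sketch with hypothetical helper names, assuming du reports allocated space in 1K units with -k and that the underlying filesystem supports sparse files.

```shell
#!/bin/bash
# Hypothetical helpers comparing a file's length with its on-disc footprint.

# Apparent size in bytes (the file's length).
apparent_bytes() {
  echo $(( $(wc -c < "$1") ))
}

# Space actually allocated to the file, in KiB, as reported by du.
allocated_kib() {
  du -k "$1" | cut -f1
}
```

Straight after truncate -s4G, apparent_bytes should report 4294967296 while allocated_kib stays close to zero; the allocation only grows as mkfs and tar write real data into the image.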

4. Install boot-essential packages

in-chroot ${MOUNTPOINT} apt-get update
in-chroot ${MOUNTPOINT} apt-get install -yy --force-yes \
  linux-image-amd64 \
  extlinux

debootstrap doesn’t actually install a kernel, so we do that here. We also grab the extlinux package. This supplies the files we need for the next two stages…
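
The script leans on an in-chroot command that isn’t defined anywhere in the listing. If you don’t have one to hand, something along these lines should do. It’s a guess at the helper’s behaviour, assuming it is nothing more than a thin wrapper around sudo chroot that takes the root directory first and the command after it.

```shell
#!/bin/bash
# Hypothetical stand-in for the in-chroot helper the script calls.
# Usage: in-chroot <root-directory> <command> [args...]
in-chroot() {
  local root=$1
  shift
  # Everything after the root directory is run inside the chroot as-is.
  sudo chroot "${root}" "$@"
}
```

Standard input passes straight through, which is what the chpasswd step at the end of the script relies on. Depending on the packages you install, you may also need /proc and /dev available inside the chroot; that’s left out of this sketch.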

5. Configure extlinux

in-chroot ${MOUNTPOINT} extlinux --install /boot

This tells extlinux to install itself into the image’s /boot directory and write its boot record.

KERNEL=$(cd ${MOUNTPOINT}/boot && ls vmlinuz* | tail -n1)
INITRD=$(cd ${MOUNTPOINT}/boot && ls initrd* | tail -n1)
sudo tee ${MOUNTPOINT}/boot/extlinux.conf > /dev/null <<EXTLINUXCONF
default linux
prompt 0
timeout 0

label linux
kernel /boot/${KERNEL}
append initrd=/boot/${INITRD} root=/dev/sda clocksource=kvm-clock serial=tty0 console=ttyS0,115200n8
EXTLINUXCONF

sudo tee -a ${MOUNTPOINT}/etc/inittab > /dev/null <<ETCINITTAB
T0:23:respawn:/sbin/getty -L ttyS0 115200 vt102
ETCINITTAB

Here we find the kernel and initrd which apt-get installed for us, and build the config file extlinux expects to find. For some reason the bundled extlinux-update and extlinux-install scripts don’t work (I haven’t investigated why), so we write out the config file “by hand”. This config file makes extlinux launch straight into the kernel, with no prompt delay.
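
One caveat with ls | tail -n1: it sorts names lexically, so if an image ever ends up with more than one kernel installed, 3.2.0-4 would sort after 3.2.0-10. On a fresh debootstrap there’s only one kernel, so the script gets away with it. A possible alternative, sketched here with a hypothetical helper name and assuming a sort that supports -V (GNU coreutils does), is to sort by version instead.

```shell
#!/bin/bash
# Hypothetical helper: print the highest-versioned file in a directory whose
# name starts with the given prefix, using version-aware sorting.
latest_boot_file() {
  local boot_dir=$1 prefix=$2
  # The subshell keeps the cd from leaking into the caller.
  (cd "${boot_dir}" && printf '%s\n' "${prefix}"* | sort -V | tail -n1)
}
```

It would slot in as KERNEL=$(latest_boot_file "${MOUNTPOINT}/boot" vmlinuz), and likewise for the initrd. If nothing matches, the unexpanded pattern is printed, so check the result before trusting it.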

Also important: we set the serial console lines so we get useful console output when we launch kvm -nographic. The inittab line is just flung in at the end of the file in a particularly inelegant way; if you’re going to use these VMs in anger, I’d suggest cleaning up that file at the very least. You’ll also want to set up the virtio set of modules to load on boot if you’re going to do anything serious.
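
For the virtio modules, one way to do it is to list them in the image’s /etc/modules. The helper below is a sketch under a few assumptions: the function name is invented, /etc/modules is taken to be the file Debian reads for modules to load at boot, and virtio_pci, virtio_blk and virtio_net are just my guess at the set you’d want. It only appends names that aren’t already there, so it avoids the blind-append problem the inittab line has.

```shell
#!/bin/bash
# Hypothetical helper: append module names to a modules file (one per line),
# skipping any that are already listed, so repeated runs don't duplicate lines.
add_boot_modules() {
  local modules_file=$1 module
  shift
  for module in "$@"; do
    # -x matches whole lines only, -F treats the name as a literal string.
    grep -qxF "${module}" "${modules_file}" 2>/dev/null ||
      echo "${module}" >> "${modules_file}"
  done
}
```

You’d call it as add_boot_modules "${MOUNTPOINT}/etc/modules" virtio_pci virtio_blk virtio_net while the image is still mounted. The file is owned by root, so run it as root or adapt it to use sudo tee -a the way the script does.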

6. Write the Master Boot Record

dd if=${MOUNTPOINT}/usr/lib/extlinux/mbr.bin of=${DISC_IMG} conv=notrunc

Here we pull extlinux’s supplied MBR from the mounted filesystem, and write it back to the underlying device. Ordinarily this would be suicide. I’m relying heavily on the apparent fact that ext4 doesn’t ever touch the first two sectors of the disc to get away with this.
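
Given how much rides on that, a quick sanity check doesn’t hurt. This is a sketch with a hypothetical function name: it confirms that the boot code is no bigger than the 1024-byte gap, and that the start of the image now matches it byte for byte.

```shell
#!/bin/bash
# Hypothetical check: succeed if the boot code file fits inside the 1024-byte
# gap and the image starts with exactly those bytes.
boot_code_installed() {
  local mbr=$1 image=$2 size
  size=$(( $(wc -c < "${mbr}") ))
  [ "${size}" -le 1024 ] &&
    head -c "${size}" "${image}" | cmp -s - "${mbr}"
}
```

Run it before unmounting, as boot_code_installed "${MOUNTPOINT}/usr/lib/extlinux/mbr.bin" "${DISC_IMG}". Because dd is given conv=notrunc, the image keeps its full length and only the leading bytes change.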

7. Set the root password and tidy up

echo "root:${ROOT_PASSWD}" | in-chroot ${MOUNTPOINT} chpasswd

sudo umount ${MOUNTPOINT}

Standard stuff here: we set the root password so we can log in once the system is booted, then unmount the image.

The Result

 $ sudo mount wheezy.img mountpoint -o loop
 $ cat mountpoint/etc/debian_version
7.1
 $ sudo umount mountpoint
 $ kvm -hda wheezy.img -m 256 -nographic < /dev/tty
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.2.0-4-amd64 (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.46-1
...
...
[ ok ] Starting periodic command scheduler: cron.

Debian GNU/Linux 7 pandora ttyS0

pandora login:

Success!
