Better Code

Swimming in Someone Else's Pool

How I Ruby, Part 2a: Deployment (Ruby)


In this post I’m going to show you how to use ruby-install to build a Debian package of MRI itself, and how to put together a trivial apt repository so you can serve it on your local network. Why do this? First, you aren’t dependent on ftp.ruby-lang.org being available when you want to deploy. Second, you don’t waste time during your deployment rebuilding a ruby binary. I’ve seen builds of ruby 2.0.0 take 10 minutes or so, and it’s no fun waiting for that when you’ve got an urgent redeploy queued up.

You should note that the packages we’ll build here are the simplest possible packages for getting a ruby binary installed on a Debian system. They are the bare minimum required to get that one job done. They bear only a passing resemblance to the quality of package you’ll find either in Debian itself, or in any of the other public repositories offering Ruby packages (like John Leach’s excellent Ubuntu packages, over here). You should use these strictly internally, since we cut several corners to build the simplest packages that could possibly work.

As before, I’m assuming you’re building on and deploying to Debian Stable, but the same instructions should work on Ubuntu. You’ll need ruby-install installed, along with the dpkg-dev package.

Here’s the script which does the bulk of the work. I’d suggest pasting this into an executable file called ruby-install-deb somewhere handy on your $PATH:

ruby-install-deb
#!/usr/bin/env ruby

require 'fileutils'


def run( argv )
  ruby_version=argv.shift or fail "Need an MRI version number"

  making_dirs( ruby_version ) do
    make_makefile( ruby_version )
    make_debian
    add_depends
  end
end


def making_dirs( ruby_version )
  proj_name = "ruby-install-ruby"
  FileUtils.mkdir_p proj_name

  Dir.chdir proj_name do
    pack_name = "#{proj_name}#{ruby_version.tr("-", '')}-1"
    FileUtils.mkdir_p pack_name
    Dir.chdir pack_name do
      yield
    end
  end
end


def makefile( ruby_version )
  makefile=<<-MAKEFILE
DESTDIR?=/
RUBY_VERSION?=#{ruby_version}
RUBY_ROOT=/opt/rubies/ruby-$(RUBY_VERSION)
RUBY_PATH=$(RUBY_ROOT)/bin
RUBY_BIN=$(RUBY_PATH)/ruby

all: root/$(RUBY_BIN)

root/$(RUBY_BIN):
	mkdir -p root
	DESTDIR="$$(cd root/; pwd)" ruby-install -i "$(RUBY_ROOT)" ruby $(RUBY_VERSION)

install:
	mkdir -p $(DESTDIR)
	cp -a root/* $(DESTDIR)

clean:
	rm -rf root/

.PHONY: all install
  MAKEFILE
end


def make_makefile( ruby_version )
  File.open("Makefile", "wb") do |f|
    f.puts( makefile( ruby_version ) )
  end
end


def add_depends
  dependencies = `grep apt: /usr/share/ruby-install/ruby/dependencies.txt | sed 's/apt: //'`.strip.split(/\s+/)
  dependencies.delete "build-essential"
  dep_string = dependencies.join(", ")
  cmd = "sed -i '/^Depends:/s|$|, #{dep_string}|' debian/control"
  system cmd
end


def make_debian
  system "/bin/bash -c 'echo | dh_make --copyright gpl2 --native --single'"
  system "mkdir -p debian.keep"
  system "/bin/bash -c 'cp debian/{rules,control,compat,changelog} debian.keep/'"
  system "rm -rf debian"
  system "mv debian.keep debian"
end


if $0 == __FILE__
  run ARGV
end

You run it like this, specifying the MRI version you want to package on the command-line:

$ ruby-install-deb 2.0.0-p195

This sets up everything you need to build a .deb at ruby-install-ruby/ruby-install-ruby2.0.0p195_1_amd64.deb; we’ll run the actual build in a moment. If you just want to see how to host this package inside your network and don’t want to bother with the details of how the above script works, skip to “Building a repository” below.

Broken down into its component parts, there isn’t much to the script above. We make a directory to build ruby in, construct a Makefile which knows how to use ruby-install, strap in the few parts that define a Debian package using that Makefile, and finally add the runtime dependencies.

I’ll run through the individual parts to highlight some of the details.

1: Top-level sequence

#!/usr/bin/env ruby

require 'fileutils'


def run( argv )
  ruby_version=argv.shift or fail "Need an MRI version number"

  making_dirs( ruby_version ) do
    make_makefile( ruby_version )
    make_debian
    add_depends
  end
end

...


if $0 == __FILE__
  run ARGV
end

Hopefully this is self-explanatory. It’s a literal transliteration of the previous paragraph into Ruby. The one thing worth noting here is that both the directory name and the Makefile depend on the ruby version we’re building. This is due to the vagaries of Debian packaging, which I’ll point out more of as we go along.

2: Directory Structure

def making_dirs( ruby_version )
  proj_name = "ruby-install-ruby"
  FileUtils.mkdir_p proj_name

  Dir.chdir proj_name do
    pack_name = "#{proj_name}#{ruby_version.tr("-", '')}-1"
    FileUtils.mkdir_p pack_name
    Dir.chdir pack_name do
      yield
    end
  end
end

This might look a little odd, but it’s mostly dictated by how building a Debian package works. First, we create a top-level directory to do all our work in. The name of this directory isn’t important.

Next, we construct a nested directory where we’re going to keep the actual package metadata. The name of this directory is important, since the Debian tooling is going to use it to pick up both the package name and the package version. The scheme I’ve chosen here will give us packages which can, in principle, allow you to have more than one ruby version installed at once. Here’s an example.

Say we generate a package for ruby 2.0.0-p247. The name of the working directory we’ll generate will be ruby-install-ruby2.0.0p247-1. When we generate our Debian package metadata in a moment, the tool we use to do it will pick out ruby-install-ruby2.0.0p247 as the package name, and 1 as the package version. If we hadn’t dropped the hyphens with #tr(), and instead gone for ruby-install-ruby-2.0.0-p247, the package we generated would be called ruby-install-ruby, and the package version would be 2.0.0-p247. That would stop us from having more than one ruby version installed at a time. It would also complicate deploying an updated package: as soon as we uploaded a later ruby version, an apt-get upgrade on any host we’d previously installed to would pull in the later version.
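
To see the naming split concretely, here’s the relevant expression from #making_dirs run on its own (just an illustration, not part of the script):

ruby_version = "2.0.0-p247"
pack_name = "ruby-install-ruby#{ruby_version.tr('-', '')}-1"
puts pack_name
# => ruby-install-ruby2.0.0p247-1
# dh_make reads this as package name "ruby-install-ruby2.0.0p247", version "1".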

3: Makefile

def makefile( ruby_version )
  makefile=<<-MAKEFILE
DESTDIR?=/
RUBY_VERSION?=#{ruby_version}
RUBY_ROOT=/opt/rubies/ruby-$(RUBY_VERSION)
RUBY_PATH=$(RUBY_ROOT)/bin
RUBY_BIN=$(RUBY_PATH)/ruby

all: root/$(RUBY_BIN)

root/$(RUBY_BIN):
	mkdir -p root
	DESTDIR="$$(cd root/; pwd)" ruby-install -i "$(RUBY_ROOT)" ruby $(RUBY_VERSION)

install:
	mkdir -p $(DESTDIR)
	cp -a root/* $(DESTDIR)

clean:
	rm -rf root/

.PHONY: all install
  MAKEFILE
end

def make_makefile( ruby_version )
  File.open("Makefile", "wb") do |f|
    f.puts( makefile( ruby_version ) )
  end
end

The #makefile() method generates a Makefile which defines a make all task that calls ruby-install to build ruby. It also contains a make install task to copy the built binaries into place.

When we generate the package metadata, it’ll rely on those two tasks.

The variables at the top of the generated Makefile are worth looking at briefly.

DESTDIR is used by the Debian package building system so the make install task doesn’t actually install the package we’re building to the system we’re building it on. DESTDIR is specified in the GNU coding standards, and all Debian package Makefiles should support it.

Note that we set a different DESTDIR variable to pass to ruby-install in the root/$(RUBY_BIN) task. That DESTDIR is passed to ruby’s own Makefile, so that it doesn’t have the location it’s built at hardwired in.

I’m using the RUBY_* variables to point at the location where ruby-install will drop our compiled ruby binary, so that successive make invocations won’t redo unnecessary work.

4: Dependencies

def add_depends
  dependencies = `grep apt: /usr/share/ruby-install/ruby/dependencies.txt | sed 's/apt: //'`.strip.split(/\s+/)
  dependencies.delete "build-essential"
  dep_string = dependencies.join(", ")
  cmd = "sed -i '/^Depends:/s|$|, #{dep_string}|' debian/control"
  system cmd
end

ruby-install helpfully includes a list of the system packages that ruby depends on, in a format we can almost drop straight into our Debian package definition. The gotcha is that it lists build dependencies, when we actually want runtime dependencies. Since we don’t want a compiler and associated gubbins installed when we apt-get install our shiny new ruby package, we have to drop the build-essential dependency. Fortunately, for all the other entries the runtime library is pulled in as a dependency of the build package: where /usr/share/ruby-install/ruby/dependencies.txt lists libreadline-dev, for instance, libreadline-dev itself depends on libreadline6.

There is a small catch to listing dependencies like this. Once we’ve built the packages, they have to be installed with apt-get install --no-install-recommends. This is because one of the -dev packages recommends g++, which we’re trying not to include. If you prefer not to rely on --no-install-recommends, you could replace the first two lines of #add_depends with:

  dependencies = %w{zlib1g libyaml-0-2 libssl1.0.0 libgdbm3 libreadline6 libncurses5 libffi5}

I don’t like hardcoded lists like that, but it’s a workable approach.

5: Debianise

The next step is to generate the actual package metadata to turn our humble Makefile into a buildable package.

The files involved are a little fiddly, so we cheat. All we want is a package that’s buildable, rather than one that we could hope to submit upstream, so we can lightly abuse the Debian packaging toolkit to make our lives easier.

def make_debian
  system "/bin/bash -c 'echo | dh_make --copyright gpl2 --native --single'"
  system "mkdir -p debian.keep"
  system "/bin/bash -c 'cp debian/{rules,control,compat,changelog} debian.keep/'"
  system "rm -rf debian"
  system "mv debian.keep debian"
end

The critical line is the first, the rest is housekeeping. dh_make is a rather nice little debhelper tool - install it with apt-get install dh-make - which exists to set up the boilerplate you need for a Debian package.

It’s worth taking a look at the generated files. They’ll be in, for instance, the ruby-install-ruby/ruby-install-ruby2.0.0p195-1/debian directory. control and changelog in particular are worth your time, as they both contain fields you can edit to add useful cosmetic information to the final package.

The remainder of this method is concerned with cleaning out a whole load of example configuration files that dh_make generates which we simply don’t need.

Build it

I’ve got the above script saved to ~/bin/ruby-install-deb. I’m going to build a package for ruby 2.0.0-p195. Here’s what I do, and the console output:

$ ruby-install-deb 2.0.0-p195
Maintainer name  : Alex Young
Email-Address    : alex@blackkettle.org
Date             : Mon, 26 Aug 2013 16:19:47 +0100
Package Name     : ruby-install-ruby2.0.0p195
Version          : 1
License          : gpl2
Type of Package  : Single
Hit <enter> to confirm: Done. Please edit the files in the debian/ subdirectory now. You should also
check that the ruby-install-ruby2.0.0p195 Makefiles install into $DESTDIR and not in / .

You can safely ignore the Please edit... message, since we know the Makefile we’ve generated follows the rules.

$ ls
ruby-install-ruby
$ cd ruby-install-ruby
$ ls
ruby-install-ruby2.0.0p195-1
$ cd ruby-install-ruby2.0.0p195-1
$ dpkg-buildpackage -us -uc

This dpkg-buildpackage command is what triggers our ruby-install build. The -us -uc flags are to do with package signing; in short, they say don’t do it. This is another place where packages for public consumption would differ from the ones we’re building here.

Since ruby-install calls sudo, I need to authenticate; then after a while, I get the output:

...
   dh_builddeb
dpkg-deb: building package `ruby-install-ruby2.0.0p195' in `../ruby-install-ruby2.0.0p195_1_amd64.deb'.
 dpkg-genchanges  >../ruby-install-ruby2.0.0p195_1_amd64.changes
dpkg-genchanges: including full source code in upload
 dpkg-source --after-build ruby-install-ruby2.0.0p195-1
dpkg-buildpackage: full upload; Debian-native package (full source is included)

$

And it’s built! Now, it’s dropped packages in the directory above us, so let’s take a look at what it did:

$ cd ..
$ ls -1
ruby-install-ruby2.0.0p195-1
ruby-install-ruby2.0.0p195_1_amd64.changes
ruby-install-ruby2.0.0p195_1_amd64.deb
ruby-install-ruby2.0.0p195_1.dsc
ruby-install-ruby2.0.0p195_1.tar.gz

The .deb is the only file we strictly need.

Building a repository

Having packages is only half the battle. To make the packages easy to install at deploy-time, and to make dependency resolution work for us, we need to set up an apt repository. While we’re at it, we’ll build a couple more ruby packages, to see how they fit together.

Back in our root directory, make a rubies file to list which ruby binaries we want to build. Mine looks like this:

$ cat rubies
1.9.3-p429
2.0.0-p195
2.0.0-p247
$

Now, we can set up a debian directory for each of these like so:

$ cat rubies | xargs -n1 bin/ruby-install-deb
...
$ ls -1 ruby-install-ruby
ruby-install-ruby1.9.3p429-1
ruby-install-ruby2.0.0p195-1
ruby-install-ruby2.0.0p247-1
$

So that’s given us three package definitions to play with. Here’s a small script to build them all, which I’ve got saved to ~/bin/ruby-install-deb-build:

ruby-install-deb-build
#!/usr/bin/env ruby

debdirs = Dir['ruby-install-ruby/*/debian'].
  select{|d| File.directory?( d )}.
  map{|d| File.dirname( d )}

failures = debdirs.each_with_object([]) do |debdir,failures|
  success = Dir.chdir( debdir ) do
    system "dpkg-buildpackage -us -uc"
  end
  failures << debdir unless success
end

unless failures.empty?
  $stderr.puts failures.map {|debdir| "Building #{debdir} failed."}
  exit 1
end

This simply iterates over the debian directories it finds, and issues a dpkg-buildpackage for each one. Unfortunately these can’t run in parallel, because ruby-install calls apt-get install, and that’s guaranteed to fail if more than one instance runs at a time.

Nevertheless, once it’s finished, you can see what it has built:

$ ls -1 ruby-install-ruby
ruby-install-ruby1.9.3p429-1
ruby-install-ruby1.9.3p429_1_amd64.changes
ruby-install-ruby1.9.3p429_1_amd64.deb
ruby-install-ruby1.9.3p429_1.dsc
ruby-install-ruby1.9.3p429_1.tar.gz
ruby-install-ruby2.0.0p195-1
ruby-install-ruby2.0.0p195_1_amd64.changes
ruby-install-ruby2.0.0p195_1_amd64.deb
ruby-install-ruby2.0.0p195_1.dsc
ruby-install-ruby2.0.0p195_1.tar.gz
ruby-install-ruby2.0.0p247-1
ruby-install-ruby2.0.0p247_1_amd64.changes
ruby-install-ruby2.0.0p247_1_amd64.deb
ruby-install-ruby2.0.0p247_1.dsc
ruby-install-ruby2.0.0p247_1.tar.gz

So now we’ve got a bunch of packages. We could stop here. If we did, to install the packages we’ve just built, we’d need to copy the .deb to our new host, and use a tool like gdebi to install the .deb and its dependencies.

Let’s not do that. It’s better to put together an apt repository to act as a local mirror, and to let apt-get install our dependencies as normal.

To put together a minimal repository, all we need to do is gather our files together, build an index, and upload the files and index to a web server. The repository can be served by a purely static server, and doesn’t need any dynamic server support at all.

What we’re building here is what Debian calls a “trivial” repository. Technically they’re deprecated because they don’t support some apt features, but they’re fine for our purposes.

Here’s how we gather the files and build the index:

$ mkdir repo
$ mv ruby-install-ruby/*.deb repo/
$ cd repo
$ dpkg-scanpackages . /dev/null | gzip -9c > Packages.gz

The dpkg-scanpackages step will give you a couple of warnings. You can safely ignore them.

I’ve explicitly ignored the .dsc, .changes and .tar.gz files here. Again, they’re needed for specific apt features, and if all we want to do is build a repository to let us install ruby onto a server of our choice, we don’t need them.

All that’s left now is to upload the contents of repo/ to a convenient webserver. A fresh Debian stable VM with Apache or nginx installed via apt-get will do fine here:

# assuming you've made the `/var/www/ruby-install` directory
# and can upload to it...
$ rsync -az repo/* webserver:/var/www/ruby-install/
$ ssh webserver chown -R www-data: /var/www/ruby-install

Installing from the repository

Ok. So we’ve built packages, made a repository, and made it available. Now we’re ready to actually use the package in deployment. Assuming your new machine is called new-vm, here’s how you do that by hand.

$ ssh -t root@new-vm
root@new-vm:~# echo "deb http://webserver/ruby-install ./" > /etc/apt/sources.list.d/ruby-install.list
root@new-vm:~# apt-get update

At this point, new-vm should know about the packages we’ve built. Let’s check:

root@new-vm:~# apt-cache search ruby-install-ruby --names-only
ruby-install-ruby1.9.3p429 - <insert up to 60 chars description>
ruby-install-ruby2.0.0p195 - <insert up to 60 chars description>
ruby-install-ruby2.0.0p247 - <insert up to 60 chars description>

If you don’t like having the filler info there as the package description, you can edit it in debian/control before building the packages.
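
If you’d rather script that edit than do it by hand, here’s a rough Ruby sketch (run from the package directory; the replacement text is yours to choose). It only touches the one-line summary; the indented long-description lines below it can be edited the same way.

# Swap dh_make's placeholder summary in debian/control for something useful.
control = File.read("debian/control")
control.sub!(/^Description: .*$/, "Description: MRI built via ruby-install")
File.write("debian/control", control)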

Anyway, now apt knows about the packages, and we should be able to install them. We will be asked if they’re safe to install without verification; since they (and the repository) are unsigned, just say “yes”.

root@new-vm:~# apt-get install ruby-install-ruby1.9.3p429 ruby-install-ruby2.0.0p195 --no-install-recommends
...
Setting up ruby-install-ruby1.9.3p429 (1) ...
Setting up ruby-install-ruby2.0.0p195 (1) ...
root@new-vm:~# ls /opt/rubies
ruby-1.9.3-p429
ruby-2.0.0-p195
root@new-vm:~# /opt/rubies/ruby-1.9.3-p429/bin/ruby -v
ruby 1.9.3p429 (2013-05-15 revision 40747) [x86_64-linux]
root@new-vm:~# /opt/rubies/ruby-2.0.0-p195/bin/ruby -v
ruby 2.0.0p195 (2013-05-14 revision 40734) [x86_64-linux]

Hooray! For me, on a fresh local Wheezy VM with 256MB of RAM, apt-get installing a single ruby-install-ruby package, including dependencies, takes a minute. That’s better than compiling from source every time, by quite a long way.

Bootable, Mountable VM Images


The Problem

There’s a classic pain point that anyone building disc images for virtual machines comes across. It’s easy to make a filesystem image in a file, and you can work with it by mounting it as a loopback filesystem. But to get KVM to boot it you’ve got to copy the kernel and initrd out, because GRUB can’t make a bare filesystem image bootable.

If you want a totally self-contained bootable image, you’ve got to mess with kpartx and losetup to make a disc image that includes partition information, just to keep the bootloader happy, and working with that is a pain because it doesn’t loopback mount.

I’ve stumbled across a way to have the best of both worlds: a single file that’s mountable as an ext4 loopback mount, but that’s also bootable.

The key to the trick is the layout of ext{2,3,4} filesystems. Unless you pass a mkfs-time parameter, the first datastructure the filesystem puts on the block device isn’t at the very start. It’s 1024 bytes in. The first 1024 bytes of an ext4-formatted filesystem are therefore available to us for a bootloader. I presume that’s the intent behind leaving the gap, but I can’t find anything in the ext2 docs to confirm it. Perhaps someone more knowledgeable can let me know.
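
If you want to check this for yourself: the superblock of an ext2/3/4 filesystem starts at byte 1024, and its magic number (0xEF53, little-endian) sits 56 bytes into it. Here’s a quick Ruby sketch which reads it back from an image file (I’m assuming a file called wheezy.img, like the one we build below):

# The ext superblock starts at byte 1024; its 16-bit little-endian magic
# (0xEF53) is at offset 56 within it.
magic = File.open("wheezy.img", "rb") do |f|
  f.seek(1024 + 56)
  f.read(2).unpack("v").first
end

puts format("superblock magic: 0x%04x", magic)   # prints 0xef53 on ext2/3/4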

Now, 1024 bytes might not sound like much, and it really isn’t. Grub2 needs a comparatively huge amount of blank space before the first partition to live in. What we need is a bootloader that’s going to live in the filesystem itself, with a teeny shim launcher that we can blit over the first sector of our disc image. I did start writing one, but the modern boot loader specs are just a little more than I found I wanted to tackle.

Fortunately, a little googling dropped extlinux into my lap. It works in precisely the way I’ve described: there’s a little shim MBR that it supplies, which loads just enough code into memory to navigate the ext4 structures and load the rest of its code.

The Script

I’ve knocked together a script which uses debootstrap and extlinux to make a bootable Debian Wheezy disc image from scratch. It’s shown below in full, then I’ll walk through the sections.

#!/bin/bash

set -e

MOUNTPOINT=${1:-"mountpoint"}
DISC_IMG=${2:-"wheezy.img"}
ROOT_PASSWD=${ROOT_PASSWD:-"foobar"}
FS_CACHE=wheezy_cache


if [ ! -f "${FS_CACHE}.tar" ]; then
  mkdir -p ${FS_CACHE}
  sudo debootstrap wheezy ${FS_CACHE} http://mirror.bytemark.co.uk/debian
  sudo tar cf "${FS_CACHE}.tar" -C ${FS_CACHE} .
fi


truncate -s4G ${DISC_IMG}
/sbin/mkfs.ext4 -F ${DISC_IMG}
mkdir -p ${MOUNTPOINT}
sudo mount -o loop ${DISC_IMG} ${MOUNTPOINT}
sudo tar xf "${FS_CACHE}.tar" -C ${MOUNTPOINT}


in-chroot ${MOUNTPOINT} apt-get update
in-chroot ${MOUNTPOINT} apt-get install -yy --force-yes \
  linux-image-amd64 \
  extlinux

in-chroot ${MOUNTPOINT} extlinux --install /boot

KERNEL=$(cd ${MOUNTPOINT}/boot && ls vmlinuz* | tail -n1)
INITRD=$(cd ${MOUNTPOINT}/boot && ls initrd* | tail -n1)
sudo tee ${MOUNTPOINT}/boot/extlinux.conf > /dev/null <<EXTLINUXCONF
default linux
prompt 0
timeout 0

label linux
kernel /boot/${KERNEL}
append initrd=/boot/${INITRD} root=/dev/sda clocksource=kvm-clock serial=tty0 console=ttyS0,115200n8
EXTLINUXCONF

sudo tee -a ${MOUNTPOINT}/etc/inittab > /dev/null <<ETCINITTAB
T0:23:respawn:/sbin/getty -L ttyS0 115200 vt102
ETCINITTAB

dd if=${MOUNTPOINT}/usr/lib/extlinux/mbr.bin of=${DISC_IMG} conv=notrunc

echo "root:${ROOT_PASSWD}" | in-chroot ${MOUNTPOINT} chpasswd

sudo umount ${MOUNTPOINT}

1. Preamble

#!/bin/bash

set -e

MOUNTPOINT=${1:-"mountpoint"}
DISC_IMG=${2:-"wheezy.img"}
ROOT_PASSWD=${ROOT_PASSWD:-"foobar"}
FS_CACHE=wheezy_cache

This is reading command-line parameters and setting up defaults. You call it passing a mountpoint directory name for it to create, and the filename for the filesystem. You can also set the root password for the system with the ROOT_PASSWD environment variable. So far, so obvious.

2. Build the debootstrap

if [ ! -f "${FS_CACHE}.tar" ]; then
  mkdir -p ${FS_CACHE}
  sudo debootstrap wheezy ${FS_CACHE} http://mirror.bytemark.co.uk/debian
  sudo tar cf "${FS_CACHE}.tar" -C ${FS_CACHE} .
fi

This bootstraps the Debian filesystem into a cache directory, then bundles it up into a tarfile. I did this for two reasons: firstly, debootstrap is slow. It’s got to pull down a whole bunch of packages, extract, and configure them. Several minutes’ work. Secondly, caching is kinder on the server we’re pulling from.

3. Construct the disc image file

truncate -s4G ${DISC_IMG}
/sbin/mkfs.ext4 -F ${DISC_IMG}
mkdir -p ${MOUNTPOINT}
sudo mount -o loop ${DISC_IMG} ${MOUNTPOINT}
sudo tar xf "${FS_CACHE}.tar" -C ${MOUNTPOINT}

truncate makes a sparse file of the correct size. Then we format it, mount it, and copy our debootstrapped filesystem into place.

4. Install boot-essential packages

in-chroot ${MOUNTPOINT} apt-get update
in-chroot ${MOUNTPOINT} apt-get install -yy --force-yes \
  linux-image-amd64 \
  extlinux

debootstrap doesn’t actually install a kernel, so we do that here. We also grab the extlinux package. This supplies the files we need for the next two stages…

5. Configure extlinux

in-chroot ${MOUNTPOINT} extlinux --install /boot

This tells extlinux to write a boot record.

KERNEL=$(cd ${MOUNTPOINT}/boot && ls vmlinuz* | tail -n1)
INITRD=$(cd ${MOUNTPOINT}/boot && ls initrd* | tail -n1)
sudo tee ${MOUNTPOINT}/boot/extlinux.conf > /dev/null <<EXTLINUXCONF
default linux
prompt 0
timeout 0

label linux
kernel /boot/${KERNEL}
append initrd=/boot/${INITRD} root=/dev/sda clocksource=kvm-clock serial=tty0 console=ttyS0,115200n8
EXTLINUXCONF

sudo tee -a ${MOUNTPOINT}/etc/inittab > /dev/null <<ETCINITTAB
T0:23:respawn:/sbin/getty -L ttyS0 115200 vt102
ETCINITTAB

Here we find the kernel and initrd which apt-get installed for us, and build the config file extlinux expects to find. For some reason the bundled extlinux-update and extlinux-install scripts don’t work - I haven’t investigated why - so we write out the config file “by hand”. This config file makes extlinux launch straight into the kernel, with no prompt delay.

Also important: we set the serial console lines so we get useful console output when we launch kvm -nographic. The inittab line is just flung in at the end of the file in a particularly inelegant way; if you’re going to use these VMs in anger, I’d suggest cleaning up that file at the very least. You’ll also want to set up the virtio set of modules to load on boot if you’re going to do anything serious.

6. Write the Master Boot Record

dd if=${MOUNTPOINT}/usr/lib/extlinux/mbr.bin of=${DISC_IMG} conv=notrunc

Here we pull extlinux’s supplied MBR from the mounted filesystem, and write it back to the underlying device. Ordinarily this would be suicide. I’m relying heavily on the apparent fact that ext4 doesn’t ever touch the first two sectors of the disc to get away with this.

7. Set the root password and tidy up

echo "root:${ROOT_PASSWD}" | in-chroot ${MOUNTPOINT} chpasswd

sudo umount ${MOUNTPOINT}

Standard stuff here: we set the root password so we can log in once the system is booted, then clear up the mountpoint.

The Result

 $ sudo mount wheezy.img mountpoint -o loop
 $ cat mountpoint/etc/debian_version
7.1
 $ sudo umount mountpoint
 $ kvm -hda wheezy.img -m 256 -nographic < /dev/tty
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.2.0-4-amd64 (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.46-1
...
...
[ ok ] Starting periodic command scheduler: cron.

Debian GNU/Linux 7 pandora ttyS0

pandora login:

Success!

In-chroot Script


This is one of those “use it so much I forget it’s there” scripts. I do a fair amount of system image and package building work. Both of these inevitably involve using chroot, and I can never remember the precise combination of remounts and copies you have to do to make the chroot work “properly”.

Here’s the script I wrote to take care of it for me. It will sudo at the right points, and simply execs its arguments in the chroot (defaulting to /bin/bash if you don’t pass it anything).

#!/bin/bash

if [ -z "${1:-}" ];
then
    echo "usage: ${0} <fsroot>"
    exit 1
fi


if [ -z "${SUDO_USER:-}" ];
then
    sudo "$0" "$@"
    exit $?
fi


FSROOT=${1}
shift

mount proc ${FSROOT}/proc -t proc
mount /dev/pts ${FSROOT}/dev/pts -t devpts
mount /sys ${FSROOT}/sys -t sysfs

cp /etc/resolv.conf ${FSROOT}/etc/resolv.conf

chroot ${FSROOT} "${@:-/bin/bash}"

EXITCODE=$?

umount ${FSROOT}/proc
umount ${FSROOT}/dev/pts
umount ${FSROOT}/sys

exit $EXITCODE

The quoting around executing the command is slightly hairy thanks to bash’s argument array handling, but this does Just Work.

I have this script saved to ~/bin/in-chroot, so I can use it like this:

$ in-chroot wheezy-root apt-get update

or

$ in-chroot wheezy-root
# vi /etc/hosts

and so on. This is something I use almost every day.

How I Ruby, Part 1: Development


In this article I’ll describe the toolset I use to do ruby development. It’s not complicated. It is pleasingly robust. I’ve hacked together a couple of simple tools to make it easier.

I do all my development (in fact, all my everything) on Debian stable (currently Wheezy), but everything here should apply equally to Ubuntu and, with a following wind, OS X.

ruby-install

I install rubies with ruby-install from here. This is a simple installer which doesn’t need re-installing or upgrading whenever a new ruby release comes out.

When run as root, it installs an interpreter under /opt/rubies. As a non-root user, they go under $HOME/.rubies. For development on my own machines, I typically don’t care which of these I go for.

ruby-install will also happily install to a different chosen location, which allows for a neat trick I’ll come back to later.

I like ruby-install because it takes care of remembering which system dependencies each ruby needs. I can never remember the exact list of Debian packages MRI needs to work properly out of the box, so this avoids a google trip each time I set up a new machine.

chruby

Once I’ve got as many rubies as I can eat installed, I use chruby from here to select which one is active at any given time. I don’t use the chruby function to do so, though. The advice given here is good. I do all my ruby project work in subshells, which I launch with chruby-exec. It looks like this:

chruby-exec 1.9.3-p429 -- bash

That gives me a totally contained environment with the $PATH set correctly to use the chosen ruby. This, to me, is a far better way to arrange the environment than relying on a function. The problem with using a function is that when you want to switch back to your original settings, you’re relying on the function being able to accurately undo the changes it made. If you’ve got anything else modifying the same environment variables, the chances of getting this wrong go up dramatically.

With a subshell, when I want to switch environments all I need to do is kill that subshell. Ctrl-d. Any changes I (or any other tool) have made are just thrown away, there’s no need to track any state. The top-level shell is treated as immutable, so you can never get stuck.

gemsh

While chruby has support for switching gem sets with chgems, I don’t use it. I prefer to have a $GEM_HOME in each project directory, right next to the source. This keeps everything nicely separated. gemsh (from here) is a little tool I wrote along the same sorts of lines as Python’s virtualenv to set this up for me. If you run this:

gemsh .gems

then you get the following directory structure:

.gems/
  bin/
    activate
    exec
  gem_home/

.gems/gem_home is where gem install will install gems to. .gems/bin/activate is a chunk of shell script you can source into a shell to set the environment variables to make that happen. .gems/bin/exec is an executable script which sources .gems/bin/activate, and then exec’s its arguments.
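
As a rough illustration of what that amounts to, here’s a sketch in Ruby (this is not gemsh’s actual source; the paths just mirror the layout above):

#!/usr/bin/env ruby
# Sketch of a gemsh-style exec wrapper: point GEM_HOME and GEM_PATH at the
# project-local gem directory, prepend its bin/ to PATH, then exec the
# requested command (or a shell) in that environment.
gem_home = File.expand_path("../../gem_home", __FILE__)

ENV["GEM_HOME"] = gem_home
ENV["GEM_PATH"] = gem_home
ENV["PATH"]     = "#{File.join(gem_home, 'bin')}:#{ENV['PATH']}"

exec(*(ARGV.empty? ? [ENV["SHELL"] || "/bin/bash"] : ARGV))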

All very simple. Here’s how I might install a gem:

.gems/bin/exec gem install bundler

…and the bundler gem will then get installed into .gems/gem_home.

If you’ve been following along, you can probably predict the next step: running a subshell with $GEM_HOME set properly. That’s simple, too:

.gems/bin/exec bash

If I’ve just launched a new terminal and want to load up the complete environment for a project, combining the chruby-exec and gemsh calls gives me this:

chruby-exec 1.9.3-p429 -- .gems/bin/exec bash

Pick the ruby, set up the gemset, and launch the subshell. Simples.

Simpleserer

I got bored of typing out chruby-exec.... .gems/bin/exec... every time, so I wrote a little tool to wrap the two together. It’s called rv, and you can get it here. To set up a project, you call rv-init with the ruby version you want to use:

rv-init 1.9.3-p429

That will create this structure:

.rv/
  1.9.3-p429/
    bin/
      exec
      activate
    gem_home/

Under the bonnet, rv-init just calls gemsh. Now, if you want to be able to run your project under more than one version of ruby, you can just run rv-init again, and give it a different version string.

Now when I come to a project, I run:

rv 1.9.3-p429 bash

…and there’s my subshell with the right ruby and gemset selected. You can leave bash off the command if you want - rv 1.9.3-p429 will default to launching $SHELL. I’ve added the $VIRTUAL_ENV variable (set by gemsh) to my prompt, so I can tell when I’ve got a project environment activated.

The only thing remaining is to add .rv/ to .git/info/exclude to make sure it’s locally ignored.

That’s it

That’s really all there is to it. Counting up the lines of source across chruby, rv and gemsh, that’s as complete a ruby development environment manager as I need, in a couple of hundred lines of bash.

This simplicity is entirely intentional. The more time passes, the more I find myself believing in small tools doing one thing well.

Now, for what this setup doesn’t handle. The two features which play on my mind the most are $GEM_PATH support and auto-activation of project environments.

$GEM_PATH support is needed if, for instance, you want to install a single copy of bundler and have access to that copy from whichever environment you’re working in. It helps prevent duplicate installs (which waste space), and gives a separation between the tools you’re using to work on a project and the code dependencies of that project. The bin/activate script in gemsh hardwires $GEM_PATH to be the same as $GEM_HOME, so you really don’t have access to anything outside the project. This hasn’t irritated me enough yet to do anything about it. You can pass a list of gems to install to rv-init like this:

rv-init 1.9.3-p429 bundler rails

…which takes most of the pain away. Maybe I’ll add $GEM_PATH support to gemsh at some point, but for now I just don’t feel the need.

As far as auto-activation of project environments goes, I don’t like the idea in concept, and have never really felt the need for it. Maybe it’s because most of the projects I work on don’t have a single “default environment”. I’m always testing against another version of ruby, either to try to flush out bugs (running threaded code on 1.8 is unexpectedly useful for this, it’s got a handy deadlock detector) or as preparation for the inevitable N+1 production ruby upgrade.

If I did feel the need for an auto-environment switcher, I’d probably reach for something like autoenv.

The neat trick with ruby-install

The fact that ruby-install doesn’t get between you and the $PREFIX-sensitivity of ruby’s install process means that we can, in principle, install ruby itself into the .rv directory. This is another trick borrowed from virtualenv; I’ve not tried it out yet, but there’s something very enticing about having the project, the interpreter and all its dependencies inside a single directory.

I’m planning on trying this in an experimental rv branch, but if someone else felt like trying it out first…

A final word on Bundler

Without mincing words, managing gem versions is a pain. I use bundler to install gems and lock their versions BUT I don’t use bundler’s cache, its binstubs, or anything else. I use it only to:

  1. turn a Gemfile into a Gemfile.lock; and
  2. to install the contents of that Gemfile.lock into a $GEM_HOME.

I find this reduces the number of things that can possibly go wrong to a tolerable level.

I still believe that these two separate jobs should be done by small, independent tools, but that’s a thought for a future post.

A Go-lang Lisp


Back in 2010, Nurullah Akkaya published his implementation of John McCarthy’s “Micro-manual for Lisp” paper. I thought it might be interesting to port this implementation into Go as a learning exercise. This is my first non-hello-world Go program, so inevitably there’ll be a fair bit of non-idiomatic code here. Bear with me; I’ll update it at some point in the future when I’ve got more Go under my belt.

I won’t go into too much detail; the full source is here.

Interfaces

The interesting part of this for me was being able to replace the idiomatic but, to me, ungainly enum-tagged structs and object * references with Go’s interfaces to be able to take advantage of Go’s more powerful type system.

Here’s the common interface for all the moving parts of my lisp:

type Object interface {
  String() string
}

The contents of the interface aren’t really important - I’m really using this to convince the type system that I know what I’m doing. I’m defining the String() function to make print() work, mainly so that I can just drop my Objects into fmt.Print* and have it Just Work. I did consider adding a function to print to a stream, but I don’t need that yet.

The four basic object types - atoms, cons cells, functions and lambdas - are implemented in the obvious way:

type AtomObject struct {
  name string
}

type ConsObject struct {
  car, cdr Object
}

type FuncObject struct {
  fn ObjFunc
}

type LambdaObject struct {
  args, sexp Object
}

ObjFunc, needed by FuncObject, will be our basic “callable” function type, taking two Objects and returning an Object:

type ObjFunc (func (Object, Object) Object)

LambdaObjects and FuncObjects both also implement the Evaler interface, which simplifies the implementation of eval() slightly:

type Evaler interface {
  Eval( sexp, env Object ) Object
}

Strings

One other major difference between Go and C is in string handling. Go’s creators have obviously learnt a lot about how to fix C’s shortcomings in that area. You can see this at work in my next_token() function:

func next_token( in io.RuneScanner ) Object {
  ch,_,err := in.ReadRune()

  for unicode.IsSpace( ch ) && err == nil {
      ch, _, err = in.ReadRune()
  }
  if err == io.EOF {
      os.Exit(0)
  }

  chstr := string( ch )

  if chstr == ")" {
      return atom(")")
  } else if chstr == "(" {
      return atom("(")
  }

  buffer := make( []rune, 128 )
  var i int

  for i = 0; !unicode.IsSpace(ch) && chstr != ")"; i++ {
      buffer[i] = ch
      ch,_,_ = in.ReadRune()
      chstr = string( ch )
  }

  if chstr == ")" {
      in.UnreadRune()
  }
  
  // The slice here is essential: go strings aren't null
  // terminated, so the full 128-rune length would be returned
  // if we didn't slice it.  This breaks string comparison.
  return atom( string( buffer[0:i] ) )
}

Everything here is done in terms of the rune type, which is completely distinct from the []byte type, has different reader and writer functions, and is generally geared up for unicode input in a way that would have been more cumbersome - and crucially, easier to get wrong - in the original. Also note the gotcha in the comment at the end: if you don’t manage your string lengths explicitly, string comparisons break. I can understand why this was done, but it caught me completely off-guard and I spent a good couple of hours trying to find a mistake I’d made elsewhere.

Finished?

Obviously not. I think this was an excellent introduction to Go, and I’ll certainly be keeping an eye on the ecosystem to see how it evolves. I can certainly see Go becoming a tool in my daily toolbelt. My experience with the toolchain in getting this to work has been nothing but pleasant.

I’m going to keep working on this interpreter as well. There are many, many directions it could go in, but only so many hours in the day. Most importantly, if you’re reading this and you can see anything I can do to improve this code, have at it. Post a gist, and let me know.

Edited to add

I didn’t put a license on the code when I originally posted it. Oops. I’ve added one now; it’s MIT’d.

Rescuing Exception


Over the past few months, I’ve been thinking about when it might be correct to say rescue Exception instead of rescue StandardError, or a more specific exception class. This line of thought was first triggered by a particularly hairy debug session which was made extremely difficult because, unbeknownst to me, some library code did a rescue Exception at the top-level of a Thread, where I was expecting Thread.abort_on_exception to explicitly break and tell me what was happening. The actual exception class was a SystemStackError, being triggered by a faulty recursive method. The above experience led me to the axiom: Never Rescue Exception.

This highlights the problem with a vanilla rescue Exception: it’s way too broad. For writing day-to-day code, it’s very unlikely that you want to rescue the Exception subclasses which are not StandardErrors.

Since then, putting a couple more projects into production has made me think harder about this particular piece of dogma, and I’ve refined it somewhat:

  1. If you are writing a library, Never Rescue Exception.
  2. If you are writing an application (that is, your code is responsible for the top-level process), then Rescue Exception At Most Once.

If you write rescue Exception, here’s what you are claiming about your rescue clause:

  • You can handle running out of memory. (NoMemoryError)
  • You can handle misconfiguration of $LOAD_PATH. (LoadError)
  • Syntax errors don’t matter. (SyntaxError, another ScriptError)
  • You’re responsible for low-level signal handling. (SignalException)
  • You control shutting down the process (SystemExit)
  • …and yes, you know what to do when any rogue method blows the stack. (SystemStackError)

Any time you actually need to handle any of these situations, you’ll definitely know about it, and in most cases all you will want to do is log an error somehow and exit(1).
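
As a sketch of what that looks like at the very top of an application process (MyApp.run is a stand-in for whatever your real entry point is):

#!/usr/bin/env ruby
require 'logger'

logger = Logger.new($stderr)

begin
  MyApp.run(ARGV)                       # stand-in for your real entry point
rescue StandardError => e
  # Ordinary failures: handle, report, retry - whatever the application needs.
  logger.error("#{e.class}: #{e.message}")
  exit 1
rescue Exception => e
  # The one permitted rescue Exception: record what killed the process, then go.
  logger.fatal("#{e.class}: #{e.message}")
  exit 1
end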

There’s only one situation I can think of where it’s reasonable for a library to handle any of these, and that’s where that library is responsible for loading plugin code. If you want to evaluate that code in the main interpreter (and that’s not unreasonable), then you definitely want to isolate the rest of the application from misbehaving plugins so that you can track down the problem more easily. However, in that case the correct exception to be rescuing is ScriptError, not Exception, so the rule still applies.
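
For that plugin-loading case, a minimal sketch (the logger and the plugin path are whatever your plugin system already has to hand):

# Isolate a broken plugin: a file that fails to parse or load shouldn't take
# the host application down, so rescue ScriptError - not Exception - here.
def load_plugin(path, logger)
  load path
  true
rescue ScriptError => e
  logger.error("Failed to load plugin #{path}: #{e.class}: #{e.message}")
  false
end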

Neat Emacs Trick #1


Avdi Grimm has a post on using a shortcut ec script for emacsclient here. I used something like that for a few months, but got quickly frustrated with one peculiar quirk of my development setup.

I do the vast majority of my development in ruby, in a terminal window, and almost all of it is TDD using minitest/unit. Now, when I get a test failure, it looks something like this:

$ ruby -I. test_foo.rb 
Run options: --seed 41149

# Running tests:

F

Finished tests in 0.177699s, 5.6275 tests/s, 5.6275 assertions/s.

  1) Failure:
test_frobnication(TestFoo) [test_foo.rb:7]:
Expected: 23
  Actual: 42

1 tests, 1 assertions, 1 failures, 0 errors, 0 skips

The bit I’m most interested in is this line:

test_frobnication(TestFoo) [test_foo.rb:7]:

Specifically, if I want to take a look at the code around this failing assertion, this is the bit of information I want:

[test_foo.rb:7]

Now, I don’t know if it’s a quirk of my terminal or not, but when I double-click the filename, it selects the string test_foo.rb:7. The same sort of thing happens with exception backtraces, too. The workflow I want is: double-click the filename to copy, type “e c <space>”, middle-click to paste, hit enter. However, that pesky :7 screws everything up. It’s not part of the filename, so I’ve got to edit it out of the command before telling emacsclient to do its thing. This breaks the flow, because I actually have to think in the middle of the sequence.

Doing that also throws away useful information. What if, instead of just opening the file, I could have emacs navigate to the line as well? That would save me two manual operations and save me a little bit of mental effort.

Fortunately, emacsclient doesn’t just open files, it can also remotely run arbitrary elisp expressions. By a bit of trial and error, I found this set of calls would make emacs do what I want, given a separated filename filename and line number linenum:

(let ((buf (find-file filename)))
  (goto-line linenum buf)
  (select-frame-set-input-focus (window-frame (get-buffer-window buf))))

The only question then is, how best to pass this to emacs? Given that I was learning Haskell when I first tackled this, I thought it would be a suitable challenge for my new-found monad-wrangling skills for my first attempt to be called ec.hs:

#!/usr/bin/env runhaskell

import System.Environment (getArgs)
import System.Cmd (rawSystem)

splitIdentifier :: String -> (String, String)
splitIdentifier str =
    let
        cbrk = break (== ':')
        (filename, rest) = cbrk str
        linenum = case rest of
                    ':':stmt -> fst $ cbrk stmt
                    otherwise -> "1"
    in
      (filename, linenum)

buildCommand :: (String, String) -> (String, [String])
buildCommand (filename, linenum) =
    ("emacsclient", ["-ne", lisp])
    where
      lisp = "(let ((buf (find-file \"" ++ filename ++ "\")))" ++
             "(goto-line " ++ linenum ++ " buf)" ++
             "(select-frame-set-input-focus (window-frame (get-buffer-window buf))))"


run (exe, args) = rawSystem exe args

main = do
     identifier:_ <- getArgs
     exitCode <- run $ buildCommand $ splitIdentifier identifier
     putStrLn ""

To say that I am unimpressed with the verbosity of this code would be an understatement. I’d be happy for this to be golfed into oblivion. Nevertheless, it works.

Here’s a ruby version of the same:

#!/usr/bin/env ruby

filename, linenum = *ARGV.shift.split(":")
fail "Filename required" unless filename
linenum ||= 1

system "emacsclient", "-ne", <<-LISP
(let ((buf (find-file "#{filename}")))
  (goto-line #{linenum} buf)
  (select-frame-set-input-focus (window-frame (get-buffer-window buf))))
LISP

Naturally there’s a problem with both of these: filenames can contain colon characters, and both of these scripts will break if they’re given filenames like that. I’m happy enough with that; I don’t think I’ve ever seen a colon in a directory name, and I’ve certainly never written a ruby file with a colon in its name either.

There is, of course, another way to achieve all this, and that’s to make emacs do as much of the work as possible. Stick these in your init.el:

(defun backtrace-find-file (filename linenum)
  (let ((buf (find-file filename)))
    (goto-line linenum buf)
    (select-frame-set-input-focus
      (window-frame (get-buffer-window buf)))))

(defun backtrace-visit (ref)
  (let* ((matches (split-string ref ":"))
   (filename (car matches))
   (linenum (string-to-number (or (cadr matches) "1"))))
    (backtrace-find-file filename linenum)))

Now all we have to do is have emacsclient call backtrace-visit with the correct argument:

#!/bin/bash

emacsclient -ne "(backtrace-visit \"${1}\")"

This is probably the simplest thing that could possibly work.

TDD and Simplicity


Oh, I didn’t write tests for that. It’s too hard to test.

Heard that before? I’ve made this argument myself in the past, but it’s only recently that I’ve figured out precisely why it’s a stupid argument.

On the surface, and in context, it can seem reasonable. The code in question is often either interacting with the OS via system calls, or involved in multi-process coordination, or else tangled up in fiddly thread interaction. These can all be notoriously tricky to get right, and are often just hacked into shape until they “work”.

Nothing I’ve just said justifies not writing unit tests. Quite the reverse.

One of the benefits which proponents of TDD claim is that the code you end up with is less coupled, more clearly factored, and in general easier to understand than otherwise. In general, I’ve found this to be broadly true. I think this happens because TDD guides you towards teasing the problem apart into simple parts which are individually easy to reason about.

With this in mind, it’s clear that an argument for skipping tests when the interactions are complex is an argument to make code which is difficult to get right harder to understand. This does not strike me as a sane rationale.

If you look back at the situations where this argument raises its head, you’ll note that they all involve at least two separate issues: firstly discovering the state of “the system” (where the “system” is either the OS, a bundle of processes, or a bundle of threads), and secondly applying application logic to decide how to change that state. All the cases I’ve seen which provoke the “it’s too hard” response have, to borrow an expression from Rich Hickey, complected the interactions with the system with the decision logic behind the interactions.

Time for an example. In this code, which roughly paraphrases some production code I refactored recently, we’re connecting to a remote host with an SSH master socket, then giving the user a console over that socket. We need to do this so that we can distinguish between a failure to connect to the remote host causing a time-out, and the user successfully connecting and using the console.

class SshConsole

  def initialize( user, host )
    @user = user
    @host = host
  end


  def connect(timeout_time=Time.now+20)
    pid = fork{ exec "ssh -fMN -S/tmp/foo.sock #{@user}@#{@host}" }

    until Time.now > timeout_time
      done_pid, status = Process.waitpid2( pid, Process::WNOHANG )
      unless done_pid
        sleep 1
        next
      end

      # Still here? Then we're done, find out what happened

      # Return if the master socket process was killed:
      return false if status && !status.exited?
      # Return if there was a problem setting up the master socket
      return false unless status && status.exitstatus == 0

      # Still here? Then the master socket setup worked.
      begin
        # Launch the ssh console
        system "ssh -S/tmp/foo.sock #{@user}@{@host}"
      ensure
        # Kill the master socket when we're done
        system "ssh -NOexit -S/tmp/foo.sock #{@user}@#{@host}"
      end
      return true
    end

    # If we fall through to here, the master socket process failed to
    # connect before timeout_time.

    Process.kill("TERM", pid)
    Process.waitpid2( pid )
    return false
  end


end

Hopefully you can see how this is not code that would be simple to unit test. We have a fair amount of system interaction which would necessitate either a lot of messy, fragile mocking, or actual forking of SSH processes and faking of network failures to ensure coverage. Neither of these is ideal, and the reason this is hard is because the code above mixes up discovering the state of the system with acting on that state. I rewrote it to look like this:

class SshMasterProcess
  def initialize( user, host )
    @user = user
    @host = host
    @pid = nil
    @status=nil
  end

  def launch
    @pid = fork{ exec "ssh -fMN -S#{socket} #{@user}@#{@host}" }
  end

  def quit
    system "ssh -NOexit -S#{socket} #{@user}@#{@host}"
  end

  def finished?
    waited_pid, @status = Process.waitpid2( @pid, Process::WNOHANG )
    !!waited_pid
  end

  def killed?
    @status && !@status.exited?
  end

  def succeeded?
    @status && @status.exitstatus == 0
  end

  def kill
    Process.kill(9, @pid)
    Process.waitpid2( @pid )
  end

  def socket
    "/tmp/foo.sock"
  end
end


class SshConsole

  def initialize( user, host, ps_builder=SshMasterProcess )
    @user = user
    @host = host
    @ps_builder = ps_builder
  end


  def connect( timeout_time = Time.now + 20 )
    master = @ps_builder.new( @user, @host )
    master.launch

    until Time.now > timeout_time
      unless master.finished?
        sleep 1
        next
      end

      return false if master.killed?
      return false unless master.succeeded?

      begin
        system "ssh -S#{master.socket} #{@user}@#{@host}"
      ensure
        master.quit
      end
      return true
    end

    master.kill
    return false
  end
end

In a way, this is a fairly typical refactoring: I’ve extracted behaviour into a class to enable dependency injection. The reason it’s interesting in this context is because I’ve isolated all the functionality related to identifying and modifying the state of the SSH master process away from the logic which drives that state. This means that the logic can be tested without going outside the test process. The structure of the logic hasn’t changed at all, but by adding a layer of indirection in the form of the SshMasterProcess class, the code has become simpler and much easier to test.

The take-away message from this is that when code looks hard to test, you should search for overlapping concerns, and keep an eye out for overlap between “system” state and modifications of that state.

Log Level Rationale


Ruby’s Logger class, and the ruby logging ecosystem in general, has grown up with 5 standard log levels: FATAL, ERROR, WARN, INFO and DEBUG. There are no generally accepted guidelines for how these should be used, other than that those “high” on the list should be used in “more serious” situations than those lower down. This is not detailed enough for a useful policy. Better guidelines would mean that your logs would become more predictable and more useful.

I have been experimenting with a set of rules for using the different log levels over time which I think works well, and gives me confidence that logging at a given level will show only the information that I am actually interested in. Here’s what the different levels indicate:

  • FATAL

    The process as a whole can no longer continue. Some fundamental assumption has been violated. We may want to keep track of what happens during the shutdown process, so it’s perfectly possible that the FATAL message is not the last message in the log. There’s no assumption that the next call after the log(:fatal, msg) call is an exit 1 or the like, but if you’re following the “fail early and hard” principle then it probably should be. Out of memory errors are a good example here.

  • ERROR

    A single “request” or “operation” (however those terms are best defined in the context of the application) must terminate because something went wrong, or an assumption pertinent to the current operation has been violated in a way which can be contained. ERROR messages correspond roughly to exceptions in the classic sense: an unexpected event occurred which we cannot, or should not, handle within the main code body, and signal that while the process is safe, the current operation will be dropped on the floor. As with FATAL, this may not be the final log message generated by this operation.

  • WARN

    The current operation has fallen off the happy path in an anticipated way, and we are guiding execution flow within the main body of the code to a failure response. An authentication failure, for instance, might be logged at WARN. The steps involved in handling that authentication failure, however, would not.

  • INFO

    What we are doing, and, if relevant, to whom. INFO messages make no distinction between whether we’re on a happy or sad code path. If they contain run-time data, it is only the minimum required to identify the entities involved in the operation. Since we expect to generate INFO-level messages frequently, they should be terse. An incoming HTTP request would be logged at INFO.

  • DEBUG

    These describe how and why we are doing what we are doing. Like INFO, they do not make any distinction between whether we’re at an execution point corresponding to “good” or “bad” behaviour. On the assumption that INFO messages will also be present at the appropriate points, any run-time data which will be used in a conditional expression would be suitable for a DEBUG message.

You’ll note that INFO and DEBUG are closely related. This is not an accident: I’ll often have pairs of log commands looking like this:

log :info, "Reticulating splines"
log :debug, "Spline parameters: #{params.inspect}"

This set of rules means that I can set the log level to DEBUG for an execution trace, to INFO for a user activity log, and to WARN for a user problem log.

As a purely practical matter, I can’t envisage a situation where I’d raise the log level above WARN, and I’m unlikely ever to raise it above INFO unless I’m using separate appenders. ERROR and FATAL are most useful as locatable tags in a file with more context. That being said, if you don’t want to run your log gathering at a low level all the time, one option might be to dynamically alter the log level. You could lower the log level to INFO for the current operation when you hit a WARN situation, and lower it to DEBUG when you hit an ERROR or FATAL. I’ll be trying this on one project shortly and post when I’ve got a clear picture of how well this works.
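
Here’s a sketch of that idea, wrapping the standard Logger (the class name is made up, and you’d call reset! at the start of each operation):

require 'logger'

# Sketch: a wrapper which drops the logging threshold when things go wrong,
# so the rest of the current operation is captured in more detail.
class EscalatingLogger
  def initialize(logger, default_level = Logger::WARN)
    @logger = logger
    @default_level = default_level
    reset!
  end

  # Call at the start of each operation to restore the usual threshold.
  def reset!
    @logger.level = @default_level
  end

  def debug(msg); @logger.debug(msg); end
  def info(msg);  @logger.info(msg);  end

  # A WARN means we've left the happy path: keep INFO messages from here on.
  def warn(msg)
    @logger.level = [@logger.level, Logger::INFO].min
    @logger.warn(msg)
  end

  # An ERROR or FATAL means we want the full trace: keep DEBUG messages too.
  def error(msg)
    @logger.level = Logger::DEBUG
    @logger.error(msg)
  end

  def fatal(msg)
    @logger.level = Logger::DEBUG
    @logger.fatal(msg)
  end
end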

The thought does occur to me that a hierarchy of levels isn’t the right way to represent what we try to do with logging, but that is a subject for a future post.

Hosting Octopress on BigV


Octopress is a very fun little blogging platform. As you can see, it’s what I’m using here. It’s a static site generator with enough bells and whistles to keep me happy, but which doesn’t get in the way.

Anyway, that’s not the point. The point of this post was to demonstrate how I went about provisioning this server. It’s hard to imagine anything simpler.

First: I needed a server. I’ve been working on BigV, so it seemed appropriate to use a BigV vm as a host.

Here’s the top-to-bottom set of commands I used:

$ bigv group new blackkettle
$ bigv vm new diner.blackkettle \
  --vm-discs=25G \
  --vm-memory=1 \
  --vm-cores=1 \
  --vm-distribution=symbiosis

And that’s it. That gives me a machine with the hostname diner.blackkettle.alex.uk0.bigv.io, 25GiB of disc space, 1GiB of RAM and 1 core. Totally overkill for a static site, but the smallest it’s feasible to provision, and I’ve got a little headroom there if I want to get clever later.

Symbiosis is Bytemark’s Debian derivative which not only takes care of a lot of sensible defaults for web hosting, but also automates away boring configuration details by generating DNS and web server details based on directory names.

I connected to the VM’s serial console with bigv vm connect diner.blackkettle, logged in with the root password generated as part of the imaging process, created the folder /srv/blackkettle.org/public/htdocs, and put my public key in ~/.ssh/authorized_keys*.

At this point (given a following DNS wind), anything uploaded to that directory would be served at http://blackkettle.org/. Octopress needs the webroot to be set in its Rakefile. The relevant section for me looks like this:

ssh_user       = "admin@diner.blackkettle.org"
ssh_port       = "22"
document_root  = "/srv/blackkettle.org/public/htdocs"
deploy_default = "rsync"

Now, publishing the site in all its as-you-see-it-here glory is just a matter of:

$ rake generate
$ rake deploy

And that’s it. Request a VM, make a folder, upload content. Done.


* This shouldn’t have been necessary. The imaging process should have done this for me. It’s possible I have a bug to track down.