Category Archives: Ubuntu

Testing of OpenStack

I’d like to take a couple of minutes of your time to talk about testing of OpenStack. Swift has always had very good test coverage, and Glance also does pretty well, so I’ll mostly be focused on Nova.

(Psst… If you can’t be bothered to read the whole thing, just skip down to the how you can help section.)

Unit tests

Unit tests are by far the easiest to run. They’re right there in the development tree, a simple ./run_tests.sh away. You don’t need a complicated hardware setup, just a source code checkout.

They each exercise a small portion of the code in isolation to verify that they live up to their “contract”. More often than not, this contract is implicit. There’s no documentation of its input, output, or side effects, and maybe there doesn’t have to be. In many cases things get split up simply for readability reasons (larger routines that have grown out of control get split into smaller chunks) or to ease testing, so they’re not actually written expecting to be called from anywhere else. Documentation for all these things would be *awesome*, but a unit test should be the minimum required.
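
To make this concrete, here’s the kind of test I mean. The routine under test is a made-up stand-in, not actual Nova code; the point is that each test pins down one small piece of the (otherwise implicit) contract:

import unittest


def allocate_fixed_ip(available_ips):
    # Hypothetical helper, standing in for the kind of small,
    # implicitly-contracted routine discussed above. Contract: return
    # the first available IP and remove it from the pool.
    if not available_ips:
        raise ValueError("No IP addresses available")
    return available_ips.pop(0)


class AllocateFixedIpTestCase(unittest.TestCase):
    def test_returns_first_available_ip(self):
        ips = ["10.0.0.2", "10.0.0.3"]
        self.assertEqual(allocate_fixed_ip(ips), "10.0.0.2")

    def test_removes_allocated_ip_from_pool(self):
        ips = ["10.0.0.2"]
        allocate_fixed_ip(ips)
        self.assertEqual(ips, [])  # side effect: the pool shrinks

    def test_raises_when_pool_is_empty(self):
        self.assertRaises(ValueError, allocate_fixed_ip, [])


if __name__ == "__main__":
    unittest.main()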

Functional tests

Unit tests are great. However, verifying that each piece of the puzzle does what it says on the tin is of little use if putting them all together doesn’t actually do what you set out to achieve. This is where we use functional tests. An example might be verifying that when you invoke a frontend API method that is supposed to start a virtual machine, a virtual machine actually ends up getting started in a mock hypervisor with all the correct things having been put in place along the way.
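
A rough sketch of what such a test might look like; all the classes here are illustrative stand-ins, not Nova’s actual API:

import unittest


class FakeHypervisor:
    # Stands in for a real hypervisor driver and records what it was
    # asked to do, so the test can inspect it afterwards.
    def __init__(self):
        self.spawned = []

    def spawn(self, instance):
        self.spawned.append(instance)


class ComputeAPI:
    # Minimal stand-in for a frontend API that delegates to a driver.
    def __init__(self, driver):
        self.driver = driver
        self.instances = {}

    def run_instance(self, name):
        instance = {"name": name, "state": "running"}
        self.instances[name] = instance
        self.driver.spawn(instance)
        return instance


class RunInstanceFunctionalTest(unittest.TestCase):
    def test_run_instance_spawns_vm_in_mock_hypervisor(self):
        driver = FakeHypervisor()
        api = ComputeAPI(driver)

        api.run_instance("test-vm")

        # The frontend call should have caused an actual spawn in the
        # mock hypervisor, with the bookkeeping done along the way.
        self.assertEqual(len(driver.spawned), 1)
        self.assertEqual(api.instances["test-vm"]["state"], "running")


if __name__ == "__main__":
    unittest.main()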

In my experience, almost every time an issue is caught by this type of test, it’s an indication that the unit tests are either wrong (e.g. when X goes into a particular routine, it checks that Y comes out, but for everything else to work, Z was actually supposed to come out) or don’t test all the edge cases. So, while a failure at this level should probably involve fixing up (or adding new) unit tests, these tests are indispensable. They verify the cooperation between the various internals, which is easy to miss when staring at each tiny little part in isolation (particularly in a piece of software like Nova that is full of side effects).

(In Nova, functional and unit tests live in the same test suite.)

Integration tests

Unit and black-box tests are great, but what end users see is what really matters. If someone deploys all the various OpenStack components, puts them together, and something ultimately doesn’t work, we’ve failed. It’s all been futile.

Integration tests are often the easiest to write. When dealing with internals, it’s easy to punt on a lot of things like “should this method take this or that as an argument?” or “ideally, this db call shouldn’t live here, but it’ll have to do for now,” but when it comes to what the end user sees, everything must have an answer. We must have firm, concrete, simple, long-lived answers to questions like “If I want to start a virtual machine, what do I do?” and “which argument comes first for this API call?” Hence, writing tests that start a virtual machine and later make sure that it started properly is rather forgiving. It’s also reassuring for end users to know that their exact use cases are verified to work.
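
As an illustration, such a test might look roughly like this. The credentials, endpoint, and IDs are made up, and the exact novaclient API has varied between releases, so treat this as a sketch rather than a recipe:

import time

from novaclient.v1_1 import client

# Illustrative credentials and endpoint -- substitute your own.
nt = client.Client("demo", "secret", "demo-project",
                   "http://example.com:5000/v2.0/")

server = nt.servers.create(name="smoke-test",
                           image="<image-id>",
                           flavor="<flavor-id>")

# Poll until the server is ACTIVE; fail if it errors out or stalls.
deadline = time.time() + 300
while time.time() < deadline:
    server = nt.servers.get(server.id)
    if server.status == "ACTIVE":
        break
    assert server.status != "ERROR", "server went into ERROR state"
    time.sleep(5)
else:
    raise AssertionError("server never became ACTIVE")

server.delete()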

Again, ideally nothing should ever be caught here. If something is, it means it slipped through a crack left by both the unit tests and the black-box tests, or maybe the real KVM doesn’t act like we expected when we wrote its mock counterpart. Everything caught here should end up in a unit test somewhere once the culprit has been found.

Where do we stand today?

Unit and functional tests

As mentioned, Nova’s source tree includes a test suite, comprised of both unit and functional tests. We have a Jenkins job that tracks how much of Nova is being exercised by the test suite. At the time of this writing, we have around 74% coverage. Bear in mind that a line counts as covered if it’s exercised by either a unit test or a functional test (or both, of course). At our last design summit, we agreed that we’d work on improving this coverage, but clearly there’s a long way to go (that number should be in the (very) high nineties).

Integration tests

As for integration tests, there are a number of separate efforts under way.

Where are we going? (a.k.a. how you can help)

Unit and functional tests

I think this is easily where we have the most work to do. Jenkins keeps track of what is covered and what isn’t.

There’s clearly lots of room for improvement. I’d like to encourage anyone who cares about QA to grab a random bit of code that isn’t yet covered by tests and add a test for it. Feel free to start with anything small and seemingly insignificant. We need to get the ball rolling. Small changes also make the review easier.

I’ve started going through our coverage report and filing bugs about missing unit tests. Some are just a few simple statements that need tests, others are entire modules that are almost testless. Take a look and feel free to get in touch if you need help getting started.

Integration tests

Over the next month or so, we’re hoping to collect all these efforts (and any others out there, so please let me know!) into one. The goal is to have a common set of tests that we can run against an OpenStack installation (i.e. all the various components that make up an actual deployment) to get early warning if something should break in a particular configuration. So, if you have anything set up to automatically test OpenStack, please get in touch. If there’s a particular configuration you care about, we want to make sure we don’t break it, so we need your help finding a good way to deploy bleeding-edge OpenStack code onto your test installation and run a bunch of tests against it.

PPA management tools

We use PPAs quite heavily in OpenStack. Each of the core projects has a trunk PPA and a milestone-proposed PPA. Every commit to our bzr trunk branch results in an upload to the trunk PPA, and every commit to our milestone-proposed bzr branch results in an upload to (you guessed it) the milestone-proposed PPA. Additionally, we have a common openstack-release PPA for each of our major releases, where we combine all the projects into one PPA for simpler distribution.

This poses a number of challenges.

We support every Ubuntu release since Lucid, but most of them lack sufficiently new versions of various dependencies (and in some cases, the packages are missing altogether). This means we backport a bunch of things to the various trunk PPAs, and at the right moments we need to copy all these dependencies either from the trunk PPA to the milestone-proposed PPA (when we branch off for a new milestone) or from the milestone-proposed PPA to the common release PPA (at final release time).

This used to involve a lot of mucking around with Launchpad’s web UI, which is not only boring and tedious (checking half a bajillion boxes is even less fun than it sounds), but also error-prone, since it’s all manual.

I decided to write a number of tools to help make this simpler. So far, these tools are:

  • copy-ppa-pkg.py

    Simply copies a package from one PPA to another.

  • detect_ppa_mismatches.py

    This one takes a number of PPAs as arguments and finds packages that exist in more than one of them, but at different versions. During the development cycle this is not much of a problem, since most people only run the trunk version of a single project, but when we shove them all together in one great, big PPA, it could mean that one of the projects suddenly runs against a different version of one of its dependencies than it did during the dev cycle.

  • sync-ppas.py

    This one takes all the packages from one PPA, copies them to another, and removes anything from the destination PPA that’s been removed from the source PPA. It’s handy if you have a PPA with all your stuff in it, it’s all been QA’ed together and is in good shape, and you want to sync it all over into a “stable” PPA in one fell swoop.

  • list-ppa.py

    Lists the contents of a PPA. Simple as that.
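
For a taste of what these tools do under the hood, here’s a minimal sketch of the copy operation using launchpadlib. The PPA owner and names are illustrative, and copyPackage requires a reasonably recent Launchpad API:

from launchpadlib.launchpad import Launchpad

lp = Launchpad.login_with("copy-ppa-pkg", "production")

# Illustrative owner/PPA names -- not the actual OpenStack PPAs.
source = lp.people["example-owner"].getPPAByName(name="trunk")
dest = lp.people["example-owner"].getPPAByName(name="milestone-proposed")

for pub in source.getPublishedSources(status="Published"):
    # copyPackage is asynchronous on the Launchpad side; copying the
    # binaries along means nothing has to rebuild in the destination.
    dest.copyPackage(source_name=pub.source_package_name,
                     version=pub.source_package_version,
                     from_archive=source,
                     to_pocket="Release",
                     include_binaries=True)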

I’ve branched lp:ubuntu-archive-tools and added these tools to lp:~openstack-release/ubuntu-archive-tools/openstack. I can’t really decide if I think they belong in lp:ubuntu-archive-tools, but if someone else wants them, I can look into getting them merged back.

New GPG key – Please help :)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

(Thanks to Colin Watson for the template for this post)

I've finally gotten around to setting up a new, strong (4096 bit) RSA-
based GPG-key, and will be transitioning away from my old 1024 bit DSA
key. The old key will continue to be valid for some time, but I prefer
all future correspondence to use the new one. I would also like to
ensure that this new key is well-integrated into the web of trust. This
message is signed by both keys to certify the transition.

The old DSA key was:

pub   1024D/E8BDA4E3 2002-02-22
      Key fingerprint = 196A 89ED 78F3 9047 2A36  F327 A278 DF5E E8BD A4E3

The new RSA key is:

pub   4096R/9EAAF9C5 2011-06-15
      Key fingerprint = E6BC C692 3553 A464 8514  28D1 EE67 E7D3 9EAA F9C5

To fetch my new key from a public key server, you can run:

  gpg --keyserver subkeys.pgp.net --recv-keys 9EAAF9C5

If you already know my old key, you can now verify that the new key is
signed by the old one:

  gpg --check-sigs 9EAAF9C5

If you don't already know my old key, or if you're extra-paranoid, you
can check the fingerprint against the one given above:

  gpg --fingerprint 9EAAF9C5

If you have previously signed my old DSA key, and if you're satisfied
that you've got the correct new RSA key, then I'd appreciate it if you
would sign my new key as well:

  caff 9EAAF9C5

The caff program is in the signing-party package in Debian and its
derivatives, including Ubuntu. Please be careful to generate signatures
that don't rely on the weakening SHA-1 hash algorithm, which requires
some careful configuration even if you've already configured gpg
correctly. See http://www.gag.com/bdale/blog/posts/Strong_Keys.html for
the gory details.

Thanks,
Soren Hansen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQIcBAEBCgAGBQJN+KNRAAoJEO5n59OeqvnFibwP/jLC+VXFBxMbxqxcXTolr19H
5octVasjlO7Ub9oqH/7/Js168g5gnzSefraWTd0kELc3fXVVsvZuNUtfBhrSYMX5
EQnRr+P1HGi5apFECc3hd7dNQIfbypkQ4UAK3T2nYgkYLATnt04CX0LUz/wnvHJD
fnNRHcM0A8pTn9oJB6LgXpJUspz3pqTFyEoQTkY0/QPcfLbeTLqYG+slSp8+I35H
I+PXl4XrbSsbcJTjpRRllodb1d5sFYZr827ZqPksEeiozGfwpXLZ/DtaIrtE3z3T
AVPCeG/9VCFtcvgqPQnhcbsS6RrGVkE5fUFxgZzERlAxAkkPi+WhwinASmkvOtmE
0m6fkhEeMCYvqvrDoeR8mZvgODIZjP7aIvNKDpWBA9mxC7k171LHdnsluB3xN0sH
++8/w5ESy7GpFxveLk6jR5ytfTxVLUgAASoqJbsxpMqSz/5KNompdFy/Hu13PVel
afqQNfMLjV0QXrKvtmmPSbUs6bWhxwE04jsYAUQcFNBFyHMmYQdA5peikC8ad+JL
WJocQmRUeE6EVRKKSaBXJIihcRHigeTf+6qqaSdbTpeZ1iPSMyETmOW8ZzBaQ7F0
VO65BOOzFsD7Cuaxba427CFPvYo5F4Bi3Dtuwz1PtZlNKExFRuNZ4vhs6dRFenJT
kkeIQrFpJ6wF/DVrutMeiEYEAREKAAYFAk34o1EACgkQonjfXui9pOOVyACfeLci
yfBLmY3L9Abcmg6ggCVQLBAAoKDQfzhmoK5mk26dQToReFpJ80bq
=0yN5
-----END PGP SIGNATURE-----

Moving duplicity (and Deja-Dup) backups

In my last blog post I said that I had moved my backups from an external disk to Rackspace Cloud Files and promised I’d explain how.

Ok, so why bother? I had about 100 GB of data that was being backed up. I didn’t want to upload 99% of that, have my wifi go bonkers, and then have to start over (because Duplicity apparently isn’t very good at resuming). So, instead I wanted to make the initial backup to an external drive (the backup wouldn’t fit on my laptop’s hard drive) and defer copying it to Rackspace as time and connectivity permitted.

That was simple enough.

Once the first, full backup was made, I wanted incremental backups to go directly to Cloud Files, so I needed to get Deja-Dup to realise that there was already a backup on there.

This was the trickier bit.

When you ask Duplicity to interact with a particular backup location, it calculates a hash of the location’s URI and looks that up in its cache to see if it knows about it already. If you’ve made a backup with Deja-Dup, you can go and look in $HOME/.cache/deja-dup. This is what I had:

soren@lenny:~$ ls -l $HOME/.cache/deja-dup/
drwxr-xr-x 2 soren soren 4096 2011-01-14 18:09 4e33cf513fa4772471272dbd07fca5be
soren@lenny:~$

You see a directory named after the hash of the URI of the backup location I used, namely “file:///media/backup” (the MD5 sum of which is 4e33cf513fa4772471272dbd07fca5be).

Inside this directory, we find:

soren@lenny:~$ ls -l /home/soren/.cache/deja-dup/4e33cf513fa4772471272dbd07fca5be/
-rw------- 1 soren soren 750938885 Jan 14 15:47 duplicity-full-signatures.20110113T170937Z.sigtar.gz
-rw------- 1 soren soren    653487 Jan 14 15:47 duplicity-full.20110113T170937Z.manifest
soren@lenny:~$

It contains a manifest and a signature file. These files have no record of the backup location; that information exists only in the name of the directory. Essentially, all I needed to do was rename the directory to match the Cloud Files location. Being a bit cautious, I decided to copy it instead. The URI for a container on Cloud Files looks like “cf+http://containername”. Knowing this, it was as simple as:

soren@lenny:~$ echo -n 'cf+http://lenny' | md5sum
2f66137249874ed1fdc952e9349912d4 -
soren@lenny:~$ cd $HOME/.cache/deja-dup
soren@lenny:~/.cache/deja-dup$ cp -r 4e33cf513fa4772471272dbd07fca5be 2f66137249874ed1fdc952e9349912d4

The -n option to echo is essential. Without it, I’d have been calculating the MD5 sum of the URI with a trailing newline.

Before I ran Deja-Dup again, I made sure the two files above were copied to Cloud Files. If I hadn’t, the first time Duplicity talked to Cloud Files, it would realise that these files didn’t exist at the expected backup location, conclude that the local cache of them must be invalid, and delete them. This happened to me the first time, so making a copy rather than just renaming the directory turned out to be a good idea.

All that was left to do now was to change my backup location in Deja-Dup. This should be simple enough, so I won’t go into detail about that.

The best part about this, I think, is that it wasn’t until 5-6 days later that my upload of the initial full backup finished. In the meantime, however, I was able to do incremental backups just fine, because all Duplicity needs for that are the signature files from the previous runs.

Oh, and to actually upload the files, I used the “st” tool from Swift. Something like this:

soren@lenny:~$ cd /media/backup
soren@lenny:/media/backup$ st -A https://auth.api.rackspacecloud.com/v1.0 -U soren -K 6e6f742061206368616e636521212121 upload lenny *

It only took me 20 years..

tl;dr: I now have daily backups of my laptop, powered by Rackspace Cloud Files (powered by OpenStack), Deja-Dup, and Duplicity.

I’ve been using computers for a long time. If memory serves, I got my first PC when I was 9, so that’s 20 years ago now. At various times, I’ve set up some sort of backup system, but I always ended up

  • annoyed that I couldn’t actually *use* the biggest drive I had, because it was reserved for backups,
  • annoyed because I had to go and connect the drive and do something active to get backups running, because having the disk always plugged into my system might mean the backup got toasted along with my active data when disaster struck,
  • and annoyed at a bunch of other things.

Cloud storage solves the hardest part of this. With Rackspace Cloud Files, I have access to an infinite[1] amount of storage. I can just keep pushing data, Rackspace keeps it safe, and I pay for exactly how much space I’m using. Awesome.

All I need is something that can actually make backups for me and upload them to Cloud Files. I’ve known about Duplicity for a long time, and I also knew that it’s been able to talk to Cloud Files for a while, but I never got into the habit of running it at regular intervals. Running it from cron was annoying, because maybe I didn’t have my laptop on when it wanted to run, and if I wasn’t logged in, my homedir would be encrypted anyway, etc., etc. Lots of chances for failure.

Enter Deja-Dup! Deja-Dup is a project spearheaded by my awesome former colleague at Canonical, Mike Terry. It uses Duplicity on the backend and gives me a nice, really simple frontend to get it set up. It has its own timing mechanism that runs in my GNOME desktop session. This means it only runs when my laptop is on and I’m logged in. Every once in a while, it checks how long it’s been since my last backup. If it’s more than a day, an icon pops up in the notification area that offers to run a backup. I’ve only been using this for a day, so it’s only asked me once. I’m not sure if it starts on its own if I give it long enough.

A couple of caveats:

  • Deja-Dup needs a very fresh version of libnotify, which means you need to either be running Ubuntu Natty, use backported libraries, or patch Deja-Dup to work with the version of libnotify in Maverick. I opted for the last of these.
  • I have a lot of data. Around 100 GB worth. Some of it is VMs, some of it is code, some of it is various media files. Duplicity doesn’t support resuming a backup if it breaks halfway, and I “only” have 8 Mbit/s upstream bandwidth. That meant I had to stay connected to the Internet for 28 hours straight (in a perfect world) and not have anything unexpected happen along the way. I wasn’t really interested in that, so I made my initial backup to an external drive, and I’m now copying the contents of that to Rackspace at my own pace. I can stop and resume at will. The tricky part here was to get Deja-Dup to understand that the backup it thinks is on an external drive really is on Cloud Files. I’ll save that for a separate post.

[1]: Maybe not actually infinite, but infinite enough.

Hudson and VMBuilder

Unhappy with the current state of VMBuilder, I recently decided to take a look at Hudson, hoping it can help improve quality going forward. Hudson is a “continuous integration” tool: a tool you use to apply quality control continuously, rather than only when you’re feeling bored or when a release is imminent.

I’ve set up Hudson with a number of jobs:

  • One monitors the VMBuilder trunk bzr branch. Whenever something changes there, it downloads it, runs pylint on it, runs the unit tests (pylint and unit test setup with help from a blog post by Joe Heck), and rolls a tarball. Finally, it triggers the next job..
  • ..which builds an Ubuntu source package out of it, and triggers the next job..
  • ..which signs and uploads it to the VMBuilder PPA that I recently blogged about..
  • Last, but certainly not least, I’ve set up the very first completely automated, end-to-end VMBuilder test. It grabs the freshest tarball from Hudson, copies it to a reasonably beefy server, builds a VM, boots it up, and upon successful boot, reports back that it all worked, and Hudson is happy. It doesn’t exercise all the various plugins of VMBuilder (not even close), but it’s a start!

VMBuilder in Lucid == lots of fail

Let it be no secret that I’m unhappy with the state of VMBuilder in Lucid (and in general for that matter). Way too many regressions crept in and I didn’t have time to fix them all. I still expect to do an SRU for all of this, but every time I try to attack the bureaucracy involved in this, I fail. I need to find a few consecutive hours to throw at this very soon.

Anyways, in an effort to make testing easier, I’ve set up a PPA for VMBuilder.

I’ve set up a cloud server that monitors the VMBuilder trunk bzr branch. If there’s been a new commit, it rolls a tarball, builds a source package out of it, and uploads it to that PPA. That way, adventurous users can grab packages from there and test things out before they go into an SRU. To add the PPA, simply run this command:

sudo add-apt-repository ppa:vmbuilder/daily

I’m also working on a setup that will automatically test these packages. The idea is to fire up another cloud server, make it install a fresh VMBuilder from that PPA, perform a bunch of tests, and report back. To do this, I’m injecting an upstart job into the instance that

  1. adds the ppa,
  2. installs vmbuilder,
  3. builds a VM, which (using the firstboot option) will call back into the host when it has booted successfully,
  4. sets up a listener waiting for this callback,
  5. waits a set amount of time for this callback (a rough sketch of the listener side follows this list).
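
The listener half of that handshake can be quite simple. Here’s a minimal sketch; the port and timeout are arbitrary choices for illustration:

import socket

CALLBACK_PORT = 1234  # the firstboot script would connect back here
TIMEOUT = 3600        # give the build and boot an hour

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", CALLBACK_PORT))
listener.listen(1)
listener.settimeout(TIMEOUT)

try:
    conn, addr = listener.accept()
    print("VM at %s called back -- build looks good." % addr[0])
    conn.close()
except socket.timeout:
    print("No callback within %d seconds -- assuming failure." % TIMEOUT)
finally:
    listener.close()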

If I get a response in a timely manner, I assume all is well. If not, it’ll notify me somehow.

The idea is to make it run a whole bunch of builds to attempt to exercise as much of the code base as possible.

I’ll try to make a habit of blogging about the progress on this, as I know a lot of people are aggravated by the current state of affairs, and this way they can see that something is happening.

Not an April fool’s joke

Today marks the beginning of my second month working for Rackspace.

I’ve realised I haven’t actually blogged about my leaving Canonical, so this post doubles as an announcement about that, I suppose.

A lot of thought was put into that decision. Ubuntu is an awesome project to work on and Canonical was a fun and interesting “place” to work, but “all good things must come to an end,” so I decided to “quit while I was ahead.” Come up with more clichés if you feel like it. The short story is that I just wasn’t having much fun anymore.

Rackspace came along as an interesting option. I’ve known about them since forever, and they are doing very interesting stuff in the cloud computing area, so it seemed like a natural progression. I had a few interviews and after we overcame some initial difficulties (they’re not that used to having people from Denmark work for them) I started my new job working on Cloud Sites on March 1st.

This does not mean that I’m going to stop working on Ubuntu, though. It’ll just be on my own time and working on a narrower set of things than I have for a while. I also hope to be at UDS (I’ve applied for sponsorship) so that I can meet all my awesome, old colleagues.

Automated regression testing of server packages

Just a quick FYI: I’ve set up some magic to automatically rebuild a set of server packages every day to see if their regression test suites still pass. The current list of source packages:

  • libvirt
  • postgresql-8.3
  • postgresql-8.4
  • mysql-dfsg-5.0
  • mysql-dfsg-5.1
  • openldap
  • php5

If there are other server packages that run their test suites at build time, please let me know so that I can add them to this list.

The packages are uploaded to the ubuntu-server-autotest/regression-test PPA.