Author Archives: Soren

OpenStack design tenets

Before OpenStack even had a name, it had its basic design tenets. The wiki history reveals that Rick wrote these down as early as May 2010, two months before OpenStack was officially launched. Let’s take a look at them:

  1. Scalability and elasticity are our main goals
  2. Any feature that limits our main goals must be optional
  3. Everything should be asynchronous
    • a) If you can’t do something asynchronously, see #2
  4. All required components must be horizontally scalable
  5. Always use shared nothing architecture (SN) or sharding
    • a) If you can’t Share nothing/shard, see #2
  6. Distribute everything
    • a) Especially logic. Move logic to where state naturally exists.
  7. Accept eventual consistency and use it where it is appropriate.
  8. Test everything.
    • a) We require tests with submitted code. (We will help you if you need it)

Now go and look at every single OpenStack diagram of Nova ever presented. Either they look something like this:

Nova diagram

or they’re lying.

Let’s focus our attention for a minute on the little thing in the middle labeled “nova database”. It’s immediately obvious that this is a shared component. That means tenet 5 (“Always use shared nothing architecture (SN) or sharding”) is out the window.

Back in 2010, the shared database was Redis, but since the redisectomy, it’s been MySQL or PostgreSQL (through SQLAlchemy). MySQL and PostgreSQL are ACID compliant, the very opposite of eventually consistent (bye bye, tenet 7). They’re wicked fast and scale very, very well. Vertically. Adios, tenet 4.

Ok, so what’s the score?

Tenet 1: Scalability and elasticity are our main goals.

Tenet 2: Any feature that limits our main goals must be optional

Tenet 3: Everything should be asynchronous

~~Tenet 4: All required components must be horizontally scalable~~

~~Tenet 5: Always use shared nothing architecture or sharding~~

Tenet 6: Distribute everything (Especially logic. Move logic to where state naturally exists).

~~Tenet 7: Accept eventual consistency and use it where it is appropriate.~~

Tenet 8: Test everything.

Is everything asynchronous? Hardly. I see 258 instances of RPC call (synchronous RPC methods) vs. 133 instances of RPC cast (asynchronous RPC methods). How often each is called is anybody’s guess, but clearly there’s a fair amount of synchronous stuff going on. Sayonara, tenet 3.

Is everything distributed? No. No, it’s not. Where does the knowledge of an individual compute node’s capacity to accept new instances naturally exist? On the compute node itself. Where is the decision made about which compute node should run a new instance? In nova-scheduler. Sure, the scheduler is actually a scale-out internal service in the sense that there could be any number of them, but it’s making decisions on other components’ behalf. Tschüß, tenet 6.

Are we testing everything? Barely. Nova’s most recent test coverage percentage at the time of this writing is 83%. It’s much better than it once was, but there’s still a ways to go up to 100%. Adieu, tenet 8.

We can’t really live without a database, nor a scheduler, so auf Wiedersehen, tenet 2.

We’re left with:

Tenet 1: Scalability and elasticity are our main goals.

~~Tenet 2: Any feature that limits our main goals must be optional~~

~~Tenet 3: Everything should be asynchronous~~

~~Tenet 4: All required components must be horizontally scalable~~

~~Tenet 5: Always use shared nothing architecture or sharding~~

~~Tenet 6: Distribute everything (Especially logic. Move logic to where state naturally exists).~~

~~Tenet 7: Accept eventual consistency and use it where it is appropriate.~~

~~Tenet 8: Test everything.~~

So, the question that remains: with all of the above in mind, are scalability and elasticity *really* still our main goals?

On productivity – Part I

I’ve been trying for literally years to really get Getting Things Done under my skin. I’ve read the book several times, each time gaining new insights and for a while inching towards actually using it. For some reason, I always fail at it. I’ve never really worked out why. It all makes perfect sense. I believe it’s a fantastic system, but I just can’t seem to internalise the process.

This weekend, I stumbled upon a post on Milo Casagrande’s blog where he mentioned the Pomodoro Technique. I’ve always enjoyed reading articles and the like on productivity, but had somehow never heard of the Pomodoro Technique.

I read the paper and it’s a delightfully straightforward system. It explains clearly, with examples and everything, how you can use the Pomodoro Technique.

The Getting Things Done book talks a lot about concepts and process in very generic terms and (intentionally) avoids imposing tools on the readers. I understand the motivation, but I think it’s misguided. Whenever I meet someone who practices GTD, I always try to get them to explain as much as possible about the practical implementation, their choice of tools, etc, because without this, it’s hard to really get it started. I’ve spent a *lot* of time trying to find good tools, writing tools, etc., but I always wind up with something that doesn’t really work for me, so I was really excited to learn about another, popular productivity system.

I’m going to try some techniques out this week to see if I can combine GTD and the Pomodoro Technique somehow. GTD outlines some excellent concepts for organising your action items, reference material, keeping track of things you’re waiting for others to complete, as well as some very useful ways to review how well the stuff you’re doing hour by hour aligns with your short, medium, and long term goals, but — for me at least — falls short with respect to helping me actually get started with something. Ironic, really, for a system called “Getting Things Done”, but that’s a different story.


All the things wrong with monitoring today – Part 3

Erk. I found this sitting around as a draft:

Today’s normality is tomorrow’s abnormality

Last time, we looked at a disk usage graph. This week, we’ll look at CPU usage or something else that goes up and down instead of just up, up, up.

What’s the problem here? The problem is that it’s very hard to set up an alert for this. Some things are simply spiky by nature. Sometimes, that’s perfectly fine. Perhaps the load on this particular application is evenly distributed throughout the day, but at night it runs a bunch of batch processing jobs that pegs the CPU for a couple of hours. For this sort of thing, you have a couple of options in terms of monitoring/alerting.

  • Don’t monitor CPU load.
  • Accept being alerted about this every single night.
  • Ignore CPU load during the time of day when this job runs.

All of these options suck.

  • You can’t just not monitor the CPU load. If you’re suddenly at 100% for an hour during the day, something’s wrong!
  • You don’t want to be alerted by something that is normal. That’s silly. You want your monitoring system to only alert you about stuff that’s worth waking up over.
  • Ignoring the CPU load based on the time of day is a step in the right direction, but this is not an isolated case. You probably have many different services, all with different usage patterns. I also don’t really want to think about what it would do to your configuration files if you had to specify different thresholds for every hour of the day (and every day of the week, etc).

Think about that last option a bit.. What would you use to define expected/acceptable levels? Pure guesswork? Of course not. You’ll use the data you already have. Maybe you’ve run this for a while and have cute graphs that can tell you what is expected. But seriously… Looking at graphs from your monitoring system and using them to type configuration back into your monitoring system? That’s the most ridiculous thing I’ve ever heard (yes, I should probably get out more).

Why can’t the monitoring system just tell me when something is out of the ordinary? It has all the data in the world to make that call. If a metric is unusual for that time of day, on that day of week, at that time of year, let me know. If it’s very unusual, send me a text message. Otherwise, I probably don’t care.
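
Concretely, here’s roughly the kind of check I have in mind. This is a minimal sketch, assuming you already have a history of (timestamp, value) samples lying around (which you do; it’s your monitoring data); the function names and thresholds are made up for illustration:

  # Sketch: judge a metric against its own history for the same hour of
  # day and day of week. `history` is a list of (datetime, value) pairs.
  from datetime import datetime
  from statistics import mean, stdev

  def comparable_samples(history, now):
      """Keep only samples taken at the same hour of day and day of week."""
      return [value for (ts, value) in history
              if ts.hour == now.hour and ts.weekday() == now.weekday()]

  def classify(current_value, samples, warn_sigma=2.0, page_sigma=4.0):
      """Return 'ok', 'unusual' or 'very unusual' for current_value,
      judged against historical samples from comparable times."""
      if len(samples) < 10:
          return 'ok'                    # not enough history to judge yet
      mu = mean(samples)
      sigma = stdev(samples) or 1e-9     # avoid dividing by zero on flat data
      deviation = abs(current_value - mu) / sigma
      if deviation >= page_sigma:
          return 'very unusual'          # worth a text message
      if deviation >= warn_sigma:
          return 'unusual'               # worth a quiet notification
      return 'ok'

  # e.g.: classify(current_cpu, comparable_samples(history, datetime.now()))

A rolling mean and standard deviation is about as naive as anomaly detection gets, but even that would beat static thresholds typed in by hand.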


All the things wrong with monitoring today – Part 2

I took much longer to post this than intended. Not that I expect anyone to have been sitting on the edge of their chair waiting for it, but still…

It’s been more than a month since my last post, and not a darned thing has changed. Monitoring today still sucks. In the last installment I ranted and moaned about “active” monitoring and how there’s all this information you’re not collecting that is being lost. This time I’ll bemoan the sorry handling of the data we actually do collect.

Temporal tunnelvision

Let’s for the sake of this argument pretend that “pinging” a web service is actually a useful thing to do. A typical scenario is this: A monitoring server tries to fetch some URL. If it takes less than a second to respond, it’s considered UP (typically resulting in a calming green row in the monitoring system). Kinda like this:

If it takes more than a second, but less than, say, 5 seconds, it’s considered WARNING (typically indicated by a yellow row in the monitoring system), and if it hasn’t responded within 5 seconds, it’s considered DOWN (resulting in a red row).

Transitions between states often result in an alert being sent out. These alerts typically contain the actual data point that triggered the transition:

"Oh, noes! HTTP changed state to WARNING. It took 1.455 seconds to respond."

It’s sad really, but the data point mentioned in the alert and the most recent one you can see in the monitoring system’s web UI are often the only “record” of these data points. “Sad? Who cares? It’s all in the past!”.. *sigh* No. A wise man once said “those who ignore history are doomed to get bitten in the arse by it at some point” (paraphrasing ever so slightly). Here’s why:

Let’s look at a typical disk usage graph:


Sure, your graphs may be slightly bumpier, but this is basically how these things look. It doesn’t take a Ph.D. in statistics to figure out where that blue line is headed (towards the red area, if you hadn’t worked it out).

Say that’s the graph for the last week. If you imagine extending the line, you can see that the disks will be full in about another week, and within the red area just a couple of days from now. Yikes.

The point here is that if you were limited by the temporal tunnelvision of today’s monitoring systems, all you’d have seen was a green row all along. You’d think everything was fine until it suddenly wasn’t. Sadly, lots of people happily ignore this information on a daily basis. Even if they actually do collect this information and make pretty graphs out of it, it’s not something you go and look at very often to see these trends. It’s used mostly as a debugging tool after the fact (“Oh, I just got an alert that the disk on server X is running full… Yup, the graph confirms it.”).

I’m not advocating spending all your precious time sifting through graphs, looking for metrics on a collision course with disaster. Sure, if you only have a few servers, it’s not that big of a deal to look at the disk usage graphs every couple of days and see where they’re headed. If you have a thousand servers, though, it’s a pretty big deal.

So what am I advocating? I want a monitoring system that doesn’t just tell me when a disk has entered the yellow area on the shit-is-about-to-hit-the-fan-o-meter. I want a monitoring system that tells me when the next filesystem is likely to enter the yellow area on said meter. See? Instead of a “current problems list”, I want a “These are the next problems you’re likely to have to deal with” list. I want it to feed into my calendar, so that I don’t accidentally schedule a day off on the same day /var on my db server is going to run full. I want it to automatically add a TODO item to my Remember the Milk account telling me to buy more/bigger drives for my file server.
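
The prediction part isn’t rocket science, either. Here’s a minimal sketch, assuming the monitoring system can hand you its own history of samples; the data source and the threshold are made up:

  # Sketch: predict when a filesystem will hit the "yellow" threshold by
  # fitting a straight line to its recent usage history. `usage_history`
  # is a hypothetical list of (unix_timestamp, percent_used) samples.
  def predict_threshold_crossing(usage_history, threshold=90.0):
      """Return the estimated unix timestamp at which usage reaches
      `threshold` percent, or None if usage isn't trending upwards."""
      n = len(usage_history)
      xs = [t for (t, _) in usage_history]
      ys = [u for (_, u) in usage_history]
      x_mean = sum(xs) / n
      y_mean = sum(ys) / n
      # Ordinary least-squares slope and intercept.
      slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) /
               sum((x - x_mean) ** 2 for x in xs))
      if slope <= 0:
          return None
      intercept = y_mean - slope * x_mean
      return (threshold - intercept) / slope

Run that over every filesystem you monitor, sort by the predicted date, and you have your “next problems you’re likely to have to deal with” list.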

It shouldn’t be that hard!


All the things wrong with monitoring today – Part 1

Monitoring today sucks. Big time. It sucks so bad, it’s not even funny. The amount of time spent configuring stuff, dealing with problems when it’s already too late, and the amount of things your monitoring system could be monitoring, but isn’t, are all staggering. I’ll be spending a couple of posts whining about this. Who knows? Maybe I have a solution up my sleeve by the end of it.

Active vs. passive monitoring

Active monitoring.. It sounds cool. Way cooler than passive. Most of the time, if you have a choice between an active and a passive something, you go with the active one, right? Well, not this time.

The number of times I’ve seen people set up their monitoring system to access an HTTP URL specially crafted to be useless, to simply respond to the probe as quickly as possible, is ridiculous. It’s surely active, but it’s almost entirely useless. Sure, if this is a service no one uses, it’s probably fine, but if this is a service that has almost any sort of real-world use, in the customary 5 minutes between each of these “pings”, there will have been dozens, scores, if not hundreds or thousands of actual requests. Requests that actually did something. Exercised your service at least to some extent. Sadly, this information is almost universally ignored.

Telling Apache to log the amount of time it took to serve a request is trivial. Collecting this information is trivial. Feeding that data to your monitoring system (if not on a per-request basis, just a maximum request time over the last 10 seconds would be a vast improvement) really shouldn’t be too hard. So why don’t you?
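
For the record: with Apache, that’s a one-line change, adding %D (the time taken to serve the request, in microseconds) to your LogFormat. The collection side can be as dumb as the sketch below; the metric name and the submit_metric function are stand-ins for whatever your monitoring system actually accepts:

  # Sketch: follow an Apache access log whose LogFormat ends in %D (the
  # time taken to serve the request, in microseconds) and report the worst
  # request time seen in each 10-second window.
  import time

  def submit_metric(name, value):
      # Stand-in for whatever your monitoring system's intake looks like.
      print('%s %d %.3f' % (name, time.time(), value))

  def follow_request_times(logfile, interval=10.0):
      worst = 0.0
      deadline = time.time() + interval
      with open(logfile) as f:
          f.seek(0, 2)                     # start at the end of the log
          while True:
              line = f.readline()
              if line:
                  try:
                      worst = max(worst, int(line.split()[-1]) / 1e6)
                  except ValueError:
                      pass                 # line didn't end in a %D field
              else:
                  time.sleep(0.1)
              if time.time() >= deadline:
                  submit_metric('apache.max_request_time', worst)
                  worst, deadline = 0.0, time.time() + interval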


Moving on..

Seeing as the election for the OpenStack Project Policy Board is going on, it seems only fair to announce that quite soon I will no longer be working for Rackspace. Instead, I will be working (still on OpenStack) for Nebula. If this is material to your vote, I apologise for not disclosing this earlier, but it simply wasn’t finalised until a bit earlier this week.


Testing of OpenStack

I’d like to take a couple of minutes of your time to talk about testing of OpenStack. Swift has always had very good test coverage, and Glance also does pretty well, so I’ll mostly be focused on Nova.

(Psst… If you can’t be bothered to read the whole thing, just skip down to the how you can help section.)

Unit tests

Unit tests are by far the easiest to run. They’re right there in the development tree, a simple ./run_tests.sh away. You don’t need a complicated hardware setup, just a source code checkout.

They each exercise a small portion of the code in isolation to verify that it lives up to its “contract”. More often than not, this contract is implicit. There’s no documentation of its input, output, or side effects, and maybe there doesn’t have to be. In many cases things get split up simply for readability reasons (larger routines that have grown out of control get split into smaller chunks) or to ease testing, so they’re not actually written expecting to be called from anywhere else. Documentation for all these things would be *awesome*, but a unit test should be the minimum required.
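
To make “contract” a little more concrete, here’s the shape of thing I mean. This is an invented example, not actual Nova code:

  # A made-up example of the kind of small, isolated test I mean.
  import unittest

  def parse_flavor_name(name):
      """Split a flavor name like 'm1.small' into its family and size."""
      family, _, size = name.partition('.')
      if not size:
          raise ValueError('malformed flavor name: %r' % name)
      return family, size

  class ParseFlavorNameTestCase(unittest.TestCase):
      def test_well_formed_name(self):
          self.assertEqual(parse_flavor_name('m1.small'), ('m1', 'small'))

      def test_malformed_name_raises(self):
          self.assertRaises(ValueError, parse_flavor_name, 'm1small')

  if __name__ == '__main__':
      unittest.main()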

Functional tests

Unit tests are great. However, verifying that each piece of the puzzle does what it says on the tin is of little use if putting them all together doesn’t actually do what you set out to achieve. This is where we use functional tests. An example might be verifying that when you invoke a frontend API method that is supposed to start a virtual machine, a virtual machine actually ends up getting started in a mock hypervisor with all the correct things having been put in place along the way.
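
In outline, such a test looks roughly like the sketch below. All the names are invented for illustration; Nova’s real code path from the API down to the driver is, of course, considerably longer:

  # Invented example: exercise the "start an instance" entry point and
  # check that the fake hypervisor driver ends up with a running instance.
  # FakeDriver and ComputeAPI are stand-ins, not Nova's real classes.
  import unittest

  class FakeDriver(object):
      def __init__(self):
          self.instances = {}

      def spawn(self, instance_id):
          self.instances[instance_id] = 'running'

  class ComputeAPI(object):
      def __init__(self, driver):
          self.driver = driver

      def run_instance(self, instance_id):
          # The real thing involves scheduling, DB records, networking, ...
          self.driver.spawn(instance_id)
          return instance_id

  class RunInstanceFunctionalTestCase(unittest.TestCase):
      def test_run_instance_reaches_the_hypervisor(self):
          driver = FakeDriver()
          api = ComputeAPI(driver)
          api.run_instance('instance-0001')
          self.assertEqual(driver.instances.get('instance-0001'), 'running')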

In my experience, almost every time an issue is caught by this type of test, it’s an indication that the unit tests are either wrong (e.g. when X goes into a particular routine, it checks that Y comes out, but for everything else to work, Z was actually supposed to come out) or don’t test all the edge cases. So, while a failure at this level should probably involve fixing up (or adding new) unit tests, these tests are indispensable. They verify the cooperation between the various internals, which is easy to miss when staring at each tiny little part in isolation (particularly in a piece of software like Nova that is full of side effects).

(In Nova, functional and unit tests all live in the same test suite)

Integration tests

Unit and functional tests are great, but what end users see is what really matters. If someone deploys all the various OpenStack components and puts them together and something ultimately doesn’t work, we’ve failed. It’s all been futile.

Integration tests are often the easiest to write. When dealing with internals, it’s easy to punt on a lot of things like “should this method take this or that as an argument?,” “ideally, this db call shouldn’t live here, but it’ll have to do for now,” etc., but when it comes to what the end user sees, everything must have an answer. We can’t not have firm, concrete, simple, long-lived answers to questions like: “If I want to start a virtual machine, what do I do?,” “which argument comes first for this API call?,” etc. Hence, writing tests that start a virtual machine and then later make sure that it started properly is rather forgiving. It’s also reassuring to end-users to know that their exact use cases are verified to work.
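
As an illustration, an integration test is essentially the sketch below. The endpoint, payload, and lack of authentication are simplifications rather than a faithful rendering of the OpenStack API, and a real test would use a proper client:

  # Sketch of an end-to-end test: boot an instance through the public API
  # and wait for it to go ACTIVE. Endpoint and payload are placeholders.
  import time
  import unittest
  import requests

  API = 'http://openstack.example.com:8774/v1.1/servers'   # placeholder

  class BootInstanceTestCase(unittest.TestCase):
      def test_boot_instance(self):
          resp = requests.post(API, json={'server': {'name': 'smoke-test',
                                                     'imageRef': '1',
                                                     'flavorRef': '1'}})
          self.assertEqual(resp.status_code, 202)
          server_id = resp.json()['server']['id']
          deadline = time.time() + 300
          while time.time() < deadline:
              server = requests.get('%s/%s' % (API, server_id)).json()['server']
              if server['status'] == 'ACTIVE':
                  return
              time.sleep(5)
          self.fail('instance never became ACTIVE')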

Again, ideally nothing should ever be caught here. If anything is, it means that something slipped through a crack left by both the unit tests and the functional tests, or maybe the real KVM doesn’t act like we expected when we wrote its mock counterpart. Everything caught here should end up in a unit test somewhere once the culprit has been found.

Where do we stand today?

Unit and functional tests

As mentioned, Nova’s source tree includes a test suite comprising both unit and functional tests. We have a Jenkins job that tracks how much of Nova is being exercised by the test suite. At the time of this writing, we have around 74% coverage. Bear in mind that a particular line counts as covered if it is exercised by either a unit test or a functional test (or both, of course). At our last design summit, we agreed that we’d work on improving this coverage, but clearly there’s a long way to go (that number should be in the (very) high nineties).

Integration tests

As for integration tests, there are a number of separate efforts:

Where are we going? (a.k.a. how you can help)

Unit and functional tests

I think this is easily where we have the most work to do. Jenkins keeps track of what is covered and what isn’t:

There’s clearly lots of room for improvement. I’d like to encourage anyone who cares about QA to grab a random bit of code that isn’t yet covered by tests and add a test for it. Feel free to start with anything small and seemingly insignificant. We need to get the ball rolling. Small changes also make the review easier.

I’ve started going through our coverage report and filing bugs about missing unit tests. Some are just a few simple statements that need tests, others are entire modules that are almost testless. Take a look and feel free to get in touch if you need help getting started.

Integration tests

Over the next month or so, we’re hoping to collect all these efforts (and any others out there, so please let me know!) into one. The goal is to have a common set of tests that we can run against an OpenStack installation (i.e. all the various components that make up an actual deployment) to get early warning if something should break in a particular configuration. So, if you have anything set up to automatically test OpenStack, please get in touch. If there’s a particular configuration you care about, we want to make sure we don’t break it, so we need your help finding a good way to deploy bleeding-edge OpenStack code onto your test installation and run a bunch of tests against it.

PPA management tools

We use PPAs quite heavily in OpenStack. Each of the core projects has a trunk PPA and a milestone-proposed PPA. Every commit to our bzr trunk branch results in an upload to the trunk PPA, and every commit to our milestone-proposed bzr branch results in an upload to (you guessed it) the milestone-proposed PPA. Additionally, we have a common openstack-release PPA for each of our major releases, where we combine all the projects into one PPA, for simpler distribution.

This poses a number of challenges.

We support every Ubuntu release since Lucid, but most of them lack new enough versions of various dependencies (and in some cases, the packages are missing altogether). This means we backport a bunch of things to the various trunk PPAs, and at the right moments we need to copy all these dependencies either from the trunk PPA to the milestone-proposed PPA (when we branch off for a new milestone) or from the milestone-proposed PPA to the common release PPA (at final release time).

This used to involve a lot of mucking around with Launchpad’s web UI, which is not only boring and tedious (checking half a bajillion boxes is even less fun than it sounds), but also error prone, since it’s all manual.

I decided to write a number of tools to help make this simpler. So far, these tools are:

  • copy-ppa-pkg.py

    Simply copies a package from one PPA to another (a rough sketch of what that involves follows this list).

  • detect_ppa_mismatches.py

    This one takes a number of PPAs as arguments and finds packages that exist in more than one of them, but at different versions. During the development cycle, this is not much of a problem, since most people only run the trunk version of a single project, but when we shove them all together in one great, big PPA, it could mean that one of the projects suddenly runs against a different version of one of its dependencies than it did during the dev cycle.

  • sync-ppas.py

    This one takes all the packages from one PPA, copies them to another, and removes anything from the destination PPA that has been removed from the source PPA. It’s handy if you have a PPA with all your stuff in it, it’s all been QA’ed together and is in good shape, and you want to sync it all over into a “stable” PPA in one fell swoop.

  • list-ppa.py

    Lists the contents of a PPA. Simple as that.
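
They’re all fairly thin wrappers around launchpadlib. As a rough idea of the general shape (a sketch, assuming Launchpad’s copyPackage API; the team, PPA names and package are placeholders, and error handling is omitted), copying a package from one PPA to another boils down to:

  # Rough sketch of what copy-ppa-pkg.py boils down to, using launchpadlib.
  # The team, PPA names and package are placeholders, and copyPackage may
  # require the 'devel' API version.
  from launchpadlib.launchpad import Launchpad

  lp = Launchpad.login_with('copy-ppa-pkg', 'production', version='devel')

  team = lp.people['openstack']                              # placeholder
  source = team.getPPAByName(name='trunk')                   # placeholder
  dest = team.getPPAByName(name='milestone-proposed')        # placeholder

  # Find the currently published nova source in the source PPA...
  [pub] = source.getPublishedSources(source_name='nova', status='Published',
                                     exact_match=True)

  # ...and ask Launchpad to copy it, binaries and all, to the destination.
  dest.copyPackage(source_name=pub.source_package_name,
                   version=pub.source_package_version,
                   from_archive=source,
                   to_pocket='Release',
                   include_binaries=True)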

I’ve branched lp:ubuntu-archive-tools and added these tools to lp:~openstack-release/ubuntu-archive-tools/openstack. I can’t really decide whether I think they belong in lp:ubuntu-archive-tools, but if someone else wants them, I can look into getting them merged back.

New GPG key – Please help :)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

(Thanks to Colin Watson for the template for this post)

I've finally gotten around to setting up a new, strong (4096 bit) RSA-
based GPG-key, and will be transitioning away from my old 1024 bit DSA
key. The old key will continue to be valid for some time, but I prefer
all future correspondence to use the new one. I would also like to
ensure that this new key is well-integrated into the web of trust. This
message is signed by both keys to certify the transition.

The old DSA key was:

pub   1024D/E8BDA4E3 2002-02-22
      Key fingerprint = 196A 89ED 78F3 9047 2A36  F327 A278 DF5E E8BD A4E3

The new RSA key is:

pub   4096R/9EAAF9C5 2011-06-15
      Key fingerprint = E6BC C692 3553 A464 8514  28D1 EE67 E7D3 9EAA F9C5

To fetch my new key from a public key server, you can run:

  gpg --keyserver subkeys.pgp.net --recv-keys 9EAAF9C5

If you already know my old key, you can now verify that the new key is
signed by the old one:

  gpg --check-sigs 9EAAF9C5

If you don't already know my old key, or if you're extra-paranoid, you
can check the fingerprint against the one given above:

  gpg --fingerprint 9EAAF9C5

If you have previously signed my old DSA key, and if you're satisfied
that you've got the correct new RSA key, then I'd appreciate it if you
would sign my new key as well:

  caff 9EAAF9C5

The caff program is in the signing-party package in Debian and its
derivatives, including Ubuntu. Please be careful to generate signatures
that don't rely on the weakening SHA-1 hash algorithm, which requires
some careful configuration even if you've already configured gpg
correctly. See http://www.gag.com/bdale/blog/posts/Strong_Keys.html for
the gory details.

Thanks,
Soren Hansen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQIcBAEBCgAGBQJN+KNRAAoJEO5n59OeqvnFibwP/jLC+VXFBxMbxqxcXTolr19H
5octVasjlO7Ub9oqH/7/Js168g5gnzSefraWTd0kELc3fXVVsvZuNUtfBhrSYMX5
EQnRr+P1HGi5apFECc3hd7dNQIfbypkQ4UAK3T2nYgkYLATnt04CX0LUz/wnvHJD
fnNRHcM0A8pTn9oJB6LgXpJUspz3pqTFyEoQTkY0/QPcfLbeTLqYG+slSp8+I35H
I+PXl4XrbSsbcJTjpRRllodb1d5sFYZr827ZqPksEeiozGfwpXLZ/DtaIrtE3z3T
AVPCeG/9VCFtcvgqPQnhcbsS6RrGVkE5fUFxgZzERlAxAkkPi+WhwinASmkvOtmE
0m6fkhEeMCYvqvrDoeR8mZvgODIZjP7aIvNKDpWBA9mxC7k171LHdnsluB3xN0sH
++8/w5ESy7GpFxveLk6jR5ytfTxVLUgAASoqJbsxpMqSz/5KNompdFy/Hu13PVel
afqQNfMLjV0QXrKvtmmPSbUs6bWhxwE04jsYAUQcFNBFyHMmYQdA5peikC8ad+JL
WJocQmRUeE6EVRKKSaBXJIihcRHigeTf+6qqaSdbTpeZ1iPSMyETmOW8ZzBaQ7F0
VO65BOOzFsD7Cuaxba427CFPvYo5F4Bi3Dtuwz1PtZlNKExFRuNZ4vhs6dRFenJT
kkeIQrFpJ6wF/DVrutMeiEYEAREKAAYFAk34o1EACgkQonjfXui9pOOVyACfeLci
yfBLmY3L9Abcmg6ggCVQLBAAoKDQfzhmoK5mk26dQToReFpJ80bq
=0yN5
-----END PGP SIGNATURE-----