OpenStack is open for business

Moments ago Rackspace announced the OpenStack project. Not only is this awesome news in and of itself, it also means that I can finally blog about it :)

Rackspace’s IaaS offering consists of two parts: Cloud Servers and Cloud Files. Incidentally, OpenStack (so far, at least) has two main components: a “compute” component called “Nova” and a “storage” component called “Swift”. Swift is the software that runs Rackspace’s Cloud Files today. Nova was initially developed by NASA and is not currently in use at Rackspace, but will eventually replace the existing Cloud Servers platform.

Last week, we held a design summit in Austin, TX, USA, with a bunch of people from companies all around the world who showed up to see what we were up to and to help out by giving requirements, designing the architecture or writing patches. The amount of interest was astounding!

I’m sure others will be blogging at length about all that stuff, so I’d like to touch upon some of the ways in which Nova differs from the alternatives out there. I’ll leave it to someone else to talk about Swift.

  • Nova is written in Python and uses Twisted.
  • Nova is completely open source. There’s no secret sauce. We won’t ever limit functionality or performance so that we can sell you an enterprise edition. It’s all released under the Apache license, so it’s conceivable that some company might write proprietary, for-pay extensions, but they won’t be coming from us. Ever. This is true for Swift as well, by the way.
  • Nova currently uses Redis for its key-value store.
  • Nova can use either LDAP or its key-value store for its user database.
  • Nova currently uses AMQP for messaging, which is the only mechanism with which the different components of Nova communicate.
  • The physical hosts that will run the virtual machines all have a component of Nova running on them. It takes care of setting up disk space and other parts of the virtual machine preparation.
  • It supports the EC2 query API.
  • The Rackspace API is in the works. I expect this will be the basis for the “canonical” API of Nova in the future, but any number of APIs could be supported.
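
To illustrate the messaging point above, here is a minimal sketch of two components that interact only through a message queue. It is not Nova’s code: Python’s standard `queue.Queue` stands in for an AMQP broker, and the `Broker` class, the component functions, the topic name and the message fields are all made up for illustration.

```python
import queue


class Broker:
    """In-memory stand-in for an AMQP broker: one queue per topic.

    Assumption: a real deployment would talk to an AMQP server instead.
    """

    def __init__(self):
        self._topics = {}

    def publish(self, topic, message):
        self._topics.setdefault(topic, queue.Queue()).put(message)

    def consume(self, topic):
        """Return the next message on a topic, or None if it is empty."""
        try:
            return self._topics.setdefault(topic, queue.Queue()).get_nowait()
        except queue.Empty:
            return None


def api_request_instance(broker, instance_id):
    # Hypothetical API component: it never calls the compute code directly,
    # it only publishes a message.
    broker.publish("compute", {"action": "run_instance", "id": instance_id})


def compute_worker_step(broker):
    # Hypothetical compute component running on a physical host.
    message = broker.consume("compute")
    if message is None:
        return None
    return "started %s" % message["id"]


broker = Broker()
api_request_instance(broker, "i-0001")
print(compute_worker_step(broker))
```

Because neither function knows about the other, either side could in principle be moved to another machine or replaced without touching its counterpart, as long as the message format stays the same.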

I cannot explain how excited I am about this. Let me know what you think!

Hudson and VMBuilder

Unhappy with the current state of VMBuilder, I recently decided to take a look at Hudson, hoping it could help improve quality going forward. Hudson is a “continuous integration” tool. This means that it’s a tool you use to apply quality control continuously, rather than only when you’re feeling bored or when a release is imminent.

I’ve set up Hudson with a number of jobs:

  • One monitors the VMBuilder trunk bzr branch. Whenever something changes there, it downloads it, runs pylint on it, runs the unit tests (pylint and unit tests set up with help from a blog post by Joe Heck), and rolls a tarball. Finally it triggers the next job..
  • ..which builds an Ubuntu source package out of it, and triggers the next job..
  • ..which signs and uploads it to the VMBuilder PPA that I recently blogged about..
  • Last, but certainly not least, I’ve set up the very first completely automated, end to end VMBuilder test. It grabs the freshest tarball from Hudson, copies it to a reasonably beefy server, builds a VM, boots it up and, upon successful boot, reports back that it all worked, and Hudson is happy. It doesn’t exercise all the various plugins of VMBuilder (not even close), but it’s a start!
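
The chain of jobs above boils down to “run each stage in order, and only trigger the next one if the previous one succeeded”. Here is a rough sketch of that idea in plain Python. It is not how Hudson is configured; the `run_pipeline` function and the job names are invented for illustration, and the placeholder jobs simply report success.

```python
def run_pipeline(jobs):
    """Run jobs in order and stop at the first failure.

    Each job is a (name, callable) pair; the callable returns True on
    success. Returns the names of the jobs that completed successfully.
    """
    completed = []
    for name, job in jobs:
        if not job():
            print("job failed: %s" % name)
            break
        completed.append(name)
    return completed


# Placeholder jobs (assumptions): real ones would run pylint, build the
# source package, upload to the PPA and boot a test VM.
jobs = [
    ("lint-test-tarball", lambda: True),
    ("build-source-package", lambda: True),
    ("sign-and-upload", lambda: True),
    ("end-to-end-vm-test", lambda: True),
]
print(run_pipeline(jobs))
```

Stopping at the first failure is the point: a broken trunk never makes it as far as the PPA upload.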

VMBuilder in Lucid == lots of fail

Let it be no secret that I’m unhappy with the state of VMBuilder in Lucid (and in general for that matter). Way too many regressions crept in and I didn’t have time to fix them all. I still expect to do an SRU for all of this, but every time I try to attack the bureaucracy involved in this, I fail. I need to find a few consecutive hours to throw at this very soon.

Anyways, in an effort to make testing easier, I’ve set up a PPA for VMBuilder.

I’ve set up a cloud server that monitors the VMBuilder trunk bzr branch. If there’s been a new commit, it rolls a tarball, builds a source package out of it, and uploads it to that ppa. That way, adventurous users can grab packages from there and test things out before they go into an SRU. To do this, you simply run this command:

sudo add-apt-repository ppa:vmbuilder/daily

I’m also working on a setup that will automatically test these packages. The idea is to fire up another cloud server, make it install a fresh VMBuilder from that ppa, perform a bunch of tests and report back. To do this, I’m injecting an upstart job into the instance that

  1. adds the ppa,
  2. installs vmbuilder,
  3. builds a VM, which (using the firstboot option) will call back into the host when it has booted successfully,
  4. sets up a listener waiting for this callback,
  5. waits a set amount of time for this callback.

If I get a response in a timely manner, I assume all is well. If not, it’ll notify me somehow.
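
The “listen for a callback, but only for so long” part could look roughly like the sketch below. This is not the actual upstart job: the function names, the use of a bare TCP connection as the callback, and the short timeout are all assumptions made for illustration.

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Open a listening TCP socket; port 0 lets the OS pick a free port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))
    server.listen(1)
    return server


def wait_for_callback(server, timeout_seconds):
    """Block until a client connects to `server` or the timeout expires.

    Returns True if a callback arrived in time, False otherwise.
    """
    server.settimeout(timeout_seconds)
    try:
        conn, _addr = server.accept()
    except socket.timeout:
        return False
    conn.close()
    return True


# Usage: nothing connects here, so this takes the failure branch.
server = open_listener()
print("listening on port", server.getsockname()[1])
booted = wait_for_callback(server, timeout_seconds=0.5)
server.close()
print("VM called back" if booted else "no callback, time to notify someone")
```

In a real run the listener would bind to an address the guest can reach, and the timeout would be long enough to cover the build and the boot.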

The idea is to make it run a whole bunch of builds to attempt to exercise as much of the code base as possible.

I’ll try to make a habit of blogging about the progress on this as I know a lot of people are aggravated by the current state of affairs and this way, they can see that something is happening.

Cloud computing – Same old song?

I recently ended up in a conversation with a guy who turned out also to work in IT. When I mentioned I worked on cloud computing, he started talking about how it was just the same old song. Before I had a chance to reply, we were interrupted, but I haven’t really been able to push this aside, and I’d like to address this point of view, as it’s probably held by others as well.

He said that he found cloud computing to be “old wine in new bottles”. His arguments were almost exclusively about how outsourcing is a bad idea. The rest of the time he spent pointing out that for all the time he’d had an Amazon S3 account (I think he said 2-3 years) he hadn’t noticed a price reduction, even though the price of self-hosted storage is ever decreasing.

Cloud computing certainly shares some characteristics with outsourcing. In both cases you are running services on someone else’s hardware, in their infrastructure, leaving a big chunk of responsibility with the provider. It’s also true that you’re paying a premium for the hardware compared to what it would have cost if you had it in your own data center. The difference between CAPEX and OPEX seemed to be lost on him, along with the fact that you’re also freeing human resources to work on more interesting things, but none of this is really the point.

Apart from sharing the benefits (and drawbacks!) of outsourcing, cloud computing offers a new level and type of dynamism and availability. If you’re just going to take your Exchange server (his example) or whatnot and put it on a statically allocated cloud server, then yes, it’s the same old outsourcing song. If you, however, design your service so that it can scale horizontally, the dynamism of cloud computing will let you scale both up and down to address changes in demand. This way you save money when your service is idling, yet you can scale up quickly to respond to rising demand. More resources are (supposedly) always available and right at your fingertips. They’re a simple API call away. Leveraged properly, it’s very likely that you could not only save money running the same service in the cloud, but also be able to deal with fluctuations in service demand much better than you could in your own data center or in an old school outsourcing scenario.
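
As a back-of-the-envelope illustration of scaling in both directions, here is a tiny sketch that works out how many identical servers a horizontally scalable service needs for a given load. The function name, the per-server capacity and the load figures are all made up; a real setup would feed this from monitoring data and call the provider’s API to add or remove servers.

```python
import math


def servers_needed(requests_per_second, capacity_per_server, minimum=1):
    """Return how many identical servers cover the current load."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    return max(minimum, math.ceil(requests_per_second / capacity_per_server))


# Hypothetical load over a day: quiet, a spike, then quiet-ish again.
for load in (40, 900, 120):
    count = servers_needed(load, capacity_per_server=100)
    print(load, "req/s ->", count, "servers")
```

With these invented numbers the pool goes from 1 server to 9 and back down to 2, and you only pay for the 9 while the spike lasts.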

As for his other point, about prices never decreasing even though the cost of hosting these things yourself decreases over time.. That’s a good point. He thought that that was how these providers were really expecting to make money. I wouldn’t go that far at all, though. What makes cloud computing a viable business is by and large economies of scale. Hosting lots and lots and lots of virtual servers or petabyte upon petabyte of data is lots cheaper /per unit/ than hosting a few servers and a few terabytes of data, but I have to agree that it does seem that the price per GB of stored data should be decreasing over time in response to the decreasing cost of storage on the market.

“I got redirected here from linux2go.dk.. What gives?”

I got fed up with the old site. It was unfocused, unprofessional, not very pretty, out-of-date.. Frankly, I was feeling embarrassed about it.

I took it offline completely a couple of weeks ago, expecting to redo it altogether. While thinking about its future and trying to write a few things for the new web site, I found it more and more awkward to pretend that my company and I were separate entities. There’s only me in the company. It’s always been that way. I’ve had a few people I’ve known that I could rely on if I got too busy or somehow ended up with assignments with requirements I couldn’t meet, and at some point in the future there might be more people in the company, but for the time being, it’s just me. Realising this and not pretending or attempting to create the illusion that it’s something it’s not makes this whole thing more straightforward.

So, instead of spending a lot of time writing content for a new website, I’ll try to see if a simple blog will serve me well. Welcome.

Switching to Wordpress

For years now, my blog has been powered by my own blogging engine. I wrote my own because I didn’t want to run PHP on my web server, and it was a handy way to get familiar with Django. However, I now work for a company that, among many other things, offers web hosting, so it seems like a good idea to be dogfooding that, and having one less spare time project to work on is always a win. On top of that, Wordpress seems like a pretty awesome system with an extensive ecosystem of plugins, a stack of client applications, etc. This post, for instance, was written almost entirely on my phone in the wicked cool Wordpress application for Android.

Not an April fool’s joke

Today marks the beginning of my second month working for Rackspace.

I’ve realised I haven’t actually blogged about my leaving Canonical, so this post doubles as an announcement about that, I suppose.

A lot of thought was put into that decision. Ubuntu is an awesome project to work on and Canonical was a fun and interesting “place” to work, but “all good things must come to an end” so I decided to “quit while I was ahead”. Come up with more clichés if you feel like it. The short story is that I just wasn’t having much fun anymore.

Rackspace came along as an interesting option. I’ve known about them since forever, and they are doing very interesting stuff in the cloud computing area, so it seemed like a natural progression. I had a few interviews and after we overcame some initial difficulties (they’re not that used to having people from Denmark work for them) I started my new job working on Cloud Sites on March 1st.

This does not mean that I’m going to stop working on Ubuntu, though. It’ll just be on my own time and working on a narrower set of things than I have for a while. I also hope to be at UDS (I’ve applied for sponsorship) so that I can meet all my awesome, old colleagues.

Automated regression testing of server packages

Just a quick FYI: I’ve set up some magic to automatically rebuild a set of server packages every day to see if their regression test suites still pass. The current list of source packages:

  • libvirt
  • postgresql-8.3
  • postgresql-8.4
  • mysql-dfsg-5.0
  • mysql-dfsg-5.1
  • openldap
  • php5

If there are other server packages that run their test suites at build time, please let me know so that I can add them to this list.

The packages are uploaded to the ubuntu-server-autotest/regression-test PPA.

What Ubuntu Server *could* be

I’m glad Thierry started this discussion. About six months ago, when we were first beginning to talk about what to do in Jaunty, I sat down and wrote a bunch of notes that I meant to turn into a blog post. It never made it farther than an e-mail to a few people, but now that we’re sharing visions, I thought I’d post it.

Disclaimer: These are simply notes I wrote for myself. They’re not the outcome of a discussion, and they’re not a blessed strategy.. They’re just my notes.

What is our profile? What sets us apart from the others?

If I’m brutally honest, I must admit that when I explain Ubuntu server
to people, it very often ends up something like: “Debian with a sane,
predictable release schedule. We take a snapshot of Debian at some
point, and apply some polish and tender loving, and we ship it.” (Note:
I wrote these notes 6 months ago, and this part is not quite true anymore,
but let’s just forget that for a little bit.)

Sure, we also add a few gadgets, gizmos, and widgets, but the type of
user who gets won over by that sort of thing alone is probably not the
kind of user we’re really interested in (in part because they’re
transient… If another distro comes up with another gizmo they suddenly
can’t live without, they’ll be out of here in no time).

We need some kind of profile. We need to do something differently from
others. Offer a different concept. Right now, we’re trying to beat the
others at their own game. I’m not saying it can’t be done, but it’s a
veritable David vs. Goliath.

Debian provides us with a technically strong, dependable base, but
Debian is a solution to a problem we’re not trying to solve.

Ubuntu on the desktop took off with a bang with the Warty Warthog
release.. It was an almost instant success. Why? Because it solved the
problems everyone was facing:

  • Easy to install
    • The install process was boiled down to as few questions as we
      could possibly get away with, in part by leaving out a lot of
      advanced options.
  • Lots of common hardware supported
    • Even restricted drivers. The idea was that a software stack
      consisting of all free software with a single binary blob to
      enable a wifi card or a graphics adapter is better than a software
      stack of all non-free software. For most users, these were (and
      are still) the only two viable choices.
  • A wide selection of software was pre-installed and ready to go.
    • All you needed to do was look around in the menus and you found
      the software you needed to get most of your work done. No need to
      look on the internet for “what software do you use instead of
      Word/Internet Explorer/MSN Messenger/Outlook on Linux?”

Essentially, it was all about “making the best of free software
available”.

Now, is “making it available” still a problem on servers? Yes! Sure,
there’s lots of stuff we can’t do with an Ubuntu Server, but what if we
focus on what you *can* do, and make that very, very available in a way
that’s true to our UNIX heritage?

What would that require?

  • Easy to install
    • What are the common stumbling points for the installation process?
      • Example: Partitioning is difficult. You usually only get the one
        chance to get it right, and if it’s your first linux system, you
        won’t have a clue.
  • How can we fix them?
    • Example: Do their partitioning for them?
      • In ways that don’t limit our choices later on?
        • Example: Always make the disk a raid member where the raid set only
          has that one member. That way, it’s easier to add another
          member later.
        • Example: Always do LVM. Provide tools to easily move parts of the
          filesystem to a newly created logical volume (creating the lv
          and mkfs it in the process).
  • Lots of common hardware supported.
    • What server class hardware out there is unsupported?
    • Do we need to create a restricted driver set for servers?
  • A wide selection of software pre-installed and ready to go.
    • Perhaps not actually pre-installing them, but making sure that
      people are using “the right selection” of software some other way,
      perhaps by means of:

      • Better documentation
        • I’ve never read a book about Linux system administration and not
          thought that they were doing it all wrong. This is symptomatic:
          IMO, we’re quite good at pointing out when people are doing things
          wrong, but we fail to go out and define the One True Way[tm] to do
          things. Personally, I’m afraid I’ll make a mistake and people will
          wind up at a dead end because there’s something I’ve
          overlooked.
  • Much better integration
    • Again, this stems from our failure to go out and define the One
      True Way[tm] of setting up our services and integrate them. This
      is something we inherited from Debian. I believe it needs to stop
      right now. Take dovecot, for instance.. I don’t expect any half
      serious deployment of dovecot to use the userdb and passdb
      backends that it’s configured to use by default, yet we leave the
      defaults that way. Why? Because Debian does it. Why do they do it?
      Because they’re trying to solve a different problem than we are.
      They want to provide an unbiased platform that does everything for
      the relatively few people who know how to drive it. This is noble
      enough, but for dovecot, it means that it’s only as enterprise
      ready as the sysadmin can manage to set it up. We need to define what an
      Ubuntu based enterprise environment looks like and offer that in a
      packaged form for easy deployment. The benefits are numerous:

      • Knowing that a company uses an Ubuntu based network
        infrastructure currently tells you nothing. Defining these best
        practices will provide a baseline that’s recognisable by Ubuntu
        admins everywhere.
      • Hiring is easier (for companies looking for Ubuntu sysadmins).
        If an admin has Ubuntu experience there’s now a chance that
        he’ll actually be able to apply his knowledge directly.
      • Support is much easier when you can actually make assumptions
        about what people are using as their directory server, and how
        everything speaks together, because *we* defined it.
      • It paves the way for an Ubuntu System Administration
        certification.
    • Etc.

What we should offer is:

  • Enterprise readiness out of the box.
    • Well defined interfaces (contracts, if you will) between components.
      • Example: If we were to decide that Ubuntu Server uses an ldap
        backend for storing mail aliases, we’d clearly document the exact
        query that would be run to fetch that info. If a user for
        whatever reason needed to extend the ldap schema, he’s allowed to do
        so and can expect everything to keep working as long as that query
        gives the same result. Likewise, the LDAP DIT will also be
        clearly documented, so that the user is allowed to add custom
        frontends.
      • These contracts follow our freeze process. I think beta freeze
        would be an appropriate time to lock these down.
    • Simple tools (akin to the ones we already have) to manage these
      things. Home users or small businesses shouldn’t suffer because we
      decided to change the way things work.

      • E.g. if we decide to install an LDAP server and use that from nss
        and pam instead of passwd/shadow on each and every Ubuntu Server
        installation, adduser and such should keep working as they always
        have.
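
To make the “contract” idea a bit more tangible, here is a sketch of what a documented mail alias lookup could look like if it were written down as data. Nothing here is an actual Ubuntu decision: the base DN, the choice of the `nisMailAlias` object class and `rfc822MailMember` attribute, and the function names are all assumptions picked for illustration. Only the escaping rules come from the LDAP filter syntax (RFC 4515).

```python
def escape_filter_value(value):
    """Escape a value for use inside an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


# Hypothetical contract: this is the exact query that would be documented
# and frozen. Base DN, object class and attribute are illustrative only.
MAIL_ALIAS_CONTRACT = {
    "base_dn": "ou=aliases,dc=example,dc=com",
    "filter": "(&(objectClass=nisMailAlias)(cn={alias}))",
    "attributes": ["rfc822MailMember"],
}


def alias_query(alias):
    """Return (base_dn, filter, attributes) for looking up one alias."""
    contract = MAIL_ALIAS_CONTRACT
    search_filter = contract["filter"].format(alias=escape_filter_value(alias))
    return contract["base_dn"], search_filter, list(contract["attributes"])


print(alias_query("postmaster"))
```

As long as that query keeps returning the same result, whatever else gets added to the directory (extra schema, custom frontends) shouldn’t break the mail setup.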

Sorry if it’s a bit of a mess, but as you know, perfect is the enemy of good enough.

gtk-vnc and virt-viewer mozilla plugins

Another cool thing that’s new in Jaunty that I’ve never gotten around to blogging about is the fact that the virt-viewer and gtk-vnc packages in Ubuntu now provide mozilla-virt-viewer and mozilla-gtk-vnc, respectively.

This means you can now put something like

  <embed type="application/x-gtk-vnc"
    width="800"
    height="600"
    host="127.0.0.1" port="5900">
  </embed>

or this:

  <embed type="application/x-virt-viewer"
    width="800"
    height="600"
    uri="qemu:///system" name="something">
  </embed>

in a web page and have access to virtual machines or other VNC servers directly in your browser.

I have a feeling this will spark some rather interesting web based management tools once it becomes more ubiquitous.
