Tollef Fog Heen's blog

tfheen Fri, 21 Oct 2011 - Today's rant about RPM

Before I start, I'll admit that I'm not a real RPM packager. Maybe I'm approaching this from completely the wrong direction; what do I know?

I'm in the process of packaging Varnish 3.0.2 which includes mangling the spec file. The top of the spec file reads:

%define v_rc
%define vd_rc %{?v_rc:-%{?v_rc}}

Apparently, this is not legal, since we're trying to define v_rc as a macro with no body. However, it's not possible to directly define it as an empty string which can later be tested for; you have to do something like:

%define v_rc %{nil}
%define vd_rc %{?v_rc:-%{?v_rc}}

Now, this doesn't work correctly either. %{?macro} tests whether macro is defined, not whether it's an empty string, so instead of two lines, we have to write:

%define v_rc %{nil}
%if 0%{?v_rc} != 0
%define vd_rc %{?v_rc:-%{?v_rc}}
%endif

The leading 0 in 0%{?v_rc} != 0 is there so that an empty expansion doesn't accidentally leave us with a bare != 0, which would be a syntax error.

I think having four lines like that is pretty ugly, so I looked for a workaround and figured that, ok, I'll just rewrite every use of %{vd_rc} to %{?v_rc:-%{?v_rc}}. There are only a couple, so the damage is limited. Also, I'd then just comment out the v_rc definition, since that makes it clear what you should uncomment to have a release candidate version.

In my naivety, I tried:

# %define v_rc ""

# is used as a comment character in spec files, but apparently not for defines. The define was still processed and the build process stopped pretty quickly.

Luckily, doing # % define "" seems to work fine and is not processed. I have no idea how people put up with this or if I'm doing something very wrong. Feel free to point me at a better way of doing this, of course.

[16:04] | tech | Today's rant about RPM

tfheen Wed, 05 Oct 2011 - The SugarCRM rest interface

We use SugarCRM at work and I've complained about its not-very-RESTy REST interface. John Mertic, a (the?) SugarCRM Community Manager, asked me what problems I'd had (apart from its lack of RESTfulness) and I said I'd write a blog post about it.

In our case, the REST interface is used to integrate Sugar and RT so we get a link in both interfaces to jump from opportunities to the corresponding RT ticket (and back again). This should be a fairly trivial exercise or so you would think.

The problems, as I see them, are:

My first gripe is the complete lack of REST in the URLs. Everything is just sent to https://sugar/service/v2/rest.php. Usually a POST, but sometimes a GET. It's not documented what to use where.

The POST parameters we send when logging in are:

method=>"login"
input_type=>"JSON"
response_type=>"JSON"
rest_data=>json($params)

$params is a hash as follows:

user_auth => {
            user_name => $USERNAME,
            password => $PW,
            version => "1.2",
},
application => "foo",

Nothing seems to actually care about the value of application, nor about the user_auth.version value. The password is the md5 of the actual password, hex encoded. I'm not sure why it is, as this adds absolutely no security, but it is. This is also not properly documented.
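
To make the login step concrete, here is a minimal sketch in Python of how the form fields described above could be put together. The function name build_login_form and the example credentials are made up for illustration; only the field names, the md5 hex encoding and the endpoint URL come from the description above.

```python
import hashlib
import json
from urllib.parse import urlencode


def build_login_form(username, password, application="foo"):
    """Return the form fields for the login call (hypothetical helper)."""
    # The interface wants the hex-encoded md5 digest, not the plain password.
    password_md5 = hashlib.md5(password.encode("utf-8")).hexdigest()
    params = {
        "user_auth": {
            "user_name": username,
            "password": password_md5,
            "version": "1.2",  # nothing seems to care about this value
        },
        "application": application,  # nor about this one
    }
    return {
        "method": "login",
        "input_type": "JSON",
        "response_type": "JSON",
        "rest_data": json.dumps(params),
    }


# Example credentials are placeholders.
form = build_login_form("alice", "secret")
# URL-encoded body, ready to POST to https://sugar/service/v2/rest.php
body = urlencode(form)
```

The sketch only builds the request body; sending it is left to whatever HTTP client you already use.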

This gives us a JSON object back with a somewhat haphazard selection of attributes (reformatted here for readability):

{
     "id":"<hex session id>",
     "module_name":"Users",
     "name_value_list": {
             "user_id": {
                     "name":"user_id",
                     "value":"1"
             },
             "user_name": {
                     "name":"user_name",
                     "value":"<username>"
             },
             "user_language": {
                     "name":"user_language",
                     "value":"en_us"
             },
             "user_currency_id": {
                     "name":"user_currency_id",
                     "value":"-99"
             },
             "user_currency_name": {
                     "name":"user_currency_name",
                     "value":"Euro"
             }
     }
}

What is the module_name? No real idea. In general, when you get back an id and a module_name field, it tells you that the id is an object that exists in the context of the given module. Not here, since the session id is not a user.

The worst part here is the name_value_list concept, which is used all over the REST interface. First, it's not a list, it's a hash. Secondly, I have no idea what would be wrong with just using keys directly in the top level object, so the object would have looked somewhat like:

{
     "id":"<hex session id>",
     "user_id": 1,
     "user_name": "<username>",
     "user_language":"en_us",
     "user_currency_id": "-99",
     "user_currency_name": "Euro"
}

Some people might argue that since you can have custom field names this can cause clashes. Except, it can't, since they're all suffixed with _c.
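
On the client side, the flattening is only a few lines. This is a rough sketch in Python, not anything Sugar ships; flatten_name_value_list is a name I made up, and it assumes every entry carries a "value" key as in the response above. It keeps values as the strings the server sent, so user_id stays "1" rather than becoming a number.

```python
def flatten_name_value_list(obj):
    """Lift the entries of name_value_list into the top-level object."""
    # Keep every top-level key except the wrapper itself.
    flat = {key: value for key, value in obj.items() if key != "name_value_list"}
    # Each entry looks like {"name": ..., "value": ...}; keep only the value.
    for key, entry in obj.get("name_value_list", {}).items():
        flat[key] = entry["value"]
    return flat


response = {
    "id": "<hex session id>",
    "module_name": "Users",
    "name_value_list": {
        "user_id": {"name": "user_id", "value": "1"},
        "user_language": {"name": "user_language", "value": "en_us"},
    },
}
flat = flatten_name_value_list(response)
```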

So we're now logged in and can fetch all opportunities. This we do by posting:

method=>"get_entry_list",
input_type=>"JSON",
response_type=>"JSON",
rest_data=>to_json([
            $sid,
            $module,
            $where,
            "",
            $next,
            $fields,
            $links,
            1000
])

Why is this a list rather than a hash? Again, I don't know. A hash would make more sense to me.
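
One way to live with the positional list is to wrap it so the calling code can use names anyway. A small sketch in Python follows, with the slot order copied from the call above. The labels (including "unused" for the empty fourth slot) and the default values are my own choices for illustration, not names from the Sugar documentation.

```python
import json

# Slot order as in the call above; the labels are hypothetical.
ORDER = ("sid", "module", "where", "unused", "next", "fields", "links", "limit")
# Assumed defaults for the optional slots; sid and module must be given.
DEFAULTS = {"where": "", "unused": "", "next": 0, "fields": [], "links": [], "limit": 1000}


def get_entry_list_rest_data(**named):
    """Turn named arguments into the positional JSON list for rest_data."""
    unknown = set(named) - set(ORDER)
    if unknown:
        raise TypeError("unknown arguments: %s" % sorted(unknown))
    merged = dict(DEFAULTS, **named)
    # Raises KeyError if sid or module is missing, since they have no default.
    return json.dumps([merged[name] for name in ORDER])


rest_data = get_entry_list_rest_data(sid="abc", module="Opportunities", fields=["id"])
```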

The resulting JSON looks like:

{
    "result_count" : 16,
    "relationship_list" : [],
    "entry_list" : [
       {
          "name_value_list" : {
             "rt_status_c" : {
                "value" : "resolved",
                "name" : "rt_status_c"
             },
             […]
          },
          "module_name" : "Opportunities",
          "id" : "<entry_uuid>"
       },
       […]
    ],
    "next_offset" : 16
}

Now, entry_list actually is a list here, which is good and all, but there's still the annoying name_value_list concept.

Lastly, we want to update the record in Sugar. To do this, we post:

method=>"set_entry",
input_type=>"JSON",
response_type=>"JSON",
rest_data=>to_json([
    $sid,
    "Opportunities",
    $fields
])

$fields is not a name_value_list, but instead is:

{
    "rt_status_c" : "resolved",
    "id" : "<status text>"
}

Why does this work when my attempts at using a proper name_value_list didn't? I have no idea.

I think that pretty much sums it up. I'm sure there are other problems in there (such as the over 100 lines of support code for the about 20 lines of actual code that does useful work), though.

[09:40] | tech | The SugarCRM rest interface

tfheen Wed, 31 Aug 2011 - Bizarre slapd (and gnutls) failures

Just this morning, I was setting up TLS on a LDAP host, but slapd refused to start afterwards with a bizarre error message:

TLS init def ctx failed: -207

The key and certificate were freshly generated using openssl on my laptop (running wheezy, so OpenSSL 1.0.0d-3). After a bit of googling, I discovered that -207 is gnutls-esque for "Base64 error". Of course, the key looks just fine and decodes fine using base64, openssl base64 and even gnutls's own certtool.

Now, certtool also spits out what it considers the right base64 version of the key and I noticed it differed. Using the one certtool output seems to work, though, so if you ever run into this problem try running the key through certtool --infile foo.pem -k and use the base64 representation it outputs.

[10:30] | tech | Bizarre slapd (and gnutls) failures

tfheen Wed, 03 Aug 2011 - libvmod_curl – using cURL from inside Varnish Cache

It's sometimes necessary to be able to access HTTP resources from inside VCL. Some use cases include authentication or authorization where a service validates a token and then tells Varnish whether to proceed or not.

To do this, we recently implemented libvmod_curl, which is a set of cURL bindings for VCL so you can fetch remote resources easily. HTTP would be the usual method, but cURL also supports other protocols such as LDAP or POP3.

The API is very simple. To use it, you would do something like:

import curl;

sub vcl_recv {
    curl.fetch("http://authserver/validate?key=" + regsub(req.url, ".*key=([a-z0-9]+)", "\1"));
    if (curl.status() != 200) {
        error 403 "Go away";
    }
}

Other methods you can use are curl.header(headername) to get the contents of a given header and curl.body() to get the body of the response. See the README file in the source for more information.

[11:44] | tech | libvmod_curl – using cURL from inside Varnish Cache

tfheen Sat, 23 Jul 2011 - Keep calm and carry on.

We will not be consumed by hate. We will not restrict fundamental freedoms, nor become a surveillance state.

We will keep calm and carry on. We will grieve for those lost and hurt in this terrible tragedy.

[21:16] | life | Keep calm and carry on.

tfheen Sat, 21 May 2011 - Upgrading Alioth

A while ago, we got another machine for hosting Alioth and so we started thinking about how to use that machine. It's a used machine and not massively faster than the current hardware, so just moving everything over wouldn't actually get us that much of a performance upgrade.

However, Alioth is using FusionForge, which is supposed to be able to run on a cluster of machines. After all, this was originally built for SourceForge.net, which certainly does not run on a single host. So, a split of services is what we'll do.

This weekend, we're having a sprint in Collabora's office in Cambridge, actually implementing the split and doing a bit of general planning for the future.

Yesterday afternoon (Friday), European time, we started the migration. The first step is to move all the data off the Xen guest on wagner, where Alioth is currently hosted. This finished a few minutes ago; it turns out syncing about 8.5 million files across almost 400G of data takes a little while.

The new host is called vasks and will host the database, run the main apache and be the canonical location for the various SCM repositories.

We are not decommissioning wagner, but it'll be reinstalled without Xen or other virtualisation, which should help performance a bit. It'll host everything that has lower performance requirements, such as cron jobs, mailing lists and so on.

I'll try to keep you all updated and feel free to drop by #alioth on irc.debian.org if you have any questions.

[11:14] | Debian | Upgrading Alioth

tfheen Tue, 30 Nov 2010 - My Varnish is leaking memory

Every so often, we get bug reports about Varnish leaking memory. People have told Varnish to use 20 gigabytes for cache and they discover the process is eating 30 gigabytes of memory and they get confused about what's going on. So, let's take a look.

First, a little bit of history. Varnish 2.0 had a fixed per-object workspace which was used for both header manipulations in vcl_fetch as well as for storing the headers of the object when vcl_fetch was done. The default size of this workspace was 8k. If we assume an average object size of 20k, that is almost 1/3 of the store being overhead.
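
As a quick check of that figure, here is the arithmetic in Python. It assumes each stored object takes its own size plus the fixed 8k workspace, which is my reading of the numbers above rather than something measured.

```python
workspace_kib = 8     # default per-object workspace in Varnish 2.0
avg_object_kib = 20   # assumed average object size

# Share of each stored object's footprint that is workspace overhead.
overhead_fraction = workspace_kib / (workspace_kib + avg_object_kib)
print(f"{overhead_fraction:.1%}")  # 28.6%
```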

With 2.1, this changed. First, vcl_fetch doesn't have obj any longer, it only has beresp which is the backend response. At the end of vcl_fetch, the headers and other relevant bits of the backend response are copied into an object. This means we no longer have a fixed overhead, we use what we need. Of course, we're still subject to malloc's whims when it comes to page sizes and how it actually allocates memory.

Less overhead means more objects in the store. More objects in the store means, everything else being equal, more overhead outside the store (for the hash buckets or critbit tree and other structs). This is where lots of people get confused, since what they see is just Varnish consuming more memory. When moving from 2.0 to 2.1, people should lower their cache size. How much depends on the number of objects they have, but if they have many small objects, a significant reduction might be needed. For a machine dedicated to Varnish, we usually recommend making the cache size 70-75% of the memory of the machine.

A reasonable question to ask at this point is what all this overhead is being used for. Part of it is a per-thread overhead. Linux has a 10MB stack size by default, but luckily, most of it isn't allocated, so it only counts against virtual, not resident memory. In addition, we have a hash algorithm which has overhead and the headers from the objects are stored in the object itself and not in the stevedore (object store). Last, but by no means least, we usually see an overhead of around 1k per object, but I have seen up to somewhere above 2k. This doesn't sound like much, but when you're looking at servers with 10 million objects, 1k of overhead means 10 gigabytes of total overhead, leading to the confusion I talked about at the start.
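
The same back-of-the-envelope calculation in Python, as a sketch only. It treats 1k as 1000 bytes and a gigabyte as 10^9 bytes to match the round numbers above, and the helper name total_overhead_gb is made up.

```python
def total_overhead_gb(objects, per_object_bytes):
    """Overhead outside the store, in gigabytes (10**9 bytes)."""
    return objects * per_object_bytes / 1e9


typical = total_overhead_gb(10_000_000, 1000)  # around 1k per object
worst = total_overhead_gb(10_000_000, 2000)    # the 2k per object I have seen

# A 20 gigabyte cache with the typical overhead on top.
expected_process_gb = 20 + typical
print(typical, worst, expected_process_gb)  # 10.0 20.0 30.0
```

So with that many objects, a process eating 30 gigabytes for a 20 gigabyte cache is about what you'd expect.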

[14:03] | varnish | My Varnish is leaking memory

tfheen Tue, 02 Nov 2010 - Temperature logging with 1-wire

Last night, I finally got my temperature sensors going, including a nice and shiny munin plugin giving me pretty graphs. So far, I only have a sensor in the loft, but I'll spend some days putting sensors in the rest of the house as well.

Robert McQueen asked me on Twitter how this all was set up, so I figured I'd blog about it. The sensors I'm using are the DS18B20 ones from Dallas Semiconductor. You can probably buy them from your local electronics supplier, but mine charges around 75 NOK apiece, so I just bought some off eBay. It takes a bit longer, but I paid about 1/10th the price.

For logging, I'm using my NAS, which is just a machine running Debian, a USB to serial adapter and a serial-to-1-wire adapter. Thanks a lot to Martin Bergek for the writeup and the ELFA part numbers for diodes.

Since I'm lazy, I ended up just writing a plugin for munin. It uses owfs, which I downloaded from mentors.debian.net. I also offered sponsorship for it, assuming a few small issues are cleaned up, so hopefully you can install using just Debian in the near future.

owfs is fairly easy to work with, and the plugin uses the aliased names if you provide aliases, so you can know what the temperature in a given location is, rather than having to remember 64-bit serial numbers.
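
To give an idea of what reading a sensor through owfs looks like, here is a rough sketch in Python. It is not the munin plugin itself. It assumes owfs is mounted as a filesystem where each DS18B20 appears as a directory named after its serial (family code 28) containing a temperature file. The mount point, the serial number and the alias table passed in by hand are all placeholders for illustration.

```python
from pathlib import Path


def read_temperatures(owfs_root, aliases):
    """Return {label: degrees} for every DS18B20 found under owfs_root."""
    root = Path(owfs_root)
    readings = {}
    if not root.is_dir():
        return readings
    # DS18B20 sensors use 1-wire family code 28.
    for sensor_dir in sorted(root.glob("28.*")):
        temp_file = sensor_dir / "temperature"
        if not temp_file.is_file():
            continue
        # Fall back to the raw serial if no alias is known.
        label = aliases.get(sensor_dir.name, sensor_dir.name)
        readings[label] = float(temp_file.read_text().strip())
    return readings


# Hypothetical mount point and serial number.
temperatures = read_temperatures("/mnt/1wire", {"28.0123456789AB": "loft"})
```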

[07:22] | tech | Temperature logging with 1-wire

tfheen Tue, 14 Sep 2010 - First impressions of the Kenwood AT641

I recently got my hands on a Kenwood AT641, a fruit juicer attachment for the Chef/Major series of kitchen machines, and now I've had the pleasure of actually using it.

The AT641 is a high-speed, rotational juicer which works by the principle of making a puree of the apples (or whatever else you're juicing), using a spinning plate with sharp studs on it, and then accelerating the puree against a cone-shaped piece of metal with small slots in it, working somewhat like a sieve. The juice drips down and is collected into a jug, while the meaty bits of the apple are sent up and out into a small container for the bits that are thrown away.

It works reasonably well. The apple chute is quite large, so only large apples need to be cut in two, and none of the apples in the bucket I was testing with needed to be cut in more than two. The apple juice I got out was nice and smooth, yet had some apple bits in it. It's not clear, but that's the way I prefer it; you can filter it later if you prefer clear juices. The build quality seems quite good, with sturdy metal parts and thick plastics. It's easy to dismantle once you've done it once, as there's a trick to removing some of the parts.

On the downside, I had problems with it not managing to throw all the residual bits into the garbage container. They stuck to the plastic above the metal cone and ended up clogging. This might be due to using the wrong kind of apples or something odd like that, but it was nevertheless a bit disappointing. I hope it will work better on my next batch. Cleaning the juicer requires dismantling it completely (which is done without any tools), and is fairly easy, except for some nooks that are hard to clean properly, especially given you can't inspect them visually.

The whole process was fairly painless. Including gathering an overfull bucket of apples, I spent an hour and a half making almost four litres of delicious apple juice.

All in all, I'm reasonably happy with the buy and hope my clogging problems are just a fluke.

[22:28] | life | First impressions of the Kenwood AT641

tfheen Mon, 05 Apr 2010 - The Bridge of Allan, Ben Nevis

Ben Nevis is the tallest mountain in the British Isles and also the name of one of the beers that the Bridge of Allan brewery makes. We visited the brewery around the start of the year. One of the beers we brought back was a Ben Nevis, described as a ruby red IPA. I'd classify it as more of a dark amber or brown ale. The taste is quite hoppy without being too bitter and with a fair amount of malt. A bit fizzy for a British beer, but I quite like it that way. All in all, a good and drinkable brown ale.

[19:45] | beer | The Bridge of Allan, Ben Nevis

Tollef Fog Heen <tfheen@err.no>