Problems worthy of attack prove their worth by hitting back. —Piet Hein

Sunday, 13 October 2013

Five years at Cloudera

Five years ago today was my first day at Cloudera. The team I joined consisted of the four founders—Mike Olson, Amr Awadallah, Jeff Hammerbacher, Christophe Bisciglia—as well as Aaron Kimball who had joined a week or so before, Alex Loddengaard who was working as an intern, and Matei Zaharia who joined on the same day as me as a part-time consultant.

Before I joined I had been working as an independent Apache Hadoop consultant for a year (probably the first Hadoop consultant anywhere), and was halfway through writing a book on Hadoop. The interview process had involved speaking to all four founders, and I remember when I came off the phone after the last call it was late in the UK but I couldn't sleep because the vision they had described was exactly what I wanted to see for Hadoop: a company that wanted to make Hadoop accessible to everyone, by making it easier to use and run, while maintaining a strong commitment to open source. The last point sealed the deal for me, and really at that point there was no way I could not join, and five years on I can say without exaggeration that it was the best decision of my professional life.

When I started I was living in Wales, which meant that on my first day I didn't see any of my new colleagues! That was remedied a few weeks later when I visited California (and ApacheCon in New Orleans) in early November 2008. Initially the others were working out of a single room in AdMob's offices in San Mateo, but it wasn't long before we moved to a smart brick-lined office in Burlingame. I was around for moving-in day, which involved more flatpack assembly skills than programming.

From the very beginning we worked on making Hadoop easier to use, run, and support, and better integrated with other systems, so that it could enjoy broader adoption. That was borne out in the early projects at Cloudera, which included creating training material, creating packages for Red Hat and Debian (CDH, and later Bigtop), writing tools for data ingest (Flume and Sqoop), creating a rich web UI for Hadoop users (Hue), as well as making contributions to the core project. I was mainly involved in the last of these, which I did at the same time as completing the book in time for the Hadoop Summit 2009, something that would never have been possible without the time and space my teammates gave me.

Over the first year I would visit every three months or so, and naturally each time the team would have grown. I always enjoyed meeting the new people who had joined since my last visit, but I realized that at such a formative time in a company's life, when the culture was being laid down, being closer to the team would make it easier for me to stay involved. The opportunity to move to California came up, and on the last day of October 2009 I arrived in San Francisco with my wife, Eliane, and two girls.

As anyone who has moved to a new country knows, there are a lot of things to sort out (somewhere to live, a school for the girls, reams of paperwork), and during this time the folks at Cloudera were incredibly helpful and supportive. When we moved into our new apartment (which Eliane had found a mere two weeks after we arrived) half of the engineering team turned up to help with Ikea flatpack assembly.

At the end of our three-year sojourn in the US, we left having made many friends, sad to leave, but happy knowing we'd be living closer to our family again. Cloudera was an order of magnitude larger than when I had arrived, and was now an international company with offices in several countries across the world.

Over the last five years I've been lucky enough to have been given the freedom to work on many parts of the Hadoop stack, in different parts of the Hadoop community, and with different teams at Cloudera. In the course of doing so, I've worked with the most talented and intelligent group of people in my life. It's hard work, and challenging, but also a lot of fun and incredibly enriching. I have every reason to expect it to continue. Thanks Cloudera!

Update on October 14: reworded to state that ApacheCon 2008 was held in New Orleans, not California. Thanks to Isabel Drost-Fromm for pointing out the error.

Saturday, 30 March 2013

Making a Kitchen Table


A couple of weeks ago I made a new kitchen table.



It was much easier than it looks as all I had to do was attach some hairpin legs to a worktop. If you haven't seen hairpin legs before, here's a closeup:



Eliane got the idea for the design after seeing something similar on the web, and she ordered the worktop from Worktop Express, and the hairpin legs from the Iron Mill.

I worked out what size screws to use (#12) and the pilot drill size using this handy chart. I also found a tip somewhere that said putting a little wax on the screw makes it easier to drive in with hardwoods (our worktop is oak).

The table is pretty sturdy, and hasn't collapsed! It was quicker to put together than some Ikea furniture, and it's very satisfying having an everyday piece of furniture that we designed and built ourselves.

Sunday, 3 February 2013

Have you put the chickens to bed?

"Have you put the chickens to bed?" -- it's a question we ask each other frequently in our house, since we are the proud owners of seven beautiful hens. Normally Eliane has, but when Lottie, our younger daughter, asked long after it had got dark one evening last week it turned out that none of us had, despite having IFTTT alerts set up to remind us.

The problem with the alert is that it is set to go off at sunset, which is all that IFTTT allows, and that's a bit too early as it's not dark enough for the chickens to be in their house. So we wait a bit, then we forget.

So I decided to write an Android app to send an alert a fixed amount of time (say 45 minutes) after sunset, so that when we received it, it would be dark, the chickens would be in their house, and we could close the door there and then.
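The timing logic described above can be sketched in plain Java. This is only an illustration under assumptions, not the app's code: it uses `java.time` rather than any Android API, the class and method names are hypothetical, and the sunset time in `main` is a made-up value.

```java
import java.time.Duration;
import java.time.ZoneId;
import java.time.ZonedDateTime;

/** Sketch: work out when to fire a reminder a fixed delay after sunset. */
final class SunsetReminderSketch {

    /** Delay after sunset; 45 minutes is the example value mentioned in the post. */
    static final Duration DELAY_AFTER_SUNSET = Duration.ofMinutes(45);

    /** Returns the time at which the reminder should fire. */
    static ZonedDateTime reminderTime(ZonedDateTime sunset, Duration delay) {
        return sunset.plus(delay);
    }

    /** Milliseconds from now until the reminder, or 0 if it is already due. */
    static long millisUntil(ZonedDateTime now, ZonedDateTime reminder) {
        return Math.max(0L, Duration.between(now, reminder).toMillis());
    }

    public static void main(String[] args) {
        ZoneId london = ZoneId.of("Europe/London");
        // Hypothetical sunset time, for illustration only.
        ZonedDateTime sunset = ZonedDateTime.of(2013, 2, 3, 16, 55, 0, 0, london);
        ZonedDateTime reminder = reminderTime(sunset, DELAY_AFTER_SUNSET);
        System.out.println("Remind at " + reminder.toLocalTime());
    }
}
```

On a phone, the value from `millisUntil` is the sort of thing that would be handed to whatever scheduling mechanism the platform provides.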

This is the result:

Eliane is currently beta testing it, so we'll see how well it works. (Obviously the long term goal is an automatic sensor to open and close the chicken house door, but we're not there yet.)

Writing Android Apps

This is the first Android app I've written, and overall I found the process very straightforward. A couple of years ago I ran a "Hello World" Android tutorial, and I seem to remember most of the time taken to get the app running was installing the Eclipse plugin. This time the Android Developer Tools (ADT) include a customized version of Eclipse, making the getting started process much smoother. 

The Android API is huge and fairly intimidating. It is, however, incredibly well documented, and the user guides are invaluable. The hardest part of writing the app was figuring out which parts of the API to use - do I need a BroadcastReceiver or a Service? How do AlarmManager and Notification interact? - that kind of thing. There's a lot of material online covering how to do various things in Android, and it offered general pointers, but not necessarily useful code, since the API evolves rapidly from release to release. And although the older code is generally supported, since compatibility is taken very seriously, there may be a better way of doing things in later versions.

The ADT tooling is good and encourages you to do the right thing - for example, extracting natural language strings from your app so it's easy to change them (or translate them) later. In this case, a class called R is generated which has references to all the assets that you need in your app: icons, sound files, strings, etc. For example, the audio file which plays when the notification is received is referred to with:

R.raw.cluck

To generate the icons I drew a chicken on a piece of paper with a sharpie, then took a photo of it and used an online image editor to make the background transparent. The Android Asset Studio completed the job of converting the image to a set of icons. (I didn't use Inkscape in the end, but this blog entry shows how to convert from an Inkscape drawing.) 

What's Next?

The biggest limitation in the app at the moment is that the calculation for sunset time is hardcoded for the UK. Using the Location API is the obvious next step there.

There are also some complications to do with making sure that notifications will still be sent even if the phone is rebooted. I want to make sure that works properly before putting the app on Google Play.

The UI is pretty rudimentary too and could do with some work.

And before we get to the fully-automated solution, we could have a sensor that detects if the door is open or closed and only sends the reminder if the door hasn't been closed for the night.

Source is on GitHub.

Monday, 31 December 2012

How far away is the sea?

I wanted an app to answer this question, so I wrote one:



You can try it out at http://how-far-away-is-the-sea.appspot.com/. It works well on phones too, so you can use it when you are out and about.

How does it work?

I used the dataset of land polygons from Natural Earth, which as the name suggests covers the whole world. The scale is 1:10 million, so inevitably there is some inaccuracy near the coast, particularly where it's wiggly.

The app uses your current location (or a location you selected by clicking on the map) and computes the closest point in the set of land polygons. This calculation is performed using the JTS Topology Suite, a library for 2D spatial work, and it runs as a Java webapp hosted on Google App Engine.
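To give a feel for what a closest-point computation involves, here is a minimal sketch in plain Java. It is not JTS and not the app's code: it treats longitude and latitude as flat x/y coordinates, handles a single polygon ring, and all names are hypothetical.

```java
/** Sketch: nearest point on a polygon ring to a query point, in planar coordinates. */
final class NearestPointSketch {

    /** Nearest point to (px, py) on the segment from (ax, ay) to (bx, by). */
    static double[] nearestOnSegment(double px, double py,
                                     double ax, double ay,
                                     double bx, double by) {
        double dx = bx - ax;
        double dy = by - ay;
        double lengthSquared = dx * dx + dy * dy;
        if (lengthSquared == 0.0) {
            return new double[] {ax, ay}; // degenerate segment: a single point
        }
        // Project the query point onto the segment's line, then clamp to its ends.
        double t = ((px - ax) * dx + (py - ay) * dy) / lengthSquared;
        t = Math.max(0.0, Math.min(1.0, t));
        return new double[] {ax + t * dx, ay + t * dy};
    }

    /**
     * Nearest point to (px, py) on a ring given as {x, y} vertices, where the
     * last vertex repeats the first. Returns null for fewer than two vertices.
     */
    static double[] nearestOnRing(double px, double py, double[][] ring) {
        double[] best = null;
        double bestDistanceSquared = Double.POSITIVE_INFINITY;
        for (int i = 0; i + 1 < ring.length; i++) {
            double[] candidate = nearestOnSegment(px, py,
                    ring[i][0], ring[i][1], ring[i + 1][0], ring[i + 1][1]);
            double ex = candidate[0] - px;
            double ey = candidate[1] - py;
            double distanceSquared = ex * ex + ey * ey;
            if (distanceSquared < bestDistanceSquared) {
                bestDistanceSquared = distanceSquared;
                best = candidate;
            }
        }
        return best;
    }
}
```

A scan like this visits every segment of every ring, which is one reason a spatial index becomes attractive as the dataset grows.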

Originally I used Geotools to perform the geospatial calculations, but unfortunately it doesn't run on GAE, so I wrote an offline tool to convert the Natural Earth shapefiles to a JTS binary format. JTS works fine on GAE, but it lacks a distance calculator. Luckily spatial4j has the requisite distance functions, and it too works on GAE.
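For readers curious about the distance step, the haversine formula is one standard way to get a great-circle distance between two latitude/longitude points. The sketch below is plain Java and is not spatial4j's API. The Earth radius is an assumed mean value and the names are hypothetical, so results may differ slightly from what the app reports.

```java
/** Sketch: great-circle distance between two lat/lng points using the haversine formula. */
final class HaversineSketch {

    /** Assumed mean Earth radius in metres. */
    static final double EARTH_RADIUS_M = 6_371_008.8;

    /** Distance in metres between (lat1, lon1) and (lat2, lon2), all in degrees. */
    static double distanceMetres(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double halfDeltaPhi = Math.toRadians(lat2 - lat1) / 2.0;
        double halfDeltaLambda = Math.toRadians(lon2 - lon1) / 2.0;

        double a = Math.sin(halfDeltaPhi) * Math.sin(halfDeltaPhi)
                + Math.cos(phi1) * Math.cos(phi2)
                * Math.sin(halfDeltaLambda) * Math.sin(halfDeltaLambda);

        // Clamp to 1.0 to guard against rounding error for near-antipodal points.
        return 2.0 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // Origin and coast coordinates taken from the sample query in this post.
        double metres = distanceMetres(51.856479, -3.13551,
                51.55853913000007, -2.984038865999878);
        System.out.println("Distance to coast (m): " + metres);
    }
}
```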

The webapp exposes a simple query endpoint, so a request for the following URL, for example:

http://how-far-away-is-the-sea.appspot.com/query?lat=51.856479&lng=-3.13551

will return a JSON document with the closest point on the coast, whether the (origin) location is on land or at sea, and the distance in metres to the coast:

{
"latitude":51.856479,
"longitude":-3.13551,
"coastLatitude":51.55853913000007,
"coastLongitude":-2.984038865999878,
"onLand":true,
"distanceToCoast":34734.59501052392
}
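A client only needs to build that URL from a latitude and longitude. A minimal sketch, where the class and method names are hypothetical and only the endpoint and the `lat`/`lng` parameter names come from the example above:

```java
/** Sketch: build a query URL for the endpoint shown above. */
final class SeaQueryUrlSketch {

    /** Endpoint taken from the example request in this post. */
    static final String ENDPOINT = "http://how-far-away-is-the-sea.appspot.com/query";

    /** Returns the query URL for a latitude and longitude given in decimal degrees. */
    static String queryUrl(double lat, double lng) {
        // Plain string concatenation keeps '.' as the decimal separator in any locale.
        return ENDPOINT + "?lat=" + lat + "&lng=" + lng;
    }

    public static void main(String[] args) {
        System.out.println(queryUrl(51.856479, -3.13551));
    }
}
```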

The page that the user sees is a simple static HTML page that uses the Google Maps API (v3) to render the map and the markers, and jQuery to query the Java webapp.

The complete source code is on GitHub at https://github.com/tomwhite/how-far-away-is-the-sea.

Further ideas

Some of the polygons are a poor approximation to the coastline, so it would be nice to get a higher-resolution dataset. There are likely many potential sources, such as this one for the UK.

It would be interesting to use the dataset to answer the question: "which is the furthest point from the sea [in the UK/in X/in the world]?". I'd like to find time to do that sometime. Adding in spatial indexes might be helpful too.

If you liked this app then you might like...

Is it day or night?


Sunday, 16 December 2012

IFTTT

IFTTT, which is pronounced "ift" and stands for "if this then that", is a great service for wiring bits of the internet together. The idea is that you create rules for performing actions, based on triggers.

If this [trigger] occurs then perform that [action].

There are lots of triggers and actions, provided by channels. For example, the Weather Channel provides a trigger which fires at sunset. And the Google Talk Channel provides an action to send a chat message. I combined the trigger and action into a recipe called "Did you put the chickens to bed?" which will remind me (and Eliane) to close the chicken shed in the evening.
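The trigger/action idea is simple enough to model in a few lines of Java. This is purely a conceptual sketch and has nothing to do with how IFTTT is implemented or exposed. Every name in it is hypothetical, and the trigger is a hard-coded stand-in for a real sunset check.

```java
import java.util.function.BooleanSupplier;

/** Sketch: a recipe pairs a trigger ("if this") with an action ("then that"). */
final class RecipeSketch {

    private final String name;
    private final BooleanSupplier trigger;
    private final Runnable action;

    RecipeSketch(String name, BooleanSupplier trigger, Runnable action) {
        this.name = name;
        this.trigger = trigger;
        this.action = action;
    }

    /** Checks the trigger once and runs the action if it fired. Returns whether it ran. */
    boolean runOnce() {
        if (trigger.getAsBoolean()) {
            action.run();
            return true;
        }
        return false;
    }

    String name() {
        return name;
    }

    public static void main(String[] args) {
        // Stand-in trigger: a real one would ask a weather service about sunset.
        BooleanSupplier sunsetHasPassed = () -> true;
        Runnable sendChatMessage = () -> System.out.println("Did you put the chickens to bed?");

        RecipeSketch recipe = new RecipeSketch("chickens", sunsetHasPassed, sendChatMessage);
        recipe.runOnce();
    }
}
```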

I love the simplicity of the whole thing. I quickly added a recipe to send a weekly SMS to remind me to put the rubbish out. And one to send an email to Lottie when there is a full moon. Emilia created a recipe to send her an email when a friend of hers posts something on his blog. I fear the recipe that tells me when it has started raining will be deleted soon due to email overload.

When you start thinking in this way, the more interesting uses invariably involve the physical world in some way. I want to have a recipe that says "if we're running out of coffee beans then order some more", or "if I'm on Skype light up a lamp outside my office so the kids know not to come in" (this one is close with the blink(1) device), or even "it's actually dark now and you still haven't closed the chicken shed door".