Approaching Scalability, and Building a Biz on Hadoop, HBase, and Open Source Distributed Computing
The slides for my OSCON 2009 presentation, “Approaching Scalability: Building a Business on Hadoop, HBase, and Open Source Distributed Computing”, have been posted here:
PDF: Here
The video should be up in a day or so; I’m having internet problems at home :) If the video quality is too poor, I’ll upload the audio track instead.
The talk focuses on two things: how to approach scalability from a problem-solving standpoint, and how Visible Technologies used Hadoop, HBase, and Lucene to build a web-scale Business Intelligence platform.
When thinking about scalability, ask yourself a series of questions:
- What makes me special? (What part of our problem are we trying to scale? How can we use it to our advantage?)
- What can I sacrifice? (Compared to a traditional RDBMS, what are you willing to give up? Some sacrifices make scalability much easier.)
- How will my data be structured? (Not everything should be reduced to tables and rows. Do you have documents? Graphs? Videos?)
The slides and talk focus on how Visible answered those questions, and how they led us to our Hadoop stack architecture. In the future, I’ll be talking more about the process of thinking about scalability. Stay tuned!

Hi, I'm Bradford. I write about scalability and the fringes of Computer Science.
August 7th, 2009 at 9:17 am
Still no A/V?
August 7th, 2009 at 10:32 am
Sorry — video didn’t turn out too well :) I’m going to make a voice-over later.
August 8th, 2009 at 12:13 pm
I got the file from SlideShare, and it was strange and fragmented.
Would you please provide a direct link to the PDF file?
August 8th, 2009 at 2:38 pm
Yes — I’ve put the PDF link in the post now :)
August 14th, 2009 at 7:14 am
Looks like a great presentation. I’m sorry I missed it. Any ETA on the voiceover?
August 14th, 2009 at 9:17 am
Thanks! I’m afraid I don’t have 6 hours to put together the voiceover anytime soon. But I *am* working on my Animated Guide To HBase :)
September 9th, 2009 at 11:56 am
[...] (and un-teaching) the habits of the Swiss Army RDBMS can be difficult. As mentioned in earlier articles, databases like Cassandra and HBase are not RDBMSs. You cannot just write SQL and expect everything [...]