Approaching Scalability, and Building a Biz on Hadoop, HBase, and Open Source Distributed Computing

The slides for my OSCON 2009 presentation, “Approaching Scalability: Building a Business on Hadoop, HBase, and Open Source Distributed Computing”, have been posted here:

http://www.slideshare.net/lusciouspear/building-a-business-on-hadoop-hbase-and-open-source-distributed-computing

PDF: Here

Video should be up in a day or so; I’m having internet problems at home :) If the video quality is too poor, I’ll upload the audio track instead.

The talk focuses on two things: how to approach scalability from a problem-solving standpoint, and how Visible Technologies used Hadoop, HBase, and Lucene to build a web-scale Business Intelligence platform.

When thinking about scalability, ask yourself a series of questions:

  1. What makes me special? (What part of our problem are we trying to scale? How can we use it to our advantage?)
  2. What can I sacrifice? (Compared to a traditional RDBMS, what are you willing to give up? Some sacrifices make scalability much easier.)
  3. How will my data be structured? (Not everything should be reduced to tables and rows. Do you have documents? Graphs? Videos?)

The slides and talk focus on how Visible answered those questions, and how they led us to our Hadoop stack architecture. In the future, I’ll be talking more about the process of thinking about scalability. Stay tuned!
