How To Make Life Suck Less (While Making Scalable Systems)
As I write this, I’m sitting at my home office with a five-year-old desktop and its failing monitor. Surrounding me are protein bar wrappers, fingerprint-covered water glasses, and guitar amp vacuum tubes. It’s 7 in the morning.
If you don’t know me, you might think I’m an early riser. Alas… I’ve actually been up all night, unable to sleep, babysitting my Hadoop cluster and worrying about getting this new platform into production.
The past three months have been hellish — some amazing accomplishments, but plenty of painful mistakes. “You guys are working hard but it seems so effortless,” I heard from another team lead. If he only knew. Developing a distributed, concurrent platform for web-scale data is far from effortless. A great deal of the time we spent ‘working hard’ was not new coding at all, but relentlessly beating on problems. I don’t always take my own advice, but I’m getting better. And so, I’ll try to keep the following in mind when planning what to do next:
Developing a platform for web-scale data analysis is hard work. Unlike well-worn areas such as MVC on LAMP, big data (billions of records, intensive analysis) presents brand new challenges that need innovative engineering, testing strategies, and planning.
In recent months I’ve led my team at Visible Technologies through creating a new analytics platform based on Hadoop. This article lists a few lessons from our ordeal, which I hope will help if you venture into similar territory. Here are my observations:
1. Big data is BIG
2. This is systems software, not an application (for now)
3. Learn the source, engage the community, contribute feedback
4. Scalable doesn’t imply cheap or easy. Just cheaper and easier.
5. It’s so much easier with smart, experienced people.
Let’s examine these in detail…
Hi, I'm Bradford. I write about scalability and the fringes of Computer Science.