Archive for September, 2008

Video interview on Clouds

In June, Dejan Milojicic of HP Labs hosted a fireside chat on Cloud Computing with Russ Daniels (VP and CTO of Cloud Services Strategy at HP) and me. The interview is part of an ongoing series of talks sponsored by the IEEE Computer Society. The three of us engaged in a lively conversation (despite the bright-and-early morning hour!).

Dejan has just notified us that the video interview has been web-published here.


Large-Scale Distributed Systems and Middleware (LADiS)

When Ken Birman and his extended research group take a leading role in organizing a workshop, you can rest assured that it’s going to be a top-notch workshop. In the early 90s, I had the fortune to come across Ken Birman, Robbert van Renesse, Werner Vogels, and their group at Cornell working on virtual synchrony, Isis, U-net, Horus, etc. … I drew upon their work when I was at the OSF RI developing a real-time distributed Mach OS … and I have managed to keep an eye on their work ever since. It was great to come down to LADiS and mix with that research crowd again. Sadly, just as I was going down memory lane with this group, I learned that Jay Lepreau, another leading light to me and a good, passionate mentor, had passed away the night before.

I had the fortune to travel to LADiS with an esteemed colleague of mine, Randy Shoup. We co-authored and co-delivered this presentation on eBay’s scale-out journey. Judging from the questions and comments during and after our presentation, I would say that the presentation was well received. At LADiS, I enjoyed meeting James Hamilton of MSR. James’ talk and ours resonated on a number of topics related to internet-scale datacenters and their “this is life in a big city” nuances … whenever we went down different avenues, we seemingly complemented one another. Sure thing, I will be reading his blog from now on.

From the LADiS technical program, I single out the sessions on data collection/dissemination and resource management as the most relevant to my work. I will dig into many of these papers as soon as the proceedings are out. I’m still somewhat cold to Byzantine Fault Tolerance (BFT). I appreciate the intellectual challenge of arbitrary faults. However, I like to think that application-specific context and defensive coding practices (e.g., skeptics) go a long way towards addressing these faults without BFT replication. For what it’s worth, I cannot see myself producing a compelling TCO case for any of the BFT replication approaches that I have heard about. Specifically, the TCO would need to reflect the added operational complexity. OTOH, I’m not working in an air traffic control environment either…

NOTE: I’ve agreed to work on a paper that summarizes the key themes and points heard at LADiS.
