Redress your apps for Cloud

This week, Alex Stamos of iSEC Partners visited and gave a great talk titled “Securely Moving Your Business into the Cloud”.  Much of that material is publicly available here. Alex is a straight shooter and a straight talker.  By the second slide, he’s already warmed up and delivers quite a punch line:  You cannot securely move into the cloud without re-writing your software.

I subscribe to that line. And there’s more to it than security. Earlier on, I reached the same conclusion when thinking about availability and all the *-abilities that an enterprise needs for its business-critical operations.

Every so often, the IT industry falls for the holy grail of horizontally scaling applications, blindly and effortlessly, without touching a line of code. It happened with Grid Computing before Clouds. The early wins in their respective stomping grounds (HPC for Grids, entrepreneurs for Cloud) don’t necessarily scale to become F500 wins. Rather, reality sinks in: one needs to rework the application stack and, worse yet, recruit several PhD types to do that. We can defy neither gravity nor the laws of distributed systems.

In learning this all over again, there’s some forward progress. Those who venture into retooling their stack will most likely achieve superior security and *-abilities in general. In their dollar and sense considerations, they will have to contrast Cloud savings with the budget and timeline to implement and operationalize the new stack. Some others will justifiably punt and wait for a Hail Mary pass* by whatever will come next after Grid and Cloud.

*Not quite Dave Patterson’s Hail Mary pass, even though there’s a striking similarity with what’s happening with multi-core at micron scale and the annex arguments pro/con application re-write.


Cloud: Dark Side, Good Side

I found some good food for thought in Dave Durkee’s article “Why Cloud Computing will never be free” in the May issue of Communications of the ACM (did I ever say how much I enjoy reading CACM lately?).

Competitive price pressures threaten to send Cloud service quality into a downward spiral. Those consumers who nickel and dime the commodity goods served by the Cloud (e.g., computing cores or gigabytes of storage) will get a taste of their own medicine, as Cloud providers nickel and dime them back just the same or more. The supply-side “shop of horrors” that Durkee documents for Cloud is a scary one:

  • Murky pricing models or time commitments
  • Grossly oversubscribed resources
  • Silent traffic engineering downgrades (e.g., from 10Mbit/s down to 1Mbit/s)
  • Recycling of failed disk drives
  • Unattainable service uptimes (say 4 or 5 nines) with lenient penalty clauses (say a 10% discount) whose only purpose is to give the provider bragging rights on that many nines
  • Undefined performance auditing benchmarks
  • SLA miss refunds that stretch out a customer’s time commitment (e.g., annualized refunds only)

These syndromes down-spiral the Cloud and set it further away from an enterprise’s needs.
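To put the "nines" from the list above in perspective, here is a small Python sketch of the downtime budget that each uptime target implies. The helper name and the 365-day year are my own assumptions for illustration; nothing here comes from Durkee's article.

```python
# Hypothetical helper: downtime budget implied by an availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def downtime_budget_minutes(nines: int) -> float:
    """Allowed downtime per year, in minutes, for an uptime of N nines."""
    unavailability = 10.0 ** (-nines)  # e.g. 4 nines -> 0.0001
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    print(f"{n} nines: {downtime_budget_minutes(n):.2f} minutes/year")
```

Four nines leaves less than an hour of downtime per year, and five nines leaves roughly five minutes. A single bad maintenance window blows through either budget, which is why the size of the penalty clause matters more than the number of nines on the brochure.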

Service level management (SLM) is the key to reversing the direction and pulling the Cloud up to enterprise-class service. Enterprises will want to translate their business imperatives into service level objectives (SLOs), use them in the SLA negotiation with the Cloud provider, then monitor hits and misses. But SLM is no small feat. At either end of the service demarc line, there has to be complex logic to manage those SLOs and apply proper compensating actions upon exception. It’s no surprise that the crown jewels of companies like FedEx or Santa Fe Railway are the SLM logic that maximizes shareholder value out of the least amount of commodity resources (Fred Smith didn’t invent new airports or airplanes).
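The "monitor hits and misses" step can be sketched in a few lines of Python. This is only a toy under my own assumptions: the `SLO` class, the metric names, and the thresholds are hypothetical, and a real SLM system would add measurement windows, compensating actions, and reporting on top.

```python
from dataclasses import dataclass


@dataclass
class SLO:
    """A service level objective: a metric compared against a target."""
    name: str
    target: float
    higher_is_better: bool

    def is_met(self, observed: float) -> bool:
        if self.higher_is_better:
            return observed >= self.target
        return observed <= self.target


def evaluate(slos, observations):
    """Return (hits, misses) as lists of SLO names for one reporting period."""
    hits, misses = [], []
    for slo in slos:
        bucket = hits if slo.is_met(observations[slo.name]) else misses
        bucket.append(slo.name)
    return hits, misses


# Hypothetical objectives and one period's measurements.
slos = [
    SLO("availability_pct", 99.9, higher_is_better=True),
    SLO("p95_latency_ms", 250.0, higher_is_better=False),
]
observed = {"availability_pct": 99.95, "p95_latency_ms": 310.0}

hits, misses = evaluate(slos, observed)
print("hits:", hits)
print("misses:", misses)
```

The hard part that the paragraph above alludes to is everything this sketch leaves out: who measures, at which side of the demarc line, and what happens when a miss is recorded.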

These aspects didn’t go unnoticed by the Utility Computing camp (UC being the fad just before Cloud). They standardized a protocol, WS-Agreement, to manage an SLA throughout its lifecycle. May some of that experience be leveraged in the new world.



Sports that scale: Soccer

I’m a huge fan of football of the round kind. Every four years, I take the time to follow the FIFA World Cup and keep tabs on nearly all the 32 teams that start off.

The FIFA tournament funnels large, geographically dispersed audiences onto relatively few events (compared to more spread-out calendars like the Olympics’). We are barely mid-way and I am already seeing the World Cup matches making “dents” in our e-commerce traffic traces, starting with the national-level traces. Their W shape clearly marks the first half of a match (traffic is significantly depressed for 45 minutes), then the interval (traffic way, way up), and the second half (traffic down again for some 45 minutes more). Italy was the country showing the most pronounced dents among the ones that I surveyed (but no more of that, given their early exit). The colleagues in the NOC must be aware of this and tease these symptoms apart from, say, a problem with backbone routers.
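As a rough illustration of how one might tease a match-induced dent apart from an outage, here is a Python sketch that looks for the dip, rebound, dip pattern around a known kickoff time. The function name, the 0.9 threshold, the 15-minute interval, and the synthetic traces are all my assumptions; this is not how our NOC tooling works.

```python
from statistics import mean


def looks_like_match_dent(ratio_by_minute, kickoff, dip=0.9):
    """True if traffic dips in both halves and rebounds during the interval.

    ratio_by_minute: observed/baseline traffic ratios, one per minute.
    kickoff: index of the minute in which the match starts.
    Assumes 45-minute halves and a 15-minute interval, ignoring stoppage time.
    """
    first = mean(ratio_by_minute[kickoff:kickoff + 45])
    interval = mean(ratio_by_minute[kickoff + 45:kickoff + 60])
    second = mean(ratio_by_minute[kickoff + 60:kickoff + 105])
    return first < dip and second < dip and interval > max(first, second)


# Synthetic trace: 30 normal minutes, a match, then 30 normal minutes.
trace = [1.0] * 30 + [0.7] * 45 + [1.2] * 15 + [0.75] * 45 + [1.0] * 30
print(looks_like_match_dent(trace, kickoff=30))

# A flat depression with no mid-match rebound looks more like an outage.
outage = [1.0] * 30 + [0.6] * 105 + [1.0] * 30
print(looks_like_match_dent(outage, kickoff=30))
```

The match schedule is public, so feeding kickoff times into the check is cheap. The rebound during the interval is the telltale sign: a sick router does not take a fifteen-minute break.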

As the tournament progresses, those dents surface from a national level to a supra-national level (e.g., pan-European level). Eventually, they will make an appearance in the world-wide roll-up of all traces once semi-finals or finals take place. That will be the pulse of a planet. This year only, let’s call it the WWWzela effect :)

It’s interesting to debate whether these dents in the traces will be more or less pronounced compared to 4 years back. Several factors tip my expectations up or down:

(+) increasingly, Internet access is a commodity and overall traffic grows at a good clip year over year, worldwide;
(+) online content has grown manifold too, giving folks more reasons to be online before/after a match (e.g., sports commentaries, friends and family chats, etc.);
(+) the application bias is more significant, meaning that (say) social web features and e-commerce features will exhibit different levels of perturbation before/during/after a match;
(-) compliments of wifi, smartphones, etc., more audiences are untethered and can now multi-task effectively during a match;
(-) DVRs, VOD, web video services are making tape-delay more practical than ever, thus eroding synchronization effects around any timed event.

Now, for another scaling dimension…

Some eleven basketball courts or so can be tiled over a soccer pitch. Yet, there is a single referee in a pro soccer match vs. three referees in an NBA match. Isn’t this a blatant scaling anomaly? Yes, it surely sounds like one, though it’s basketball that got it wrong! As Ed Felten aptly puts it, the soccer rules are designed to scale way down and give any amateur team the thrill of playing a match with precisely the same rules that the pros use. Nowhere is this more evident than in Brazil, where I can easily see legions of footballers of all ages and skill sets totally at ease with football’s minimalist prerequisites and ways to officiate a match. There will always be blatant mistakes by referees (oops, I just saw one this morning). In the absence of malice and conspiracy, they will even out, despite the immediate heartburn. That’s pretty good scaling to me.


My other computer is a Utah datacenter

I was recently in Salt Lake City and visited the new datacenter that opened back in May.

Quick facts:

  • ~$300M investment
  • 240,000 square foot building; inside, three rooms, each with 20,000 sq ft of rack-worthy raised floor
  • Fault-tolerant Tier IV data center
  • designed PUE of 1.4
  • 7.2 MW of total server load
  • 400V/230V power distribution (230V to servers)
  • Outside air used for cooling at least half the year (water-side economizer)
  • Total hot-aisle/cold-aisle containment
  • Deploy a rack anywhere/anytime, thanks to ToRs plus optional in-row adaptive cooling
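A quick back-of-the-envelope calculation shows what the PUE figure above means in megawatts. PUE is the ratio of total facility power to IT equipment power. The sketch below assumes that the 7.2 MW "total server load" is the IT load in that ratio and that the facility runs at its designed PUE; both are my assumptions, not published figures.

```python
# Assumption: the 7.2 MW "total server load" is the IT load in the PUE ratio.
it_load_mw = 7.2
designed_pue = 1.4

# PUE = total facility power / IT equipment power
total_facility_mw = it_load_mw * designed_pue
overhead_mw = total_facility_mw - it_load_mw  # cooling, power distribution losses, etc.

print(f"total facility draw: {total_facility_mw:.2f} MW")
print(f"non-IT overhead:     {overhead_mw:.2f} MW")
```

Under those assumptions the building draws about 10 MW at full load, of which a bit under 3 MW never reaches a server.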

Its opening was covered quite well in the press and blogosphere, see for instance 1, 2, 3, 4 for details, pics, color.

Some observations:

  • Internet-scale Maestro James Hamilton is right on when he says that there are significant economies of scale in building and operating an Internet-scale datacenter, albeit with a very high cost of entry;
  • How low can you go in the layers? I developed system, server, network aspects for this datacenter and thought of myself as covering the low layers of infrastructure. Move 800 miles east from the office, and these same aspects now look like the tip of an iceberg. That is, they are the topmost layer in a deep stack of power distribution layers, cooling layers, backups of backups before terminating at the power substation and the high-voltage power lines;
  • It’s hard to manage all dependencies, especially when the overall system of systems is mission-critical. Kudos to the hardware folks who are so much better at this than us software types;
  • Then there’s Cloud Computing… Given the cost of entry and the long-term commitment to an Internet-scale datacenter, it’s no surprise that Clouds are becoming increasingly competitive against traditional options (e.g., lease colo space or roll your own).


StubHub goes Mobile

This week, we unveiled the latest installment in the collection of iPhone/iPad applications by eBay Inc.: the StubHub app. Kudos to my colleagues. They didn’t limit themselves to the traditional purchase flow, like: search event, select seats, buy tickets. Instead, they raised the bar higher in several ways:

  1. I scroll through the list of events near my current location
  2. Alternatively, I can correlate the performers in my iPod list with upcoming local events
  3. I share event news and ticket availability on Facebook or Twitter
  4. I get a map to the nearest FedEx Kinko’s office in case I need to print tickets on my way to the event

Great mash-ups. What a terrific tricorder the iPhone turns into for all the night owls out there. Live long and prosper (and do take the time to enjoy those events).


Identity Abuzz: Notes from IIW10

I spent two days at the Internet Identity Workshop 10. IIW events are set in an open space, unconference style. True to its workshop designation, it’s a place to do work collegially. It’s not a place to give scholarly papers or some polished slide gesticulation.

I list hereafter the topics that I engaged with at IIW10, in a similarly frugal style. They complete my sweep of the Identosphere that I had started here.

OAuth 2.0 – The authors clarified several points in the specification (is the refresh token entirely optional? yes) and kindly requested help to turn the I-D into an RFC that can pass muster with the IETF security directorate (esp. for the security considerations section);

UMA – User Managed Access provides a method for a user to control access to her resources, wherever they might be. For this, UMA defines an authorization manager. The authorization manager reacts to requests by online services acting on a user’s behalf and makes access decisions based on user policy. My colleague and identity extraordinaire Eve Maler is a leading force behind this effort. UMA is set to leverage OAuth 2.0 and various card, token technologies. I saw the demo of a UMA system built by the SMART team at Newcastle University;

Personal Data Stores (PDS) and an internetwork of PDS (PDX for Personal Data eXchange) using XDI-like protocols;

OpenID Connect –  It combines OpenID federated login with OAuth 2.0 access authorization;

PingPong IdP Discovery 1.0 – We all advocate the freedom to register with one or more Identity Providers (IdP) among many available. As such, we need a protocol to assist in the IdP discovery and thus determine which IdP(s) can authenticate a given user;

Mozilla’s account manager – This work exemplifies identity in the browser. Unlike password managers, it includes ways for a site to advertise to the browser multiple styles of identity artifact (e.g. OpenID, InfoCards, or plain old passwords) and current state (signed in or not);

A meta-point: These identity systems are distributed systems and, not surprisingly, pose the same challenges as any other distributed system: get the naming rules right, identify and manage all dependencies, spell out consistency requirements and the companion failure semantics, etc.


Living scale

Today is a white stone day for microbiologists, science, and all of us. Craig Venter and team have successfully created a new species “whose parent is the computer” (in Venter’s words). Their fabricated cells are capable of continuous self-replication and have already replicated several billion times. It is quite a new benchmark for a man-made scale out. This breakthrough ushers us into a new era much like the invention of steam engines and silicon chips did.

Around 2005 or 2006, I met some microbiologists at a Grid Computing meeting. In a chat over dinner, they told us that in five years or so we would be hearing of some folks playing jr. God in a lab. Were they right!

As the Manhattan Project scientists found out in their time, with power comes responsibility. Today’s breakthrough is due to stir up some strong debate around bioethics.

NOTE: This week’s Economist issue has a great op-ed, a briefing article, and a cool cover too.


Identity Abuzz: OAuth

The community that concerns itself with Identity on the Web has had a very hectic month of April. Identity is the bedrock foundation of anything social – think 3rd-party value-add services rooted in the social graph that Twitter, Facebook, LinkedIn, etc. each expose and promote access to. Among various events, I single out Facebook’s F8 event as the catalyst for several announcements and specs that came out this month.

The emerging OAuth protocol is one of the most interesting sights in the Identosphere. OAuth enables 3rd party access to web resources without propagating or sharing passwords. It has been likened to a valet key, in that resource owners can delegate access along with an envelope of authorized actions.
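The valet-key idea can be sketched in Python as follows. This is a conceptual toy, not the OAuth wire protocol: the `ResourceServer` class, its method names, and the scope strings are hypothetical, and the redirect to the service's own sign-in page is reduced to a comment.

```python
import secrets
import time


class ResourceServer:
    """Toy stand-in for a service that issues and checks scoped tokens."""

    def __init__(self):
        self._tokens = {}  # token -> (owner, scopes, expiry timestamp)

    def issue_token(self, owner, scopes, ttl_seconds=3600):
        # In a real flow the owner approves this on the service's own sign-in
        # page, so the 3rd-party client never sees the password.
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (owner, frozenset(scopes), time.time() + ttl_seconds)
        return token

    def authorize(self, token, scope):
        entry = self._tokens.get(token)
        if entry is None:
            return False
        _owner, scopes, expiry = entry
        return scope in scopes and time.time() < expiry

    def revoke(self, token):
        self._tokens.pop(token, None)


server = ResourceServer()
valet_key = server.issue_token("alice", {"photos:read"})

print(server.authorize(valet_key, "photos:read"))    # inside the envelope
print(server.authorize(valet_key, "photos:delete"))  # outside the envelope
server.revoke(valet_key)
print(server.authorize(valet_key, "photos:read"))    # key withdrawn, password untouched
```

The point of the design is in the last three lines: the client holds something that is narrower than the password, expires on its own, and can be withdrawn without the owner changing any credentials.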

I have been interested in OAuth for quite some time because it holds potential:

  • to stop the password sprawl and make it less likely that passwords will be mishandled, either in users’ hands or in the back-end of some poorly managed IT or Clouds (as I observed here in the case of smartphones)
  • to curb phishing vectors by way of branded sign-in pages that the user is redirected to in a seamless user experience
  • to bring devices that are data-entry impaired (like my beloved Roku box) back into the fold of dependable authentication

The OAuth chronology goes like this:

  • Dec ‘07, OAuth 1.0 debuts
  • Vulns documented
  • June ‘09, OAuth 1.0a is introduced addressing vulns
  • Shortly afterwards, OAuth 1.0a implementations become available, chiefly Twitter’s
  • OAuth 1.0a is demonstrated on the iPhone platform, with applications like Flickit
  • May 2009, the OAuth Working Group is chartered in the IETF
  • November 2009, folks from Microsoft, Google and Yahoo introduce the OAuth Web Resource Authorization Protocol (WRAP) and contribute it to the IETF. Chiefly, it standardizes the creation and propagation of tokens over SSL (in lieu of signatures). Also, it codifies a number of use cases and roles. By far, I found this to be the best-written spec in the whole OAuth document series
  • April 2010, OAuth 1.0 (incorporating the 1.0a revision) is published as RFC 5849
  • April 2010, OAuth WRAP implementations are announced
  • April 2010, the first revision of the OAuth 2.0 Internet Draft is released; it builds upon both OAuth 1.0a and OAuth WRAP

I’m eager to see how OAuth will fare vis-à-vis these challenges:

  • What impact: Will the OAuth protocol be universally implemented to the letter of the emerging IETF standard? Or will there be dialects, each producing an island of interoperability around a specific social graph like Twitter’s, Facebook’s, LinkedIn’s, etc.?
  • Set proper expectations: OAuth will not rid us of phishing. There will still be rogue clients and exploits of the client callback URL. However, the risks will provably be contained to losing the token in lieu of the password (the former being lower-grade security material than the latter)
  • Withstand cross-currents: XAuth (also announced in April!) and browser-specific solutions like Mozilla’s Account Manager pitch radically different solution points to the web identity challenge

I look forward to being at the Internet Identity unConference, May 17-19th, in Mtn View.


Cloud pulls crypto agendas

What a great monthly publication CACM is. In the 15 years that I’ve been a member of the ACM, this must be the time that I’m getting the most out of CACM (now in soft-copy as well for extra convenience). In recent issues, CACM has featured interesting crypto papers with a Cloud spin.

In the March issue, I dug into Craig Gentry’s paper on homomorphic encryption. In today’s Clouds, we cannot separate delegation of processing from delegation of cleartext access. Enter homomorphic crypto and, voilà, we no longer need to question a Cloud provider’s aptitude to handle sensitive information. With this crypto, one can tap off-the-shelf public compute resources to do the Navier-Stokes for a new wing or process the interception tracks from some military sightings, yet without ever revealing a thing. In practice, however, I doubt that there are that many Cloud use cases begging for homomorphic crypto … once I take away those that belong in private Clouds anyhow (e.g., for SLA reasons) and those that can be simply dealt with via anonymization (e.g., for medical records), tokenization (e.g., for select PII elements), and simple tests for equality (for which standard crypto suffices). Regardless, this is one of those jaw-dropping results well worthy of a you-must-be-kidding-me reaction. I give Gentry plenty of kudos for making his material highly accessible and engaging. In the pile of security papers that I have read over the years, Alice has never looked so good and crafty!
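To get a feel for what "computing on ciphertexts" means, here is a toy Python example using textbook RSA, which happens to be homomorphic for multiplication only. It is emphatically not Gentry's scheme: his result is fully homomorphic (supporting both addition and multiplication, hence arbitrary computations) and rests on a far more involved construction. The tiny primes and the helper names are mine, chosen for readability, and unpadded RSA like this is insecure in practice.

```python
# Toy textbook RSA with tiny primes; insecure, for illustration only.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)


def encrypt(m):
    return pow(m, e, n)


def decrypt(c):
    return pow(c, d, n)


a, b = 7, 6
# The "cloud" multiplies the two ciphertexts without ever seeing a or b.
c_product = (encrypt(a) * encrypt(b)) % n

# Only the key holder can decrypt, and gets the product of the plaintexts.
print(decrypt(c_product) == (a * b) % n)
```

The provider did useful work on data it could not read. Gentry's breakthrough is to make that possible for any computation, not just for one conveniently chosen operation.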

In the April issue, I’m reading Sergey Yekhanin’s article on crypto protocols that protect the privacy of queries to public databases. It’s not an identity challenge. Rather, it’s about disguising the intention of a query or a set of queries. In the age of real-time analytics, it’s not far-fetched that a database provider or a data aggregator in the Cloud manages to detect and then leverage mounting interest in a particular topic. Counter to that, the discipline of private information retrieval makes it hard or impossible to infer a subject’s intention, at the expense of some communication and/or data overhead.
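The simplest flavor of private information retrieval can be shown in a few lines of Python. The sketch below is the classic two-server XOR scheme, in which the database is replicated on two servers that are assumed not to collude; the function names and the 8-bit database are my own illustration, and the schemes surveyed in the article are far more communication-efficient.

```python
import secrets


def make_queries(n, i):
    """Hide interest in index i inside two random-looking subsets of 0..n-1."""
    s1 = {j for j in range(n) if secrets.randbits(1)}
    s2 = s1 ^ {i}  # symmetric difference: differs from s1 only at index i
    return s1, s2


def server_answer(db_bits, subset):
    """Each server XORs together the bits it was asked for."""
    answer = 0
    for j in subset:
        answer ^= db_bits[j]
    return answer


db = [1, 0, 1, 1, 0, 0, 1, 0]  # same database replicated on both servers
i = 3                          # the index the client secretly cares about

q1, q2 = make_queries(len(db), i)
bit = server_answer(db, q1) ^ server_answer(db, q2)
print(bit == db[i])
```

Each server sees a uniformly random subset of indices, which on its own reveals nothing about `i`. Every bit other than `db[i]` appears in both answers or in neither, so it cancels out in the final XOR. The price is exactly the overhead mentioned above: a query as long as the database, and a database held in two places.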

In both cases, I’m eager to see how these research results will be reduced to practice. The Cloud can dress up as a transformational technology capable of pulling through some powerful ideas.


10 Issues with smartphone apps

Someone best characterized application vs. platform in just a dozen words, as follows: A good application never surprises, a good platform never stops to surprise (I’d love to give proper credit, if someone is kind enough to provide me with the citation).

I continue to be quite impressed with the two smartphone platforms that I dug into, iPhone and Android. They never stop surprising me on the positive side with their nuggets of enabling technology.

I do have quite a few issues with their applications and the way they are written. Alas, they surprise me when and where they really shouldn’t. Here’s a list of 10 top-of-mind issues in no particular order:

  1. Unexpected entitlements. Some applications are more equal than others. For instance, try signing out from your primary Gmail account on Android. It won’t work unless the whole device is wiped clean;
  2. Power efficiency. Some applications turn the radio on very often and can even be quite chatty whenever they do so. In the absence of a “green rating” for applications, it’s a trial and error process of loading some applications and then discovering that battery autonomy has suddenly tanked, compliments of a “fat” application in that mix;
  3. Applications work unless they don’t. It’s hard to know why an application suddenly gets into the habit of aborting launch. It silently goes back to being a cute square icon, ready to fail again just the same;
  4. Stale coding practices. The application development environments don’t leverage any of the new ideas in software engineering, like Ruby on Rails with its built-in unit/functional testing;
  5. Bloomingdale’s and the bazaar. Paraphrasing E. Raymond, there seem to be just two styles of application store emerging: the exclusive velvety one (iTunes, Ovi) and the open messy one (Android). It would be nice to see some hybrid concepts emerging. It will be a pity if the smartphone software channels are already fully ossified this early in the game;
  6. Password sprawl. Without a widespread identity infrastructure, I’m forced to set passwords in as many different applications and have their renewal/challenges hanging on me. Intriguingly, the latter too change in frequency and style with the application, thus making it a really fragmented experience and a race towards lower grade security policies (i.e., simple passwords with the longest expiration intervals possible);
  7. Back-end password handling. Without a widespread identity infrastructure, chances are that for a given application the database of subjects’ secrets and the subjects’ application data get co-located in the same Cloud and the same logical slice therein. This is what my colleague Gunnar Peterson colorfully describes as loading dynamite and detonator onto the same truck;
  8. Porous sandboxes. The sandbox that an application operates in has several back-alley read/write access pathways to free-for-all data (e.g., the keyboard cache and address book on the iPhone, as described here), thus creating opportunities for Trojans and covert channels;
  9. Panta rhei. After I stumble upon a really clever application and make it part of my daily life, it’s quite likely that another vendor will pick up on the same good idea and apply some healthy one-upmanship to improve it. Thus, I regularly face the dilemma of whether to stick to the data accrued thus far or start fresh on a brand new application, without any migration capability in sight;
  10. Cloakers and phishers. Some applications mean big business and naturally attract ill-intentioned copycats. There are just so many pixels to copy. Current defenses are mainly non-technical – e.g., the presence in the iTunes store hinges on relationships between vendor, Apple, and the user community. They are not as effective in the bazaar style of application store.

I don’t believe in the rise of mobile multi-platform application frameworks (other than WebKit, that is), nor do I believe in unicorns.

However, I’m firmly convinced that smartphones will pull through advances in software – be it on gadget, on cloud, or identity infrastructure  – much as they have already done for the 3G telco infrastructure.
