Internet Identity Workshop #11

I sampled the program of the 11th Internet Identity Workshop (an unconference) held at the Computer History Museum in Mountain View (the second this year; see my notes from IIW10, also in Mountain View).

OAuth 2.0:

  • The spec still needs work on the Security Considerations section before it can be finally approved. Contributors are sought
  • Some early adopters have voiced concerns about endpoints supporting both the 1.0 and 2.0 profiles at once
  • Mike Jones has taken over part of the spec process (bearer token), which will be packaged as a separate profile (and RFC)
  • JSON Web Token (JWT) defines a specific token format. The claims in a JWT are encoded as a JSON object and digitally signed (see the sketch after this list)
  • Are the lessons learned from SAML usage being properly leveraged?
  • What would it take for OAuth to be adopted in the enterprise (Kerberos being the obvious benchmark)? What’s missing in OAuth to pass enterprise or DoD vetting (e.g., what’s the minimum entropy for the verification code)?
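As an illustration of the JWT bullet above, here is a minimal, standard-library-only Python sketch of how a signed JWT is assembled (HS256, i.e., HMAC-SHA256). The claim names and the shared key are made up for the example.

```python
import base64, hashlib, hmac, json

def b64url(data: bytes) -> str:
    # Base64url encoding without padding, as used in JWTs
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(claims: dict, key: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = b64url(json.dumps(header).encode()) + "." + \
                    b64url(json.dumps(claims).encode())
    signature = hmac.new(key, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

# Hypothetical claims and key, for illustration only
token = make_jwt({"iss": "https://issuer.example.com", "sub": "user123"},
                 key=b"shared-secret")
print(token)  # header.payload.signature, each part base64url-encoded
```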

OpenID:

  • Several reports on user experience testing
  • PayPal described its experience as an OpenID provider. It contributes high-quality identity data points such as a verified/certified shipping address. A client can override the shipping address, but doing so affects the risk rating
  • There’s now an OpenID retail advisory committee (RAC)
  • OpenID Connect (OpenID redone atop OAuth 2.0) is a work in progress that extends OpenID by carrying profile data and the like (e.g., Portable Contacts and Activity Streams) along across sites

Microsoft’s U-Prove certificates:

  • The intellectual property stems from the credentica.com acquisition two years back
  • The protocol specification was published in March 2010 (at the RSA Conference), and there is an open-source SDK
  • It’s a new kind of certificate that permits thinning out the claims it carries while preserving the ability to verify them cryptographically
  • Value props include: minimal disclosure, derived claims (e.g., from a DOB to a 21-or-older claim; see the sketch after this list), unlinkable claims (like coins, unlike bills), negation claims (“I’m not on that list”)
  • Proponents anticipate an ecosystem that works for government agencies (e.g., the DMV), enterprises, consumers, and devices
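To make the derived-claims value prop concrete, here is a tiny Python sketch of the data-minimization idea: an issuer turns a date of birth into a bare 21-or-older claim so the relying party never sees the DOB. Real U-Prove backs such claims with cryptographic proofs rather than plain assertions; the function and claim names below are illustrative only.

```python
from datetime import date
from typing import Optional

def derive_over_21_claim(dob: date, today: Optional[date] = None) -> dict:
    """Derive a minimal-disclosure claim from a date of birth.

    Only the boolean result is released; the DOB itself stays with the issuer.
    (In U-Prove this would be backed by a cryptographic proof, not trust alone.)
    """
    today = today or date.today()
    years = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return {"over_21": years >= 21}

print(derive_over_21_claim(date(1990, 7, 4), today=date(2012, 1, 1)))  # {'over_21': True}
```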

Personal data stores (PDS):

  • It’s the utopian place where I could manage all web data concerning yours truly, whether it’s stored by value or by reference
  • Example: my search results going back 1 or 3 years
  • Value props include: empower consumers to manage the data value chain (or purposely delegate it); centralize and enforce a permission regimen (e.g., mint a nonce to access my PDS; see the sketch after this list); find like-minded consumers; data portability and exchange across multiple PDSes; higher-quality and quicker scoring
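To illustrate the permission-regimen bullet above, here is a minimal Python sketch of minting and checking a short-lived nonce that gates access to a PDS. The class and method names are hypothetical, not part of any PDS specification.

```python
import secrets, time

class PDSGatekeeper:
    """Hypothetical gatekeeper that mints short-lived nonces for PDS access."""

    def __init__(self, ttl_seconds: int = 300):
        self.ttl = ttl_seconds
        self.grants = {}  # nonce -> (requester, scope, expiry)

    def mint_nonce(self, requester: str, scope: str) -> str:
        nonce = secrets.token_urlsafe(32)
        self.grants[nonce] = (requester, scope, time.time() + self.ttl)
        return nonce

    def check(self, nonce: str, scope: str) -> bool:
        grant = self.grants.get(nonce)
        return bool(grant) and grant[1] == scope and time.time() < grant[2]

gate = PDSGatekeeper()
n = gate.mint_nonce("shopping-service.example.com", scope="read:search-history")
print(gate.check(n, "read:search-history"))  # True until the nonce expires
```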

Email is not dead just yet:

  • Idea: use it as the pervasive, common denominator transport (SMTP) and repository (folders) for seamless federation of social networks
  • Key concepts demonstrated in the Mr. Privacy research effort by the MobiSocial team at Stanford
  • Webfinger resolves an email address into a set of machine-friendly service endpoints (see the sketch after this list)
  • Inbound email can trigger an extensible set of action handlers (as calendaring tools or Xobni already do)
  • Potential use of OAuth for folder-level access
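As a companion to the Webfinger bullet, here is a minimal Python sketch of resolving an address into service endpoints. It assumes the JSON flow later standardized in RFC 7033 (at the time of the workshop the lookup was XRD/host-meta based) and uses the third-party requests library; the address and relation type are placeholders.

```python
import requests

def webfinger(address: str) -> list[dict]:
    """Resolve user@host into a list of link descriptors via Webfinger."""
    host = address.split("@", 1)[1]
    resp = requests.get(
        f"https://{host}/.well-known/webfinger",
        params={"resource": f"acct:{address}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("links", [])

# Example: find an OpenID Connect issuer advertised for this account
for link in webfinger("alice@example.com"):
    if link.get("rel") == "http://openid.net/specs/connect/1.0/issuer":
        print("IdP issuer:", link.get("href"))
```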

Comments (5)

Redress apps for Cloud – Netflix

Adrian Cockcroft of Netflix (and a former eBay colleague) recently described his journey to run Netflix services off of a public cloud, effectively and efficiently.

Along with Alex Stamos’ security talk that I profiled in the previous post of the same title, Adrian’s talk is easily the best public account of Cloud enterprise “pathfinding” that I have come across in a long, long while. From different angles, both talks reach the conclusion that it’s better to re-architect the whole thing than to tinker with it. Both talks carry no hype, frills, or inflated expectations.

Adrian goes on to list the “undifferentiated lifting” that Netflix is still left to do and that should instead come off the shelf from the Cloud portfolio of services:

  • middle-tier load balancing
  • caching
  • encryption services (I’d imagine he means key management services in general)
  • distributed application management (a tough nut to crack, this one!)

which we will hopefully see soon in Clouds near us. Thank you for sharing, Adrian!!

Comments (9)

eBay’s Technical Voice

eBay has recently launched a tech blog to give voice to the many technical leaders that are hard at work to advance the world’s largest marketplace. Hugh Williams kicked it off with the first post on Site Speed for eBay Search Results.

While I’m at it, I single out four presentations that my colleagues recently gave at JavaOne 2010. They touch on some recent (or recent-1) interests of mine.

Login Failed, Try Again: 10 Best Practices for Authentication in the Cloud, Farhang Kassaei. Farhang does a really good job of delineating the functional roles of the Secure Token Service (STS), Identity Providers (IdP), Relying Party (RP), Guards, policy elements, etc. that enable eBay’s secure scale-out operations, Cloud included. I’m the number one fan of this architecture and actively championed it to make it a pillar of the eBay Mobile architecture.

More Best Practices for Large-Scale Websites: Lessons from eBay, Randy Shoup. A small set of principles underpins some massive scale-out and extensibility stories. I’ve had the pleasure to co-keynote with Randy at LADIS08. That presentation had the first installment of Randy’s renowned best practices.

Concurrency Grab Bag: More Gotchas, Patterns, and Tips on Practical Concurrency, Sangjin Lee. As he did at JavaOne 2009, Sangjin continues to contribute nuances and new results to the Java Concurrency body of work (alongside Brian Goetz et al.’s).

7 Deadly Sins of Enterprise Java Programming and Deployment in the Multicore Era, Mahesh Somani, co-presented with Intel. This presentation marries valuable lessons in concurrency with some handy tutorial material on Intel’s published roadmap (e.g., a refresher on Tick-Tock, Nehalem vs. Sandy Bridge, 45 vs. 32 nm, etc.). I’m still looking for a public URL for this presentation.

Comments (11)

Redress your apps for Cloud

This week, Alex Stamos of iSEC Partners visited and gave a great talk titled “Securely Moving Your Business into the Cloud”.  Much of that material is publicly available here. Alex is a straight shooter and a straight talker.  By the second slide, he’s already warmed up and delivers quite a punch line:  You cannot securely move into the cloud without re-writing your software.

I subscribe to that line. And there’s more to it than security. Earlier on, I had reached the same conclusion when thinking about availability and all the *-abilities that an enterprise needs for its business-critical operations.

Every so often, the IT industry falls for the holy grail of horizontally scaling applications, blindly and effortlessly, without touching a line of code. It happened with Grid Computing before Clouds. The early wins in their respective stomping grounds (HPC for Grids, entrepreneurs for Cloud) don’t necessarily scale to become F500 wins. Rather, reality sinks in: one needs to rework the application stack and, worse yet, recruit several PhD types to do it. We cannot defy gravity, nor the laws of distributed systems.

In learning this all over again, there’s some forward progress. Those who venture into retooling their stack will most likely achieve superior security and *-abilities in general. In their dollar and sense considerations, they will have to contrast Cloud savings with the budget and timeline to implement and operationalize the new stack. Some others will justifiably punt and wait for a Hail Mary pass* by whatever will come next after Grid and Cloud.

*Not quite Dave Patterson’s Hail Mary pass, even though there’s a striking similarity with what’s happening with multi-core at micron scale and the annex arguments pro/con application re-write.

Comments (8)

Cloud: Dark Side, Good Side

I found some good food for thought in Dave Durkee’s article “Why Cloud Computing will never be free” in the May issue of Communications of the ACM (did I ever say how much I enjoy reading CACM lately?).

Competitive price pressures threaten to send Cloud service quality into a downward spiral. Those consumers who nickel-and-dime the commodity goods served by the Cloud (e.g., computing cores or gigabytes of storage) will get a taste of their own medicine, as Cloud providers nickel-and-dime them back just the same, or more. The supply-side “shop of horrors” that Durkee documents for the Cloud is a scary one:

  • Murky pricing models or time commitments
  • Grossly oversubscribed resources
  • Silent traffic engineering downgrades (e.g., from 10Mbit/s down to 1Mbit/s)
  • Recycling of failed disk drives
  • Unattainable service uptimes (say, 4 or 5 nines) with lenient penalty clauses (say, a 10% discount) whose only purpose is to give the provider bragging rights on those many nines
  • Undefined performance auditing benchmarks
  • SLA miss refunds that stretch out a customer’s time commitment (e.g., annualized refunds only)

These syndromes drag the Cloud into a downward spiral and set it further apart from an enterprise’s needs.

Service level management (SLM) is the key to reversing the direction and pulling the Cloud up to enterprise-class service. Enterprises will want to translate their business imperatives into service level objectives (SLOs), use them in the SLA negotiation with the Cloud provider, then monitor hits and misses. But SLM is no small feat. At either end of the service demarc line, there has to be complex logic to manage those SLOs and apply the proper compensating actions upon exception. It’s no surprise that the crown jewels of companies like FedEx or the Santa Fe Railway are the SLM logic that maximizes shareholder value out of the least amount of commodity resources (Fred Smith didn’t invent new airports or airplanes).
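To ground the SLO discussion, here is a minimal Python sketch of the kind of hit-or-miss check an enterprise might run against a provider’s uptime SLA, using the 4-nines target and 10% discount mentioned above. The numbers and the function are illustrative only, not any provider’s actual terms.

```python
def evaluate_uptime_slo(downtime_minutes: float,
                        period_minutes: float = 30 * 24 * 60,   # one month
                        target: float = 0.9999,                  # "4 nines"
                        penalty_rate: float = 0.10) -> dict:
    """Check measured uptime against an SLO and compute the refund owed."""
    uptime = 1.0 - downtime_minutes / period_minutes
    budget = (1.0 - target) * period_minutes  # allowed downtime, ~4.3 min/month
    met = uptime >= target
    return {"uptime": round(uptime, 6),
            "downtime_budget_minutes": round(budget, 1),
            "slo_met": met,
            "refund_fraction": 0.0 if met else penalty_rate}

# 30 minutes of downtime in a month blows through a 4-nines budget
print(evaluate_uptime_slo(downtime_minutes=30))
```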

These aspects didn’t go unnoticed by the Utility Computing camp (UC being the fad just before Cloud). They standardized a protocol, WS-Agreement, to manage an SLA throughout its lifecycle. May some of that experience be leveraged in the new world.

Sent from my iPad

Comments (4)

Sports that scale: Soccer

I’m a huge fan of football of the round kind. Every four years, I take the time to follow the FIFA World Cup and keep tabs on nearly all of the 32 teams that start off.

The FIFA tournament funnels large, geographically dispersed audiences onto relatively few events (compared with more spread-out calendars like the Olympics’). We are barely midway through and I am already seeing the World Cup matches making “dents” in our e-commerce traffic traces, starting with the national-level traces. Their W shape clearly marks the first half of a match (traffic is significantly depressed for 45 minutes), then the interval (traffic way, way up), and the second half (traffic down again for some 45 minutes more). Italy was the country showing the most pronounced dents among the ones that I surveyed (but no more of that, given their early exit). The colleagues in the NOC must be aware of this happening and tease these symptoms apart from, say, a problem with backbone routers.
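For the curious, here is a minimal Python sketch of how one might flag such match-time dents in a per-minute traffic trace: compare each point against a same-time-of-day baseline and report sustained dips. The thresholds and the input format are assumptions for illustration, not how our NOC tooling actually works.

```python
def find_dents(traffic, baseline, dip_ratio=0.8, min_minutes=30):
    """Return (start, end) index pairs where traffic stays below
    dip_ratio * baseline for at least min_minutes consecutive minutes."""
    dents, start = [], None
    for i, (t, b) in enumerate(zip(traffic, baseline)):
        low = t < dip_ratio * b
        if low and start is None:
            start = i
        elif not low and start is not None:
            if i - start >= min_minutes:
                dents.append((start, i))
            start = None
    if start is not None and len(traffic) - start >= min_minutes:
        dents.append((start, len(traffic)))
    return dents

# Toy example: a 45-minute dip to 70% of baseline shows up as one dent
baseline = [100.0] * 180
traffic = [100.0] * 60 + [70.0] * 45 + [100.0] * 75
print(find_dents(traffic, baseline))  # [(60, 105)]
```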

As the tournament progresses, those dents rise from the national level to a supra-national level (e.g., pan-European). Eventually, they will make an appearance in the worldwide roll-up of all traces once the semi-finals or the final take place. That will be the pulse of a planet. This year only, let’s call it the WWWzela effect :)

It’s interesting to debate whether these dents in the traces will be more or less pronounced compared to 4 years back. Several factors tip my expectations up or down:

(+) increasingly, Internet access is a commodity and overall traffic grows at a good clip year over year, worldwide;
(+) online content has grown manifold too, giving folks more reasons to be online before/after a match (e.g., sports commentaries, friends and family chats, etc.);
(+) the application bias is more significant, meaning that (say) social web features and e-commerce features will exhibit different levels of perturbation before/during/after a match;
(-) compliments of wifi, smartphones, etc., more audiences are untethered and can now multi-task effectively during a match;
(-) DVRs, VOD, and web video services are making tape delay more practical than ever, thus eroding synchronization effects around any timed event.

Now, for another scaling dimension…

Some eleven basketball courts or so can be tiled over a soccer pitch. Yet there is a single referee in a pro soccer match vs. three referees in an NBA match. Isn’t this a blatant scaling anomaly? Yes, it surely sounds like one, though it’s basketball that got it wrong! As Ed Felten aptly puts it, the soccer rules are designed to scale way down and give any amateur team the thrill of playing a match with precisely the same rules that the pros use. Nowhere is this more evident than in Brazil, where I can easily see legions of footballers of all ages and skill sets totally at ease with football’s minimalist prerequisites and ways to officiate a match. There will always be blatant mistakes by referees (oops, I saw one just this morning). In the absence of malice and conspiracy, they will even out, despite the immediate heartburn. That’s pretty good scaling to me.

Comments (10)

My other computer is a Utah datacenter

I was recently in Salt Lake City and visited the new datacenter that opened back in May.

Quick facts:

  • ~$300M investment
  • 240,000-square-foot building; inside, three rooms with 20,000 sq ft each of rack-worthy raised floor
  • fault-tolerant Tier IV data center
  • designed PUE of 1.4 (see the quick arithmetic after this list)
  • 7.2 MW of total server load
  • 400V/230V power distribution (230V to servers)
  • Outside air used for cooling at least half the year (water-side economizer)
  • Total hot-aisle/cold-aisle containment
  • Deploy a rack anywhere/anytime, thanks to ToRs plus optional in-row adaptive cooling
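For a sense of what the PUE and load figures above imply together, here is a quick back-of-the-envelope Python snippet; it simply applies the PUE definition (total facility power / IT load) to the numbers quoted, so treat the result as a rough estimate, not a published figure.

```python
# PUE = total facility power / IT equipment power
pue = 1.4           # designed PUE, from the list above
it_load_mw = 7.2    # total server load in MW, from the list above

facility_power_mw = pue * it_load_mw
overhead_mw = facility_power_mw - it_load_mw  # cooling, power distribution, etc.

print(f"Facility power: ~{facility_power_mw:.1f} MW "
      f"(~{overhead_mw:.1f} MW of overhead on top of the IT load)")
# Facility power: ~10.1 MW (~2.9 MW of overhead on top of the IT load)
```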

Its opening was covered quite well in the press and blogosphere, see for instance 1, 2, 3, 4 for details, pics, color.

Some observations:

  • Internet-scale Maestro James Hamilton is right on when he says that there are significant economies of scale in building and operating an Internet-scale datacenter, albeit with a very high cost of entry;
  • How low can you go in the layers? I developed the system, server, and network aspects for this datacenter and thought of myself as covering the low layers of the infrastructure. Move 800 miles east of the office, and those same aspects now look like the tip of an iceberg. That is, they are the topmost layer in a deep stack of power-distribution layers, cooling layers, and backups of backups, before terminating at the power substation and the high-voltage power lines;
  • It’s hard to manage all dependencies, especially when the overall system of systems is mission-critical. Kudos to the hardware folks who are so much better at this than us software types;
  • Then there’s Cloud Computing… Given the cost of entry and the long-term commitment to an Internet-scale datacenter, it’s no surprise that Clouds are becoming increasingly competitive against traditional options (e.g., lease colo space or roll your own).

Comments off

StubHub goes Mobile

This week, we unveiled the latest installment in the collection of iPhone/iPad applications by eBay Inc.: the StubHub app. Kudos to my colleagues. They didn’t limit themselves to the traditional purchase flow (search for an event, select seats, buy tickets). Instead, they raised the bar in several ways:

  1. I scroll through the list of events near my current location
  2. Alternatively, I can correlate the performers in my iPod list with upcoming local events
  3. I share event news and ticket availability on Facebook or Twitter
  4. I get a map to the nearest FedEx Kinko’s offices in case I need to print tickets on my way to the event

Great mash-ups. What a terrific tricorder the iPhone turns into for all the night owls out there. Live long and prosper (and do take the time to enjoy those events).

Comments (7)

Identity Abuzz: Notes from IIW10

I spent two days at the Internet Identity Workshop 10. IIW events are set in an open space, unconference style. True to its workshop designation, it’s a place to do work collegially. It’s not a place to give scholarly papers or some polished slide gesticulation.

I list hereafter the topics that I engaged with at IIW10, in a similarly frugal style. They complete my sweep of the Identosphere that I started here.

OAuth 2.0 – The authors clarified several points in the specification (is the refresh token entirely optional? yes) and kindly requested help to turn the I-D into an RFC that can pass muster with the IETF security directorate (esp. the security considerations section);
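Since the refresh-token question came up, here is a minimal Python sketch of the OAuth 2.0 refresh flow: if the server issued a refresh token, the client can trade it for a new access token at the token endpoint; otherwise it must re-run the original grant. The endpoint URL and client credentials are placeholders, and the third-party requests library is assumed.

```python
import requests

TOKEN_ENDPOINT = "https://auth.example.com/oauth2/token"  # placeholder

def refresh_access_token(refresh_token: str, client_id: str, client_secret: str) -> dict:
    """Exchange a refresh token for a fresh access token (OAuth 2.0 refresh grant)."""
    resp = requests.post(
        TOKEN_ENDPOINT,
        data={
            "grant_type": "refresh_token",
            "refresh_token": refresh_token,
        },
        auth=(client_id, client_secret),  # client authentication
        timeout=10,
    )
    resp.raise_for_status()
    token = resp.json()
    # Servers may rotate the refresh token; keep the old one if none is returned
    token.setdefault("refresh_token", refresh_token)
    return token

# new_token = refresh_access_token("stored-refresh-token", "my-client", "my-secret")
```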

UMA – User Managed Access provides a method for a user to control access to her resources, wherever they might be. For this, UMA defines an authorization manager. The authorization manager reacts to requests by online services acting on a user’s behalf and makes access decisions based on user policy. My colleague and identity extraordinaire Eve Maler is a leading force behind this effort. UMA is set to leverage OAuth 2.0 and various card and token technologies. I saw a demo of a UMA system built by the SMART team at Newcastle University;
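To give a flavor of what the authorization manager does, here is a conceptual Python sketch of a policy check: a requesting service asks for a scope on one of the user’s resources, and the manager answers from user-set policy. This is only the decision logic in miniature, not the UMA wire protocol, and every name in it is made up.

```python
# Hypothetical user policy: resource -> scope -> set of allowed requesters
USER_POLICY = {
    "photos": {"read": {"printshop.example.com"}},
    "calendar": {"read": {"scheduler.example.net"}, "write": set()},
}

def authorize(requester: str, resource: str, scope: str) -> bool:
    """Authorization-manager decision: allow only what the user's policy permits."""
    allowed = USER_POLICY.get(resource, {}).get(scope, set())
    return requester in allowed

print(authorize("printshop.example.com", "photos", "read"))    # True
print(authorize("printshop.example.com", "calendar", "write"))  # False
```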

Personal Data Stores (PDS) and an internetwork of PDS (PDX for Personal Data eXchange) using XDI-like protocols;

OpenID Connect –  It combines OpenID federated login with OAuth 2.0 access authorization;

PingPong IdP Discovery 1.0 – We all advocate the freedom to register with one or more Identity Providers (IdPs) among the many available. As such, we need a protocol to assist in IdP discovery and thus determine which IdP(s) can authenticate a given user;
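As a toy illustration of the discovery problem, here is a Python sketch that maps a user identifier to candidate IdPs from a local registry keyed by the identifier’s domain. This is a conceptual stand-in, not the PingPong protocol itself, and the registry contents are invented.

```python
# Invented registry of IdPs and the identifier domains they claim to serve
IDP_REGISTRY = {
    "https://idp.example.com": {"example.com", "example.org"},
    "https://login.other-idp.net": {"other-idp.net"},
}

def discover_idps(user_identifier: str) -> list[str]:
    """Return the IdPs that claim to be able to authenticate this user."""
    domain = user_identifier.rsplit("@", 1)[-1].lower()
    return [idp for idp, domains in IDP_REGISTRY.items() if domain in domains]

print(discover_idps("bob@example.org"))  # ['https://idp.example.com']
```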

Mozilla’s account manager – This work exemplifies identity in the browser. Unlike password managers, it includes ways for a site to advertise to the browser multiple styles of identity artifact (e.g., OpenID, Information Cards, or plain old passwords) and the current state (signed in or not);

A meta-point: These identity systems are distributed systems and, not surprisingly, pose the same challenges as any other distributed system: get the naming rules right, identify and manage all dependencies, spell out consistency requirements and the companion failure semantics, etc.

Comments (7)

Living scale

Today is a white-stone day for microbiologists, for science, and for all of us. Craig Venter and team have successfully created a new species “whose parent is the computer” (in Venter’s words). Their fabricated cells are capable of continuous self-replication and have already replicated several billion times. It is quite a new benchmark for man-made scale-out. This breakthrough ushers us into a new era, much like the inventions of the steam engine and the silicon chip did.

Around 2005 or 2006, I met some microbiologists at a Grid Computing meeting. In a chat over dinner, they told us that in five years or so we would be hearing of some folks playing jr. God in a lab. Were they right!

As the Manhattan Project scientists found out in their time, with power come responsibilities. Today’s breakthrough is bound to stir up some strong debate around bioethics.

NOTE: This week’s Economist issue has a great op-ed, a briefing article, and a cool cover too.

Comments (4)