I read a couple of highly intriguing research papers on next-generation datacenter switching, Monsoon and SEATTLE. They develop a Cloud-friendly view in which a large-scale datacenter features:
- Flat plug-and-play addressing eliminating any server fragmentation
- High bisection bandwidth
- VM enablement
- Cost efficiencies at large scale
by way of:
- A huge, STP-free L2 domain with up to ~10^5 servers in it
- IP presence limited to connecting the datacenter to the Internet
- Custom control plane and/or custom DHTs
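To make the "custom DHTs" item a bit more concrete, here is a minimal sketch of a one-hop, hash-based directory that maps a host's MAC address to the switch it hangs off, so that lookups need no flooding. It is only an illustration under my own assumptions, not the actual mechanism of either paper: the `SwitchDirectory` class, its method names, and the switch identifiers are all hypothetical.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a point on a 2**32 hash ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class SwitchDirectory:
    """Hypothetical one-hop directory: each MAC is owned by one resolver switch."""

    def __init__(self, switch_ids):
        if not switch_ids:
            raise ValueError("at least one switch is required")
        # Place every switch on the ring, sorted by its hashed position.
        self._ring = sorted((ring_position(s), s) for s in switch_ids)
        self._points = [point for point, _ in self._ring]
        # Per-switch table of the MAC -> attachment switch entries it owns.
        self._store = {s: {} for s in switch_ids}

    def resolver_for(self, mac: str) -> str:
        """Return the first switch clockwise from the MAC's ring position."""
        idx = bisect.bisect_right(self._points, ring_position(mac))
        return self._ring[idx % len(self._ring)][1]

    def publish(self, mac: str, attached_switch: str) -> None:
        """Record where a host lives, at the resolver that owns its MAC."""
        self._store[self.resolver_for(mac)][mac] = attached_switch

    def lookup(self, mac: str):
        """Ask the owning resolver where the host lives (None if unknown)."""
        return self._store[self.resolver_for(mac)].get(mac)
```

Publishing a MAC with `publish("00:11:22:33:44:55", "sw-b")` and then calling `lookup` with the same MAC returns `"sw-b"`. Every switch computes the same resolver from the same hash, which is what lets the directory replace broadcast-based discovery in this simplified picture.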
I buy the spirit of these new-wave requirements, albeit with some caveats.
I will work hard to drive requirements strictly top-down, from my applications and their modus operandi down to the network, before I sign a blank check for an anyone-to-anyone dynamic network environment. As a case in point, let’s assume that I have a design pattern by which my applications are either stateless or have their state fully externalized (in fact, it’s one of the design principles at eBay). From this, I derive that I will not live migrate virtualized application instances and will use simple create/destroy semantics instead. If I don’t have to worry about live migration, my network closet and my associated processes will begin to look a whole lot simpler. [This is quite something to admit for one who set a live migration benchmark back in 2005!]
In a Cloud provider scenario, the top end does look open to any application style and its opposite. Is it really so, and do we need to be all-inclusive? I believe that we can still handily contain the requirements posed to the network by thinking in terms of abstracted tiers (each tier being the unit that is scaled horizontally, and independently, for the customer). Furthermore, as we look up the chain, the various PaaS stipulations provide a host of cues in terms of partitionable, directional, tiered workloads.
Lastly, for these ideas to be operationalized at scale, the new control plane(s) will need to earn quite some trust, just like any other foundational piece. After all these years, we are still very scared of STP flaps and their turning into a SPOF for the datacenter.
I enjoyed reading these papers and am grateful for their out-of-the-box, stimulating thoughts.
Is there a rose without thorns, an Ethernet without STP?