ECE Colloquium: Low-latency datacenter networks

Tuesday, February 20, 2018
10:30 am - 11:30 am
Soudeh Ghorbani, researcher in computer networks

Datacenters host a wide range of today's low-latency applications. To meet their strict latency requirements at scale, datacenter networks are designed as topologies that provide a large number of parallel paths between each pair of hosts. The recent trend towards simple datacenter network fabrics strips most network functionality, including load balancing among these paths, out of the network core and pushes it to the edge. This slows reaction to microbursts, the main culprit of packet loss, and consequently of performance degradation, in datacenters. We investigate the opposite direction: could a slightly smarter fabric significantly improve load balancing? I will present DRILL, a datacenter fabric that performs micro load balancing to distribute load as evenly as possible on microsecond timescales. At each switch, DRILL makes per-packet forwarding decisions using local queue occupancies and randomized algorithms. I will explain how we address the resulting key challenges of packet reordering and topological asymmetry, and present results showing that DRILL outperforms recent edge-based load balancers, particularly under heavy load, while imposing only minimal (less than 1%) switch area overhead. Under 80% load, for example, it achieves 1.3-1.4x lower mean flow completion time than recent proposals. Finally, I will discuss our analysis of DRILL's stability and throughput efficiency. I will conclude by discussing some of the challenges and opportunities