Hasso-Plattner-Institut
Prof. Dr. Tilmann Rabl
 

Marco Canini

Affiliation: KAUST
Title: Programmable Networks for Distributed Deep Learning: Advances and Perspectives

 

Abstract

Training large deep learning models is challenging due to high communication overheads that distributed training entails. Embracing the recent technological development of programmable network devices, this talk describes our efforts to rein in distributed deep learning's communication bottlenecks and offers an agenda for future work in this area. We demonstrate that an in-network aggregation primitive can accelerate distributed DL workloads, and can be implemented using modern programmable switch hardware. We discuss various designs for streaming aggregation and in-network data processing that lower memory requirements and exploit sparsity to maximize effective bandwidth use. We also touch on gradient compression methods, which contribute to lower communication volume and adapt to dynamic network conditions. Lastly, we consider how the rise of programmable NICs may have a role in this space.

Short CV

Marco does not know what the next big thing will be. He asked ChatGPT, though the answer was underwhelming. But he's sure that our future next-gen computing and networking infrastructure must be a viable platform for it. Marco's research spans a number of areas in computer systems, including distributed systems, large-scale/cloud computing and computer networking with emphasis on programmable networks. His current focus is on designing better systems support for AI/ML and providing practical implementations deployable in the real world. Marco is an Associate Professor of Computer Science at KAUST. Marco obtained his Ph.D. in computer science and engineering from the University of Genoa in 2009 after spending the last year as a visiting student at the University of Cambridge. He was a postdoctoral researcher at EPFL and a senior research scientist at Deutsche Telekom Innovation Labs & TU Berlin. Before joining KAUST, he was an assistant professor at UCLouvain. He also held positions at Intel, Microsoft and Google.