Speaker: Professor Ming Yan (Michigan State University)
Venue: Room 200-9, Run Run Shaw Business Administration Building, Yuquan Campus
Large-scale machine learning models are trained with parallel stochastic gradient descent algorithms on distributed or decentralized systems. The communication required for gradient aggregation and model synchronization becomes a major obstacle to efficient learning as the number of nodes and the model dimension grow. In this talk, I will introduce several ways to compress the transferred data and reduce the overall communication, thereby greatly mitigating these obstacles.
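To illustrate the kind of gradient compression the abstract refers to, here is a minimal sketch of top-k sparsification in NumPy: only the k largest-magnitude gradient entries are transmitted instead of the full dense vector. The function names and parameters are illustrative and are not taken from the talk itself, which may cover different compression schemes.

```python
import numpy as np

def topk_compress(grad, k):
    """Keep the k largest-magnitude entries of the gradient.

    Returns (indices, values): the sparse pair that would actually be
    transmitted, rather than the full dense vector.
    """
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]

def topk_decompress(idx, vals, dim):
    """Rebuild a dense gradient from the transmitted sparse pair."""
    out = np.zeros(dim)
    out[idx] = vals
    return out

# Toy example: a 1000-dimensional gradient compressed to 10 entries,
# a 100x reduction in the number of values communicated.
rng = np.random.default_rng(0)
g = rng.standard_normal(1000)
idx, vals = topk_compress(g, k=10)
g_hat = topk_decompress(idx, vals, g.size)
```

In practice such schemes are often paired with error feedback (accumulating the discarded residual into the next step's gradient) to preserve convergence.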