Gradient Sparsification for Communication-Efficient Distributed Optimization

Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead of exchanging information, such as stochastic gradients, among workers. In this paper, to reduce the communication cost, we propose a convex optimization formulation to minimize the coding length of stochastic gradients.
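To make the idea of sparsifying a stochastic gradient before communication concrete, here is a minimal sketch. It is not the convex formulation proposed in the paper. It assumes a simple heuristic in which each coordinate is kept with probability proportional to its magnitude (capped at 1) and the surviving coordinates are rescaled so the result stays unbiased. The names `sparsify`, `densify`, and `keep_fraction` are hypothetical and chosen only for illustration.

```python
import random


def sparsify(grad, keep_fraction, rng):
    """Randomly drop coordinates of `grad` and rescale the survivors.

    Assumption (illustrative heuristic, not the paper's optimal solution):
    coordinate i is kept with probability p_i proportional to |g_i|,
    capped at 1. A kept coordinate is sent as g_i / p_i, so the expected
    value of the sparse vector equals the original gradient.
    """
    total = sum(abs(g) for g in grad)
    if total == 0.0:
        return {}  # nothing to send for an all-zero gradient

    # Target for the expected number of transmitted coordinates.
    budget = keep_fraction * len(grad)

    sparse = {}
    for i, g in enumerate(grad):
        p = min(1.0, budget * abs(g) / total)
        if p > 0.0 and rng.random() < p:
            sparse[i] = g / p  # rescaling keeps the estimate unbiased
    return sparse


def densify(sparse, dim):
    """Rebuild a dense vector from the {index: value} message."""
    dense = [0.0] * dim
    for i, v in sparse.items():
        dense[i] = v
    return dense


if __name__ == "__main__":
    rng = random.Random(0)
    grad = [4.0, -0.1, 0.0, 2.0, 0.05, -1.0]
    message = sparsify(grad, keep_fraction=0.5, rng=rng)
    print("indices sent:", sorted(message))
    print("reconstructed:", densify(message, len(grad)))
```

A worker would transmit only the `{index: value}` pairs, and the receiver would call `densify` before averaging. Because the cap at 1 can only lower a probability, the expected number of transmitted coordinates is at most `keep_fraction * len(grad)`. The price is extra variance in the gradient estimate, which is the trade-off a principled choice of the probabilities would aim to balance.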