Reviews: Snap ML: A Hierarchical Framework for Machine Learning

Neural Information Processing Systems 

This paper describes Snap ML, a new distributed implementation for training generalized linear models on very large datasets. More precisely, the authors extend the popular CoCoA method [15] with a hierarchical version that is optimized for distributed computing environments. A key ingredient of this version is the reduction of the overhead caused by inter-node communication (see Section 2.1). In addition to a theoretical analysis of the new approach (Equation (2)), the authors provide various implementation details. These cover an efficient local GPU solver per compute node (Section 3.1), buffering techniques that reduce the overhead of memory transfers between host and device on each compute node (Section 3.2), an efficient exchange of information between compute nodes (Section 3.3), and the overall software architecture. The experimental evaluation indicates some benefits of the new hierarchical scheme (Figure 5).
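
To make the hierarchical idea concrete, the two-level aggregation pattern can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it substitutes plain gradient steps on a least-squares objective for CoCoA's local dual solver, runs everything in a single process, and all names (`local_update`, `hierarchical_train`, `shards`) are hypothetical. The point is only that workers inside a node are aggregated several times per inter-node exchange.

```python
import numpy as np

def local_update(w, X, y, lr):
    """One gradient step on the least-squares loss of a single data shard."""
    grad = X.T @ (X @ w - y) / len(y)
    return -lr * grad

def hierarchical_train(shards, dim, outer_rounds, inner_rounds, lr=0.1):
    """shards[node][worker] = (X, y). Returns the model and the number of
    simulated inter-node exchanges."""
    w = np.zeros(dim)
    inter_node_exchanges = 0
    for _ in range(outer_rounds):
        node_models = []
        for node in shards:
            w_node = w.copy()
            for _ in range(inner_rounds):
                # Level 1: aggregate worker updates inside one node
                # (assumed cheap, no network traffic).
                deltas = [local_update(w_node, X, y, lr) for X, y in node]
                w_node += np.mean(deltas, axis=0)
            node_models.append(w_node)
        # Level 2: aggregate across nodes (the assumed expensive step).
        w = np.mean(node_models, axis=0)
        inter_node_exchanges += 1
    return w, inter_node_exchanges

# Toy usage: 2 nodes with 2 workers each, noiseless synthetic data.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
shards = []
for _ in range(2):
    node = []
    for _ in range(2):
        X = rng.standard_normal((50, 3))
        node.append((X, X @ w_true))
    shards.append(node)

w, exchanges = hierarchical_train(shards, dim=3, outer_rounds=20, inner_rounds=10)
print(exchanges, np.round(w, 3))
```

In this toy setup, 200 local aggregation steps are performed per node, but only 20 exchanges cross the simulated node boundary. Whether such a trade-off pays off in practice depends on the actual network and solver costs, which is what the paper's analysis and experiments address.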