 sparse matrix representation


Sparse Matrix Representation in Python - KDnuggets

#artificialintelligence

Most machine learning practitioners are accustomed to putting their datasets into matrix form before feeding the data into a machine learning algorithm. Matrices are an ideal form for this, usually with rows representing dataset instances and columns representing features. A sparse matrix is a matrix in which most elements are zero. This is in contrast to a dense matrix, the differentiating characteristic of which you can likely figure out at this point without any help. Often our data is dense, with feature columns filled in for every instance we have.
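To make the idea of a sparse representation concrete, here is a minimal sketch in plain Python of the compressed sparse row (CSR) layout, which stores only the non-zero values together with their column positions. The helper names `dense_to_csr` and `csr_row` are hypothetical and written for illustration; they are not taken from the article or from any library.

```python
def dense_to_csr(dense):
    """Convert a list-of-lists matrix into three CSR arrays.

    data    -- the non-zero values, row by row
    indices -- the column index of each value in `data`
    indptr  -- indptr[i]:indptr[i + 1] is the slice of `data` belonging to row i
    """
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(col)
        # Record where this row's entries end (and the next row's begin).
        indptr.append(len(data))
    return data, indices, indptr


def csr_row(data, indices, indptr, i, n_cols):
    """Rebuild dense row i from the CSR arrays (hypothetical helper)."""
    row = [0] * n_cols
    for k in range(indptr[i], indptr[i + 1]):
        row[indices[k]] = data[k]
    return row


# Rows are dataset instances, columns are features; most entries are zero.
dense = [
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 2],
]
data, indices, indptr = dense_to_csr(dense)
print(data, indices, indptr)
```

Only three of the twelve entries are stored here, at the cost of keeping an index alongside each one. In practice a library such as SciPy provides tested sparse formats, so a hand-rolled version like this is only useful for seeing what the layout contains.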


When Dense Matrix Representations Beat Sparse

#artificialintelligence

In our world filled with unintended consequences, it turns out that adopting sparse representations to save memory on GPUs, while accepting the performance penalty they impose on matrix operations, can end up costing both performance and memory. As reported in a paper at ISC19, researchers[i] recently revisited a use of sparse matrix representations that was originally motivated by GPU memory constraints, and switched to dense matrices in order to benefit from the larger memory capacities and scale-out capabilities of CPUs. The result was not only superior performance and scaling on CPUs but also (perhaps surprisingly) a smaller memory footprint, because the memory the sparse representation was meant to save was offset by the extra memory consumed through algorithmic inefficiencies. The researchers demonstrated the positive effects of their work in Horovod, an open source distributed deep learning framework for TensorFlow created by Uber Engineering. They also demonstrated that the approach scales out well, using supercomputers with large numbers of CPUs.
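One simple way a sparse format can end up larger than a dense one is index overhead. The sketch below is a back-of-the-envelope estimate only, assuming a CSR layout with 4-byte values and 4-byte indices; the function names are hypothetical, and this is not the mechanism analyzed in the ISC19 paper, which concerned algorithmic inefficiencies in a distributed training setting.

```python
def dense_bytes(n_rows, n_cols, value_size=4):
    """Bytes needed to store every element of a dense matrix."""
    return n_rows * n_cols * value_size


def csr_bytes(n_rows, nnz, value_size=4, index_size=4):
    """Bytes needed for CSR: one value and one column index per non-zero,
    plus a row-pointer array with n_rows + 1 entries."""
    return nnz * (value_size + index_size) + (n_rows + 1) * index_size


n_rows, n_cols = 1000, 1000
dense_cost = dense_bytes(n_rows, n_cols)

# Compare a 1% dense matrix with a 50% dense matrix of the same shape.
for nnz in (10_000, 500_000):
    sparse_cost = csr_bytes(n_rows, nnz)
    winner = "sparse" if sparse_cost < dense_cost else "dense"
    print(f"nnz={nnz}: dense={dense_cost} B, csr={sparse_cost} B -> {winner} is smaller")
```

Under these assumptions CSR is far smaller when 1% of the entries are non-zero, but it already exceeds the dense footprint once about half of them are, because each stored value carries its own index.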