Practical Coreset Constructions for Machine Learning

Bachem, Olivier, Lucic, Mario, Krause, Andreas

arXiv.org Machine Learning 

In recent years, data sets of unprecedented size have emerged across scientific disciplines. Their sheer volume presents new challenges, as gathering, storing, and analyzing them becomes expensive. With millions or even billions of data points, existing proven algorithms "suddenly" become computationally infeasible, and data sets may no longer fit on a single machine but must instead be stored on clusters of machines. As a consequence, new algorithms are required to scale to this massive data setting. While one could focus on individual machine learning problems and come up with endless new algorithms, we focus on a more general approach: we investigate coresets -- succinct, small summaries of large data sets -- so that solutions found on the summary are provably competitive with solutions found on the full data set.
