Liquid Democracy for Low-Cost Ensemble Pruning

Ben Armstrong, Kate Larson

arXiv.org Artificial Intelligence 

In the past several years, the training of machine learning systems has consumed increasingly large amounts of data and compute. In the search for ever-improving performance, models have grown larger, more data has been collected, and the cost of machine learning has grown while performance improves only incrementally [16]. This has negative repercussions: it affects privacy by incentivizing mass data collection, increases development time due to the time taken to train models, and carries significant environmental costs. It also limits access to the best-performing models to those groups with enough resources to store massive amounts of data and train large models. Recent advances have begun to consider learning from few examples in settings where data is hard to generate or resources are limited [21]; however, this field is still in its early stages. We propose adapting an existing paradigm of opinion aggregation to address the problem of compute requirements during classifier ensemble training. Ensemble learning for classification has long studied the problem of combining class predictions from groups of classifiers into a single output prediction. Condorcet's Jury Theorem, a well-known result from social choice theory (predating ML research by two centuries), states that if voters attempt to guess the correct outcome of some ground-truth decision, then the majority vote is increasingly likely to be correct as voters are added, provided all voters are independent and each has accuracy above 0.5.
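The jury-theorem claim above can be checked directly: for an odd number n of independent voters, each correct with probability p, the majority is correct exactly when more than n/2 voters are correct, a binomial tail probability. The sketch below computes this exactly (function name and parameters are illustrative, not from the paper).

```python
import math

def majority_accuracy(n, p):
    """Exact probability that a majority of n independent voters,
    each correct with probability p, reaches the correct outcome.
    n is assumed odd so ties cannot occur."""
    k = n // 2 + 1  # smallest winning majority
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# With p = 0.6, accuracy starts at 0.6 for a single voter and
# climbs toward 1 as independent voters are added.
for n in (1, 11, 101):
    print(n, round(majority_accuracy(n, 0.6), 3))
```

With p below 0.5 the same formula shows the opposite effect: the majority becomes increasingly likely to be wrong, which is why the theorem's accuracy condition matters for ensembles.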