Data as voters: instance selection using approval-based multi-winner voting

Sánchez-Fernández, Luis, Fisteus, Jesús A., López-Zaragoza, Rafael

arXiv.org Artificial Intelligence

Instance selection (or prototype selection) [García et al.(2015)] is a preprocessing task in machine learning (or data mining) that selects a subset of the data instances in the training set that a machine learning algorithm will use. There are two main reasons to perform this task: efficiency and cleaning. Reducing the size of the training set reduces the computational cost of running the machine learning algorithm, especially for instance-based classifiers such as KNN (see the Preliminaries section for a description of KNN classifiers). In addition, we may want to remove noisy instances from the training set: instances that result from errors or other causes can induce mistakes in the machine learning algorithm.
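To make the two motivations concrete, the following is a minimal sketch of a brute-force KNN classifier on a toy data set. The function name `knn_predict`, the toy points, and the hand-picked subset `selected` are assumptions introduced only for illustration; in particular, the subset is chosen by hand here and is not the output of any instance selection method.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training instances.

    `train` is a list of (point, label) pairs. Each prediction computes one
    distance per training instance, so the cost grows with len(train).
    """
    # Rank all training instances by Euclidean distance to the query.
    neighbours = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    # Majority vote over the labels of the k closest instances.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy training set: two clusters plus one mislabelled (noisy) instance.
full_train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
    ((0.15, 0.15), "b"),  # noisy instance lying inside the "a" cluster
]

# Hypothetical hand-picked subset: one representative per cluster, noise dropped.
selected = [((0.1, 0.2), "a"), ((1.0, 1.0), "b")]

query = (0.14, 0.14)
label_full = knn_predict(full_train, query, k=1)
label_selected = knn_predict(selected, query, k=1)
```

With `k=1`, the query point is closest to the noisy instance in `full_train` and is therefore labelled "b", whereas with `selected` it is labelled "a" while computing two distances instead of seven.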