Parallelizing Instance-Based Data Classifiers

AAAI Conferences

In the age of Big Data, producing results quickly over vast volumes of data has become a vital requirement for data mining and machine learning applications, to the point that traditional serial algorithms can no longer keep up. This paper applies several forms of parallelization to popular instance-based classifiers, namely a special form of naive Bayes and k-nearest neighbors, in order to compare their performance and draw broad conclusions applicable to instance-based classifiers in general. Overall, our experimental results strongly indicate that parallelism over test instances provides the greatest speedup in most cases compared to other forms of parallelism.
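
To make "parallelism over test instances" concrete, the following is a minimal illustrative sketch for k-nearest neighbors and not the implementation evaluated in the paper. The function names, the toy data, and the use of Python's `concurrent.futures` are all assumptions made for illustration. Each test instance is classified independently against the full training set, so the test instances can be handed to a pool of workers without any coordination between them.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import partial
import math


def knn_predict_one(train, k, query):
    """Classify one query point by majority vote among its k nearest training points."""
    # train is a list of (feature_vector, label) pairs (hypothetical layout).
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def knn_predict_parallel(train, queries, k=3, workers=4):
    """Classify every query; each test instance is an independent task."""
    classify = partial(knn_predict_one, train, k)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map returns predictions in the same order as the queries.
        return list(pool.map(classify, queries))


# Toy data, invented purely for this example.
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b")]
queries = [(0.05, 0.05), (1.0, 0.95)]
predictions = knn_predict_parallel(train, queries, k=3)
```

A thread pool is used here only to keep the sketch self-contained. In CPython, pure-Python distance computations are limited by the global interpreter lock, so an actual speedup would likely require a process pool or a native implementation.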