Parallelizing Instance-Based Data Classifiers
Rahal, Imad (College of Saint Benedict and Saint John's University) | Furst, Emily (University of Washington) | Haraty, Ramzi (Lebanese American University)
In the age of Big Data, producing results quickly over vast volumes of data has become a vital requirement for data mining and machine learning applications, to the point that traditional serial algorithms can no longer keep up. This paper applies different forms of parallelization to popular instance-based classifiers (namely, a special form of naive Bayes and k-nearest neighbors) in order to compare their performance and draw broad conclusions applicable to instance-based classifiers. Overall, our experimental results strongly indicate that, in most cases, parallelism over test instances yields greater speedup than the other forms of parallelism.
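As a rough illustration of what parallelism over test instances can look like for k-nearest neighbors, here is a minimal Python sketch. It is not the authors' implementation: the function names `knn_predict` and `classify_parallel`, the toy data, and the choice of Euclidean distance with majority voting are all assumptions made for the example. The idea shown is only that each test instance can be classified independently of the others, so the test set can be split across workers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import partial
import math


def knn_predict(train, k, query):
    """Classify one query by majority vote among its k nearest training points.

    Hypothetical helper: train is a list of (features, label) pairs, and
    Euclidean distance is an assumption, not something the abstract specifies.
    """
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classify_parallel(train, queries, k=3, workers=4):
    """Parallelism over test instances: every query is an independent task."""
    predict = partial(knn_predict, train, k)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so predictions line up with queries.
        return list(pool.map(predict, queries))


# Toy data (made up for illustration): two well-separated clusters.
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.8, 5.2), "b")]
queries = [(0.05, 0.05), (5.0, 5.1), (0.3, 0.3)]
predictions = classify_parallel(train, queries, k=3)
print(predictions)
```

A thread pool is used here only to keep the sketch short and portable. In CPython, pure-Python distance computations are limited by the global interpreter lock, so an actual speedup would more likely come from swapping in `ProcessPoolExecutor`, which offers the same `map` interface, or from a native implementation.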
May-8-2016