Distribution-Aware Online Classifiers
Nguyen, Tam T. (Nanyang Technological University) | Chang, Kuiyu (Nanyang Technological University) | Hui, Cheung Siu (Nanyang Technological University)
We propose a family of Passive-Aggressive Mahalanobis (PAM) algorithms: incremental (online) binary classifiers that take the distribution of the data into account. PAM generalizes the Passive-Aggressive (PA) algorithms to handle data distributions that can be represented by a covariance matrix. We derive the update equations for PAM and compute theoretical error loss bounds. We benchmarked PAM against the original PA-I, PA-II, and Confidence Weighted (CW) learning. Although PAM somewhat resembles CW in its update equations, PAM, like PA, minimizes differences in the weights, while CW minimizes differences in weight distributions. Results on 8 classification datasets, which include a real-life micro-blog sentiment classification task, show that PAM consistently outperformed its competitors, most notably CW. This suggests that a simple approach like PAM is more practical in real-life classification tasks than more elegant and sophisticated approaches like CW.
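For readers unfamiliar with the baselines, the following is a minimal sketch of one update step of the standard PA-I and PA-II rules that PAM is benchmarked against. It is not the PAM update, which this abstract does not state. The function name `pa_update`, its parameters, and the default aggressiveness `C=1.0` are illustrative assumptions rather than anything taken from the paper.

```python
def pa_update(w, x, y, C=1.0, variant="PA-I"):
    """One Passive-Aggressive step for a binary label y in {-1, +1}.

    Hypothetical helper for illustration: w and x are equal-length
    lists of floats, and C is the aggressiveness parameter.
    """
    # Hinge loss on the current example.
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)
    sq_norm = sum(xi * xi for xi in x)

    # Passive case: the margin is already satisfied (or x is all zeros),
    # so the weights are left unchanged.
    if loss == 0.0 or sq_norm == 0.0:
        return list(w)

    # Aggressive case: take the smallest step that repairs the margin,
    # softened by C so a single noisy example cannot dominate.
    if variant == "PA-I":
        tau = min(C, loss / sq_norm)
    elif variant == "PA-II":
        tau = loss / (sq_norm + 1.0 / (2.0 * C))
    else:
        raise ValueError("variant must be 'PA-I' or 'PA-II'")

    return [wi + tau * y * xi for wi, xi in zip(w, x)]
```

Both variants move the weight vector as little as possible, measured in Euclidean distance, which is the sense in which PA minimizes differences in the weights. According to the abstract, PAM generalizes this by accounting for a data distribution represented by a covariance matrix.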