Statistical Learning
A Reliable Effective Terascale Linear Learning System
Agarwal, Alekh, Chapelle, Olivier, Dudik, Miroslav, Langford, John
We present a system and a set of techniques for learning linear predictors with convex losses on terascale datasets, with trillions of features (here, the number of features refers to the number of non-zero entries in the data matrix), billions of training examples, and millions of parameters, in an hour using a cluster of 1000 machines. Individually, none of the component techniques are new, but the careful synthesis required to obtain an efficient implementation is. The result is, to the best of our knowledge, the most scalable and efficient linear learning system reported in the literature (as of 2011, when our experiments were conducted). We describe and thoroughly evaluate the components of the system, showing the importance of the various design choices.
Model Predictive Control with Uncertainty in Human Driven Systems
Styler, Alexander David (Carnegie Mellon University) | Nourbakhsh, Illah Reza (Carnegie Mellon University)
Human-driven systems present a unique optimization challenge for robot control. Generally, operators of these systems behave rationally given environmental factors and desired goals. However, information available to subsystem controllers is often incomplete, and the operator becomes more difficult to model without this input information. In this work we present a data-driven, nonparametric model to capture both expectation and uncertainty of the upcoming duty for a subsystem controller. This model is a modified k-nearest neighbor regressor used to generate weighted samples from a distribution of upcoming duty, which are then exploited to generate an optimal control. We test the model on a simulated heterogeneous energy pack manager in an electric vehicle operated by a human driver. For this domain, upcoming load on the energy pack strongly affects the optimal use and charging strategy of the pack. Given incomplete information, there is a natural uncertainty in upcoming duty due to traffic, destination, signage, and other factors. We test against a dataset of real driving data gathered from volunteers, and compare the results to other models and the optimal upper bound.
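The idea of drawing weighted samples from nearest neighbors can be illustrated with a minimal sketch. This is a generic inverse-distance-weighted k-nearest-neighbor sampler, not the authors' modified regressor, whose details the abstract does not give. The function name `knn_weighted_samples`, the inverse-distance weighting, and the `(features, outcome)` history layout are all assumptions made for illustration.

```python
import math
import random


def knn_weighted_samples(history, query, k=3, n_samples=5, rng=None):
    """Sample plausible outcomes for `query` from its k nearest past observations.

    history: list of (features, outcome) pairs; features is a tuple of floats.
    Returns (samples, expectation), where samples are drawn in proportion to
    inverse-distance weights and expectation is the weighted mean outcome.
    All names and the weighting scheme are illustrative assumptions.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    rng = rng or random.Random(0)
    # Keep the k observations closest to the query in Euclidean distance.
    nearest = sorted(
        ((math.dist(features, query), outcome) for features, outcome in history),
        key=lambda pair: pair[0],
    )[:k]
    # Closer neighbors get larger weights; the small constant avoids division by zero.
    raw = [1.0 / (distance + 1e-9) for distance, _ in nearest]
    total = sum(raw)
    weights = [w / total for w in raw]
    outcomes = [outcome for _, outcome in nearest]
    samples = rng.choices(outcomes, weights=weights, k=n_samples)
    expectation = sum(w * o for w, o in zip(weights, outcomes))
    return samples, expectation
```

The spread of the returned samples gives a rough picture of uncertainty, while the weighted mean gives the expectation; a controller could then be optimized against the samples rather than against a single point forecast.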
Using Machine Learning to Improve Stochastic Optimization
Wolpert, David (Santa Fe Institute) | Rajnarayan, Dev (Sensor Platforms Inc.)
In many stochastic optimization algorithms there is a hyperparameter that controls how the next sampling distribution is determined from the current data set of samples of the objective function. This hyperparameter controls the exploration/exploitation trade-off of the next sample. Typically heuristic "rules of thumb" are used to set that hyperparameter, e.g., a pre-fixed annealing schedule. We show how machine learning provides more principled alternatives to (adaptively) set that hyperparameter, and demonstrate that these alternatives can substantially improve optimization performance.
Empirical Comparison of Multi-Label Classification Algorithms
Tawiah, Clifford (University of Central Arkansas) | Sheng, Victor (University of Central Arkansas)
Multi-label classification problems arise in many real-world applications. This paper empirically studies the performance of a variety of multi-label classification algorithms. Some of them are based on problem transformation; others are based on algorithm adaptation. Our experimental results show that the adaptive Multi-Label K-Nearest Neighbor performs the best, followed by Random k-Label Set, followed by Classifier Chain and Binary Relevance. Adaboost.MH performs the worst, followed by Pruned Problem Transformation. Our experimental results also provide evidence of correlations among the labels. These insights shed light on future research directions in multi-label classification.
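As a rough illustration of what problem transformation means, the sketch below rewrites a multi-label target in two standard ways: binary relevance (one independent binary target per label) and label powerset (each distinct label combination becomes one class). The function names are hypothetical, and the sketch only builds the transformed targets; it does not reproduce any of the algorithms evaluated in the paper.

```python
def binary_relevance(label_sets):
    """Turn a list of label sets into one 0/1 target vector per label."""
    labels = sorted(set().union(*label_sets))
    return {
        label: [1 if label in labels_of_example else 0 for labels_of_example in label_sets]
        for label in labels
    }


def label_powerset(label_sets):
    """Treat every distinct combination of labels as a single multiclass label."""
    return [tuple(sorted(labels_of_example)) for labels_of_example in label_sets]
```

Binary relevance ignores correlations between labels, whereas label powerset keeps them at the cost of a potentially very large number of classes.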
Fast Algorithm for Modularity-Based Graph Clustering
Shiokawa, Hiroaki (Nippon Telegraph and Telephone Corporation) | Fujiwara, Yasuhiro (Nippon Telegraph and Telephone Corporation) | Onizuka, Makoto (Nippon Telegraph and Telephone Corporation)
In AI and Web communities, modularity-based graph clustering algorithms are being applied to various applications. However, existing algorithms are not applied to large graphs because they have to scan all vertices/edges iteratively. The goal of this paper is to efficiently compute clusters with high modularity from extremely large graphs with more than a few billion edges. The heart of our solution is to compute clusters by incrementally pruning unnecessary vertices/edges and optimizing the order of vertex selections. Our experiments show that our proposal outperforms all other modularity-based algorithms in terms of computation time, and it finds clusters with high modularity.
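For readers unfamiliar with the objective, the following sketch computes the standard Newman modularity of a given clustering on an undirected, unweighted graph: for each cluster, the fraction of edges inside it minus the squared fraction of edge endpoints attached to it. It illustrates only the quantity being optimized, not the authors' pruning or vertex-ordering techniques; the function name and input layout are assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every vertex to its cluster label.
    """
    edge_count = 0
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in the same cluster
    for u, v in edges:
        edge_count += 1
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    if edge_count == 0:
        return 0.0
    cluster_degree = defaultdict(int)  # sum of vertex degrees per cluster
    for vertex, d in degree.items():
        cluster_degree[community[vertex]] += d
    return sum(
        internal[c] / edge_count - (cluster_degree[c] / (2 * edge_count)) ** 2
        for c in cluster_degree
    )
```

For two triangles joined by a single edge, with each triangle as its own cluster, this gives Q = 5/14 (about 0.357), while placing all vertices in one cluster gives Q = 0.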
Supervised Nonnegative Tensor Factorization with Maximum-Margin Constraint
Wu, Fei (Zhejiang University) | Tan, Xu (Zhejiang University) | Yang, Yi (University of Queensland) | Tao, Dacheng (University of Technology, Sydney) | Tang, Siliang (Zhejiang University) | Zhuang, Yueting (Zhejiang University)
Non-negative tensor factorization (NTF) has attracted great attention in the machine learning community. In this paper, we extend traditional non-negative tensor factorization into a supervised discriminative decomposition, referred to as Supervised Non-negative Tensor Factorization with Maximum-Margin Constraint (SNTFM2). SNTFM2 formulates the optimal discriminative factorization of non-negative tensorial data as a coupled least-squares optimization problem via a maximum-margin method. As a result, SNTFM2 not only faithfully approximates the tensorial data by additive combinations of the basis, but also obtains strong generalization power for discriminative analysis (in particular, for classification in this paper). The experimental results show the superiority of our proposed model over state-of-the-art techniques on both toy and real-world data sets.
Incremental Learning Framework for Indoor Scene Recognition
Kawewong, Aram (Chiang Mai University) | Pimup, Rapeeporn (Tokyo Institute of Technology) | Hasegawa, Osamu (Tokyo Institute of Technology)
This paper presents a novel framework for online incremental place recognition in an indoor environment. The framework addresses the scenario in which scene images are gradually obtained during long-term operation in a real-world indoor environment. Multiple users may interact with the classification system and confirm either current or past prediction results; the system then immediately updates itself to improve its classification. This framework is based on the proposed n-value self-organizing and incremental neural network (n-SOINN), which has been derived by modifying the original SOINN to be appropriate for use in scene recognition. The evaluation was performed on the standard MIT 67-category indoor scene dataset and shows that the proposed framework achieves the same accuracy as that of the state-of-the-art offline method, while its computation time is significantly shorter and fully incremental updates are supported. Additionally, a small extra set of training samples is incrementally given to the system to simulate the incremental learning situation. The result shows that the proposed framework can leverage such additional samples and achieve the state-of-the-art result.
Does One-Against-All or One-Against-One Improve the Performance of Multiclass Classifications?
Eichelberger, Robert Kyle (University of Central Arkansas) | Sheng, Victor S. (Department of Computer Science, University of Central Arkansas)
One-against-all and one-against-one are two popular methodologies for reducing multiclass classification problems into a set of binary classifications. In this paper, we are interested in the performance of both one-against-all and one-against-one for classification algorithms such as decision tree, naïve Bayes, support vector machine, and logistic regression. Since both one-against-all and one-against-one work like creating a classification committee, they are expected to improve the performance of classification algorithms. However, our experimental results surprisingly show that one-against-all worsens the performance of the algorithms on most datasets. One-against-one helps, but performs worse than bagging these algorithms with the same number of iterations. Thus, we conclude that neither one-against-all nor one-against-one should be used for algorithms that can perform multiclass classification directly. Bagging is a better approach for improving their performance.
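The two reductions can be sketched as follows, under the usual textbook definitions: one-against-all builds one binary problem per class (that class versus the rest), and one-against-one builds one binary problem per pair of classes, using only the examples from those two classes. The function names are hypothetical, and the sketch only constructs the binary targets; it does not train any of the classifiers studied in the paper.

```python
from itertools import combinations


def one_against_all(labels):
    """One binary target vector per class: 1 for that class, 0 for all others."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def one_against_one(labels):
    """One binary problem per class pair, restricted to examples of those two classes.

    Returns {(a, b): (example_indices, targets)} with target 1 for class a, 0 for b.
    """
    classes = sorted(set(labels))
    problems = {}
    for a, b in combinations(classes, 2):
        indices = [i for i, y in enumerate(labels) if y in (a, b)]
        problems[(a, b)] = (indices, [1 if labels[i] == a else 0 for i in indices])
    return problems
```

With K classes, one-against-all yields K binary problems over the full training set, while one-against-one yields K(K-1)/2 smaller problems.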
Accuracy and Timeliness in ML Based Activity Recognition
Ross, Robert (Dublin Institute of Technology) | Kelleher, John (Dublin Institute of Technology)
While recent Machine Learning (ML) based techniques for activity recognition show great promise, there remain a number of questions with respect to the relative merits of these techniques. To provide a better understanding of the relative strengths of contemporary Activity Recognition methods, in this paper we present a comparative analysis of Hidden Markov Model, Bayesian, and Support Vector Machine based human activity recognition models. The study builds on both pre-existing and newly annotated data which includes interleaved activities. Results demonstrate that while Support Vector Machine based techniques perform well for all data sets considered, simple representations of sensor histories regularly outperform more complex count based models.