Accuracy
Neurogenesis-Inspired Dictionary Learning: Online Model Adaption in a Changing World
Garg, Sahil, Rish, Irina, Cecchi, Guillermo, Lozano, Aurelie
In this paper, we focus on online representation learning in non-stationary environments which may require continuous adaptation of model architecture. We propose a novel online dictionary-learning (sparse-coding) framework which incorporates the addition and deletion of hidden units (dictionary elements), and is inspired by the adult neurogenesis phenomenon in the dentate gyrus of the hippocampus, known to be associated with improved cognitive function and adaptation to new environments. In the online learning setting, where new input instances arrive sequentially in batches, neuronal birth is implemented by adding new units with random initial weights (random dictionary elements); the number of new units is determined by the current performance (representation error) of the dictionary, with higher error causing an increase in the birth rate. Neuronal death is implemented by imposing l1/l2-regularization (group sparsity) on the dictionary within the block-coordinate descent optimization at each iteration of our online alternating minimization scheme, which iterates between the code and dictionary updates. Finally, hidden unit connectivity adaptation is facilitated by introducing sparsity in dictionary elements. Our empirical evaluation on several real-life datasets (images and language) as well as on synthetic data demonstrates that the proposed approach can considerably outperform the state-of-the-art fixed-size (non-adaptive) online sparse coding of Mairal et al. (2009) in the presence of non-stationary data. Moreover, we identify certain properties of the data (e.g., sparse inputs with nearly non-overlapping supports) and of the model (e.g., dictionary sparsity) associated with such improvements.
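The neuronal-death step above can be illustrated with the proximal operator of the l1/l2 (group-sparsity) penalty, which shrinks each dictionary element and zeroes out those whose norm falls below a threshold. The following is only a minimal sketch under stated assumptions, not the authors' implementation: each dictionary element is treated as one group, and the names `group_soft_threshold`, `prune_dead` and the strength `lam` are hypothetical.

```python
import math

def group_soft_threshold(dictionary, lam):
    """Apply the l1/l2 proximal operator to every dictionary element.

    `dictionary` is a list of elements (each a list of floats); `lam` is a
    hypothetical regularization strength chosen only for illustration.
    """
    result = []
    for element in dictionary:
        norm = math.sqrt(sum(w * w for w in element))
        # Elements whose norm is at most lam are shrunk to zero ("die").
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        result.append([scale * w for w in element])
    return result

def prune_dead(dictionary):
    """Drop elements that were shrunk to all zeros."""
    return [e for e in dictionary if any(w != 0.0 for w in e)]

shrunk = group_soft_threshold([[3.0, 4.0], [0.1, 0.1]], lam=1.0)
alive = prune_dead(shrunk)
```

In this toy run the first element (norm 5) is scaled by 0.8 and survives, while the second (norm about 0.14) is removed.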
Probing for sparse and fast variable selection with model-based boosting
Thomas, Janek, Hepp, Tobias, Mayr, Andreas, Bischl, Bernd
We present a new variable selection method based on model-based gradient boosting and randomly permuted variables. Model-based boosting is a tool to fit a statistical model while performing variable selection at the same time. A drawback of the fitting lies in the need for multiple model fits on slightly altered data (e.g. cross-validation or bootstrap) to find the optimal number of boosting iterations and prevent overfitting. In our proposed approach, we augment the data set with randomly permuted versions of the true variables, so-called shadow variables, and stop the step-wise fitting as soon as such a variable would be added to the model. This allows variable selection in a single fit of the model without requiring further parameter tuning. We show that our probing approach can compete with state-of-the-art selection methods like stability selection in a high-dimensional classification benchmark and apply it to gene expression data for the estimation of riboflavin production of Bacillus subtilis.
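The stopping rule described above can be sketched with componentwise L2 boosting, where each step updates the single column that most reduces the squared error. This is a rough illustration rather than the authors' code: it assumes centered columns, linear base learners without intercept, and a hypothetical function name `probing_boost` with illustrative defaults for `step` and `max_iter`.

```python
import random

def probing_boost(X, y, step=0.1, max_iter=1000, seed=0):
    """Componentwise L2 boosting that stops when a shadow variable wins.

    X is a list of columns (lists of floats, assumed centered) and y a list
    of floats. Returns the set of indices of the selected original columns.
    """
    rng = random.Random(seed)
    p = len(X)
    # Shadow variables: independently permuted copies of every column.
    shadows = [rng.sample(col, len(col)) for col in X]
    columns = list(X) + shadows
    mean_y = sum(y) / len(y)
    residual = [v - mean_y for v in y]
    selected = set()
    for _ in range(max_iter):
        best_j, best_beta, best_gain = None, 0.0, 0.0
        for j, col in enumerate(columns):
            sxx = sum(c * c for c in col)
            if sxx == 0:
                continue
            sxr = sum(c * r for c, r in zip(col, residual))
            gain = sxr * sxr / sxx  # reduction in squared error
            if gain > best_gain:
                best_j, best_beta, best_gain = j, sxr / sxx, gain
        # Stop if nothing helps or a shadow variable would be added.
        if best_j is None or best_j >= p:
            break
        selected.add(best_j)
        residual = [r - step * best_beta * c
                    for r, c in zip(residual, columns[best_j])]
    return selected

x0 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x1 = [1.0, -2.0, 0.0, 2.0, -1.0]   # orthogonal to x0
y = [2.0 * v for v in x0]          # depends on x0 only
chosen = probing_boost([x0, x1], y, max_iter=50)
```

Ties between an original column and a shadow are resolved in favor of the original here because originals are scanned first; that choice is an assumption of this sketch.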
Case-Study: Better HAAR feature-based Eye Detector using OpenCV » CV-Tricks.com
OpenCV object detectors built using Haar feature-based cascade classifiers are at least a decade old. The OpenCV framework provides default pre-built Haar- and LBP-based cascade classifiers for face and eye detection, which are good-quality detectors. However, I had never measured the accuracy of these face and eye detectors. I recently discovered that the pre-built Haar/LBP cascades have relatively high false positive rates, which might make them unsuitable for many use cases. It's possible to build an eye detector with very high accuracy and low false positive rates for many cases with OpenCV.
Cost-Sensitive Feature Selection via F-Measure Optimization Reduction
Liu, Meng (Peking University) | Xu, Chang (University of Technology, Sydney) | Luo, Yong (Nanyang Technological University) | Xu, Chao (Peking University) | Wen, Yonggang (Nanyang Technological University) | Tao, Dacheng (University of Technology, Sydney)
Feature selection aims to select a small subset of high-dimensional features that can lead to better learning performance, lower computational complexity, and better model readability. The class imbalance problem has been neglected by traditional feature selection methods, so the selected features are biased towards the majority classes. Because F-measure is superior to accuracy for imbalanced data, we propose to use F-measure as the performance measure for feature selection algorithms. Since F-measure is a pseudo-linear function, its optimization can be achieved by minimizing the total costs. In this paper, we present a novel cost-sensitive feature selection (CSFS) method which optimizes F-measure instead of accuracy to take the class imbalance issue into account. The features are selected according to the optimal F-measure classifier after solving a series of cost-sensitive feature selection sub-problems. The features selected by our method fully represent the characteristics of not only majority classes, but also minority classes. Extensive experiments on synthetic, multi-class and multi-label datasets validate the efficiency and significance of our feature selection method.
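A small worked example shows why accuracy can mislead on imbalanced data while F-measure does not. The sketch below is illustrative only: the helper name `accuracy_and_f1` and the toy labels are assumptions, and it computes the standard binary F1 score rather than anything specific to the CSFS method.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels where 1 is the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 is conventionally set to 0 when there are no positives at all.
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1

# Toy data: 5 positives among 100 instances.
y_true = [1] * 5 + [0] * 95

# Classifier A ignores the minority class entirely.
always_negative = [0] * 100
acc_a, f1_a = accuracy_and_f1(y_true, always_negative)

# Classifier B finds 4 of 5 positives at the price of 6 false alarms.
finds_minority = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89
acc_b, f1_b = accuracy_and_f1(y_true, finds_minority)
```

Classifier A has the higher accuracy (0.95 versus 0.93) yet an F1 of zero, whereas classifier B reaches an F1 of 8/15.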
Adverse Drug Reaction Prediction with Symbolic Latent Dirichlet Allocation
Xiao, Cao (IBM T.J.Watson Research Center) | Zhang, Ping (IBM T.J.Watson Research Center) | Chaovalitwongse, W. Art (University of Arkansas) | Hu, Jianying (IBM T.J.Watson Research Center) | Wang, Fei (Cornell University)
Adverse drug reactions (ADRs) are a major burden for patients and the healthcare industry. They usually cause preventable hospitalizations and deaths and are associated with huge costs. Traditional preclinical in vitro safety profiling and clinical safety trials are restricted by small scale, long duration, huge financial costs and limited statistical significance. The availability of large amounts of drug and ADR data potentially allows ADR predictions during the drugs’ early preclinical stage with data analytics methods to inform more targeted clinical safety tests. Despite their initial success, existing methods have trade-offs among interpretability, predictive power and efficiency. This motivates us to explore methods that could have all these strengths and provide practical solutions for real-world ADR predictions. We cast the ADR-drug relation structure into a three-layer hierarchical Bayesian model. We interpret each ADR as a symbolic word and apply latent Dirichlet allocation (LDA) to learn topics that may represent certain biochemical mechanisms that relate ADRs to drug structures. Based on LDA, we designed an equivalent regularization term to incorporate the hierarchical ADR domain knowledge. Finally, we developed a mixed input model leveraging a fast collapsed Gibbs sampling method in which the complexity of each iteration is proportional only to the number of positive ADRs. Experiments on real-world data show that our models achieved higher prediction accuracy and shorter running time than the state-of-the-art alternatives.
Bootstrapping Distantly Supervised IE Using Joint Learning and Small Well-Structured Corpora
Bing, Lidong (Tencent Inc.) | Dhingra, Bhuwan (Carnegie Mellon University) | Mazaitis, Kathryn (Carnegie Mellon University) | Park, Jong Hyuk (Carnegie Mellon University) | Cohen, William W. (Carnegie Mellon University)
We propose a framework to improve the performance of distantly-supervised relation extraction by jointly learning to solve two related tasks: concept-instance extraction and relation extraction. We further extend this framework to make novel use of document structure: in some small, well-structured corpora, sections can be identified that correspond to relation arguments, and distantly-labeled examples from such sections tend to have good precision. Using these as seeds, we extract additional relation examples by applying label propagation on a graph composed of noisy examples extracted from a large unstructured testing corpus. Combined with the soft constraint that concept examples should have the same type as the second argument of the relation, we get significant improvements over several state-of-the-art approaches to distantly-supervised relation extraction, and reasonable extraction performance even with a very small set of distant labels.
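The label propagation step mentioned above can be pictured with a generic iterative scheme in which seed nodes keep their labels and every other node repeatedly takes the weighted average of its neighbors. This is a simplified sketch, not the propagation method used in the paper; the function name `propagate_labels`, the edge-list format and the iteration count are assumptions made for illustration.

```python
def propagate_labels(edges, seeds, n_nodes, n_iter=50):
    """Spread seed scores over an undirected weighted graph.

    `edges` is a list of (i, j, weight) triples and `seeds` maps node index
    to a fixed score in [0, 1]. Returns one score per node.
    """
    neighbors = [[] for _ in range(n_nodes)]
    for i, j, w in edges:
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))
    scores = [seeds.get(i, 0.0) for i in range(n_nodes)]
    for _ in range(n_iter):
        updated = []
        for i in range(n_nodes):
            total = sum(w for _, w in neighbors[i])
            if i in seeds:
                updated.append(seeds[i])      # seeds stay clamped
            elif total == 0:
                updated.append(scores[i])     # isolated node: unchanged
            else:
                updated.append(
                    sum(w * scores[j] for j, w in neighbors[i]) / total)
        scores = updated
    return scores

# Chain 0 - 1 - 2 with a positive seed at node 0 and a negative seed at node 2.
chain_scores = propagate_labels([(0, 1, 1.0), (1, 2, 1.0)], {0: 1.0, 2: 0.0}, 3)
```

The unlabeled middle node ends up halfway between the two seeds, which is the behavior one would expect from averaging over equally weighted neighbors.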
Multimodal Fusion of EEG and Musical Features in Music-Emotion Recognition
Thammasan, Nattapong (Osaka University) | Fukui, Ken-ichi (Osaka University) | Numao, Masayuki (Osaka University)
Multimodality has recently been exploited to overcome the challenges of emotion recognition. In this paper, we present a study of decision-level fusion of electroencephalogram (EEG) features and musical features extracted from musical stimuli for recognizing the time-varying binary classes of arousal and valence. Our empirical results demonstrate that the EEG modality suffered from the instability of EEG signals, yet fusing it with the music modality could alleviate the issue and enhance the performance of emotion recognition.
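Decision-level fusion means that each modality is classified separately and only the outputs are combined. One common way to do this, shown here purely as an assumed example and not necessarily the rule used in the study, is a weighted average of the per-modality class probabilities; `fuse_decisions` and `weight_eeg` are hypothetical names.

```python
def fuse_decisions(p_eeg, p_music, weight_eeg=0.5):
    """Combine two per-modality probabilities of the positive class.

    Returns the fused probability and the resulting binary label.
    """
    if not 0.0 <= weight_eeg <= 1.0:
        raise ValueError("weight_eeg must lie in [0, 1]")
    fused = weight_eeg * p_eeg + (1.0 - weight_eeg) * p_music
    return fused, int(fused >= 0.5)

# The EEG classifier leans towards "low arousal", the music classifier
# towards "high arousal"; with equal weights the fused decision is high.
fused_prob, fused_label = fuse_decisions(0.4, 0.8)
```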
Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation
Hanna, Josiah P. (The University of Texas at Austin) | Stone, Peter (The University of Texas at Austin) | Niekum, Scott (The University of Texas at Austin)
In many reinforcement learning applications, it is desirable to determine confidence interval lower bounds on the performance of any given policy without executing said policy. In this context, we propose two bootstrapping off-policy evaluation methods which use learned MDP transition models in order to estimate lower confidence bounds on policy performance with limited data. We empirically evaluate the proposed methods on standard policy evaluation tasks.
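The general idea of a bootstrap lower confidence bound can be sketched with the plain percentile bootstrap over a set of return estimates. Note the simplification: the methods in the paper resample data and fit MDP transition models, whereas this sketch assumes the per-trajectory return estimates are already given. The name `bootstrap_lower_bound` and the defaults for `delta` and `n_boot` are illustrative assumptions.

```python
import random

def bootstrap_lower_bound(returns, delta=0.05, n_boot=2000, seed=0):
    """Percentile-bootstrap (1 - delta) lower bound on the mean return."""
    rng = random.Random(seed)
    n = len(returns)
    means = []
    for _ in range(n_boot):
        # Resample the return estimates with replacement.
        sample = [returns[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    # The delta-quantile of the bootstrap means serves as the lower bound.
    return means[int(delta * n_boot)]

estimated_returns = [0.2, 0.5, 0.9, 0.4, 0.7, 0.3, 0.6, 0.8]
lower = bootstrap_lower_bound(estimated_returns)
```

With few samples the percentile bootstrap is only approximately valid, which is one reason to treat the resulting bound as a rough indication.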
Learning to Predict Intent from Gaze During Robotic Hand-Eye Coordination
Razin, Yosef (Georgia Institute of Technology) | Feigh, Karen (Georgia Institute of Technology)
Effective human-aware robots should anticipate their user’s intentions. During hand-eye coordination tasks, gaze often precedes hand motion and can serve as a powerful predictor for intent. However, cooperative tasks where a semi-autonomous robot serves as an extension of the human hand have rarely been studied in the context of hand-eye coordination. We hypothesize that accounting for anticipatory eye movements in addition to the movements of the robot will improve intent estimation. This research compares the application of various machine learning methods to intent prediction from gaze tracking data during robotic hand-eye coordination tasks. We found that with proper feature selection, accuracies exceeding 94% and AUC greater than 91% are achievable with several classification algorithms but that anticipatory gaze data did not improve intent prediction.
Species Distribution Modeling of Citizen Science Data as a Classification Problem with Class-Conditional Noise
Hutchinson, Rebecca A. (Oregon State University) | He, Liqiang (Oregon State University) | Emerson, Sarah C. (Oregon State University)
Species distribution models relate the geographic occurrence pattern of a species to environmental features and are used for a variety of scientific and management purposes. One source of data for building species distribution models is citizen science, in which volunteers report locations where they observed (or did not observe) sets of species. Since volunteers have variable levels of expertise, citizen science data may contain both false positives and false negatives in the location labels (present vs. absent) they provide, but many common modeling approaches for this task do not address these sources of noise explicitly. In this paper, we propose to formulate the species distribution modeling task as a classification problem with class-conditional noise. Our approach builds on other applications of class-conditional noise models to crowdsourced data, but we focus on leveraging features of the noise processes that are distinct from the class features. We describe the conditions under which the parameters of our proposed model are identifiable and apply it to simulated data and data from the eBird citizen science project.
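Under class-conditional noise, the probability of observing a "present" report mixes true detections with false positives. The sketch below writes out that standard relationship for a single site; it is not the model from the paper, and the function name and the idea of passing the two noise rates as plain numbers (rather than predicting them from noise-specific features) are simplifying assumptions.

```python
def observed_present_prob(p_present, false_pos_rate, false_neg_rate):
    """Probability that a volunteer reports 'present' at a site.

    p_present      : true probability that the species occupies the site
    false_pos_rate : P(report present | truly absent)
    false_neg_rate : P(report absent  | truly present)
    """
    for value in (p_present, false_pos_rate, false_neg_rate):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all arguments must be probabilities")
    detected = p_present * (1.0 - false_neg_rate)
    misreported = (1.0 - p_present) * false_pos_rate
    return detected + misreported

# A site occupied with probability 0.5, 10% false positives, 30% false negatives.
p_report = observed_present_prob(0.5, 0.1, 0.3)
```

With both noise rates set to zero the observed probability reduces to the true occupancy probability, which is the noise-free special case.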