Goto

Collaborating Authors

 Performance Analysis


Multiple Operator-valued Kernel Learning

arXiv.org Machine Learning

Positive definite operator-valued kernels generalize the well-known notion of reproducing kernels, and are naturally adapted to multi-output learning situations. This paper addresses the problem of learning a finite linear combination of infinite-dimensional operator-valued kernels which are suitable for extending functional data analysis methods to nonlinear contexts. We study this problem in the case of kernel ridge regression for functional responses with an lr-norm constraint on the combination coefficients. The resulting optimization problem is more involved than those of multiple scalar-valued kernel learning since operator-valued kernels pose more technical and theoretical issues. We propose a multiple operator-valued kernel learning algorithm based on solving a system of linear operator equations by using a block coordinatedescent procedure. We experimentally validate our approach on a functional regression task in the context of finger movement prediction in brain-computer interfaces.


Small Sample Inference for Generalization Error in Classification Using the CUD Bound

arXiv.org Machine Learning

Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization error by resampling and then assume the resampled estimator follows a known distribution to form a confidence set [Kohavi 1995, Martin 1996,Yang 2006]. Alternatively, one might bootstrap the resampled estimator of the generalization error to form a confidence set. Unfortunately, these methods do not reliably provide sets of the desired confidence. The poor performance appears to be due to the lack of smoothness of the generalization error as a function of the learned classifier. This results in a non-normal distribution of the estimated generalization error. We construct a confidence set for the generalization error by use of a smooth upper bound on the deviation between the resampled estimate and generalization error. The confidence set is formed by bootstrapping this upper bound. In cases in which the approximation class for the classifier can be represented as a parametric additive model, we provide a computationally efficient algorithm. This method exhibits superior performance across a series of test and simulated data sets.


Multi-View Learning in the Presence of View Disagreement

arXiv.org Machine Learning

Traditional multi-view learning approaches suffer in the presence of view disagreement,i.e., when samples in each view do not belong to the same class due to view corruption, occlusion or other noise processes. In this paper we present a multi-view learning approach that uses a conditional entropy criterion to detect view disagreement. Once detected, samples with view disagreement are filtered and standard multi-view learning methods can be successfully applied to the remaining samples. Experimental evaluation on synthetic and audio-visual databases demonstrates that the detection and filtering of view disagreement considerably increases the performance of traditional multi-view learning approaches.


Sparse Prediction with the $k$-Support Norm

arXiv.org Machine Learning

We derive a novel norm that corresponds to the tightest convex relaxation of sparsity combined with an $\ell_2$ penalty. We show that this new {\em $k$-support norm} provides a tighter relaxation than the elastic net and is thus a good replacement for the Lasso or the elastic net in sparse prediction problems. Through the study of the $k$-support norm, we also bound the looseness of the elastic net, thus shedding new light on it and providing justification for its use.


Soil Data Analysis Using Classification Techniques and Soil Attribute Prediction

arXiv.org Machine Learning

Agricultural research has been profited by technical advances such as automation, data mining. Today, data mining is used in a vast areas and many off-the-shelf data mining system products and domain specific data mining application soft wares are available, but data mining in agricultural soil datasets is a relatively a young research field. The large amounts of data that are nowadays virtually harvested along with the crops have to be analyzed and should be used to their full extent. This research aims at analysis of soil dataset using data mining techniques. It focuses on classification of soil using various algorithms available. Another important purpose is to predict untested attributes using regression technique, and implementation of automated soil sample classification.


Modeling Social Causality and Responsibility Judgment in Multi-Agent Interactions

Journal of Artificial Intelligence Research

Social causality is the inference an entity makes about the social behavior of other entities and self. Besides physical cause and effect, social causality involves reasoning about epistemic states of agents and coercive circumstances. Based on such inference, responsibility judgment is the process whereby one singles out individuals to assign responsibility, credit or blame for multi-agent activities. Social causality and responsibility judgment are a key aspect of social intelligence, and a model for them facilitates the design and development of a variety of multi-agent interactive systems. Based on psychological attribution theory, this paper presents a domain-independent computational model to automate social inference and judgment process according to an agents causal knowledge and observations of interaction. We conduct experimental studies to empirically validate the computational model. The experimental results show that our model predicts human judgments of social attributions and makes inferences consistent with what most people do in their judgments. Therefore, the proposed model can be generically incorporated into an intelligent system to augment its social and cognitive functionality.


Finding Important Genes from High-Dimensional Data: An Appraisal of Statistical Tests and Machine-Learning Approaches

arXiv.org Machine Learning

Over the past decades, statisticians and machine-learning researchers have developed literally thousands of new tools for the reduction of high-dimensional data in order to identify the variables most responsible for a particular trait. These tools have applications in a plethora of settings, including data analysis in the fields of business, education, forensics, and biology (such as microarray, proteomics, brain imaging), to name a few. In the present work, we focus our investigation on the limitations and potential misuses of certain tools in the analysis of the benchmark colon cancer data (2,000 variables; Alon et al., 1999) and the prostate cancer data (6,033 variables; Efron, 2010, 2008). Our analysis demonstrates that models that produce 100% accuracy measures often select different sets of genes and cannot stand the scrutiny of parameter estimates and model stability. Furthermore, we created a host of simulation datasets and "artificial diseases" to evaluate the reliability of commonly used statistical and data mining tools. We found that certain widely used models can classify the data with 100% accuracy without using any of the variables responsible for the disease. With moderate sample size and suitable pre-screening, stochastic gradient boosting will be shown to be a superior model for gene selection and variable screening from high-dimensional datasets.


Latent Multi-group Membership Graph Model

arXiv.org Machine Learning

We develop the Latent Multi-group Membership Graph (LMMG) model, a model of networks with rich node feature structure. In the LMMG model, each node belongs to multiple groups and each latent group models the occurrence of links as well as the node feature structure. The LMMG can be used to summarize the network structure, to predict links between the nodes, and to predict missing features of a node. We derive efficient inference and learning algorithms and evaluate the predictive performance of the LMMG on several social and document network datasets.


Graph-Based Anomaly Detection Applied to Homeland Security Cargo Screening

AAAI Conferences

Protecting our nation’s ports is a critical challenge for homeland security and requires the research, development and deployment of new technologies that will allow for the efficient securing of shipments entering this country. Most approaches look only at statistical irregularities in the attributes of the cargo, and not at the relationships of this cargo to others. However, anomalies detected in these relationships could add to the suspicion of the cargo, and therefore improve the accuracy with which we detect suspicious cargo. This paper proposes an improvement in our ability to detect suspicious cargo bound for the U.S. through a graph-based anomaly detection approach. Using anonymized data received from the Department of Homeland Security, we demonstrate the effectiveness of our approach and its usefulness to a homeland security analyst who is tasked with uncovering illegal and potentially dangerous cargo shipments.


Identifying Personality Types Using Document Classification Methods

AAAI Conferences

Are the words that people use indicative of their personality type preferences? In this paper, it is hypothesized that word-usage is not independent of personality type, as measured by the Myers-Briggs Type Indicator (MBTI) personality assessment tool. In-class writing samples were taken from 40 graduate students along with the MBTI. The experiment utilizes naïve Bayes classifiers and Support Vector Machines (SVMs) in an attempt to guess an individual’s personality type based on their word-choice. Classification is also attempted using emotional, social, cognitive, and psychological dimensions elicited by the analysis software, Linguistic Inquiry and Word Count (LIWC). The classifiers are evaluated with 40 distinct trials (leave-one-out cross validation), and parameters are chosen using leave-one-out cross validation of each trial’s training set. The experiment showed that the naïve Bayes classifiers (word-based and LIWC-based) outperformed the SVMs when guessing Sensing-Intuition (S-N) and Thinking-Feeling (T-F).