 Bayesian Learning


Biclustering-Driven Ensemble of Bayesian Belief Network Classifiers for Underdetermined Problems

AAAI Conferences

In this paper, we present BENCH (Biclustering-driven ENsemble of Classifiers), an algorithm to construct an ensemble of classifiers through concurrent feature and data point selection guided by unsupervised knowledge obtained from biclustering. BENCH is designed for underdetermined problems. In our experiments, we use Bayesian Belief Network (BBN) classifiers as base classifiers in the ensemble; however, BENCH can be applied to other classification models as well. We show that BENCH is able to increase prediction accuracy of a single classifier and traditional ensemble of classifiers by up to 15% on three microarray datasets using various weighting schemes for combining individual predictions in the ensemble.


Agent-Oriented Incremental Team and Activity Recognition

AAAI Conferences

Monitoring team activity is beneficial when human teams cooperate in the enactment of a joint plan. Monitoring allows teams to maintain awareness of each other's progress within the plan and it enables anticipation of information needs. Humans find this difficult, particularly in time-stressed and uncertain environments. In this paper we introduce a probabilistic model, based on Conditional Random Fields, to automatically recognise the composition of teams and the team activities in relation to a plan. The team composition and activities are recognised incrementally by interpreting a stream of spatio-temporal observations.


Multi-Label Classification Using Conditional Dependency Networks

AAAI Conferences

In this paper, we tackle the challenges of multi-label classification by developing a general conditional dependency network model. The proposed model is a cyclic directed graphical model, which provides an intuitive representation for the dependencies among multiple label variables, and a well integrated framework for efficient model training using binary classifiers and label predictions using Gibbs sampling inference. Our experiments show the proposed conditional model can effectively exploit the label dependency to improve multi-label classification performance.
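
The Gibbs sampling inference step described above can be sketched as follows. This is only an illustration under stated assumptions: in the described model each conditional would be a trained binary classifier taking the input features and the other labels, whereas here the conditionals are hypothetical hand-written functions, and the names `gibbs_predict`, `n_sweeps`, and `burn_in` are invented for the example.

```python
import random

def gibbs_predict(conditionals, n_labels, n_sweeps=500, burn_in=100, seed=0):
    """Estimate the marginal P(y_i = 1) for each label by Gibbs sampling.

    conditionals[i](labels) must return P(y_i = 1 | input, other labels).
    The input features are assumed to be baked into each conditional.
    """
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_labels)]
    counts = [0] * n_labels
    for sweep in range(n_sweeps):
        # Resample each label in turn, conditioned on the current others.
        for i in range(n_labels):
            p = conditionals[i](labels)
            labels[i] = 1 if rng.random() < p else 0
        # Only accumulate samples after the burn-in sweeps.
        if sweep >= burn_in:
            for i in range(n_labels):
                counts[i] += labels[i]
    kept = n_sweeps - burn_in
    return [c / kept for c in counts]

# Hypothetical conditionals: label 1 is much more likely when label 0 is on.
conditionals = [
    lambda labels: 0.9,
    lambda labels: 0.8 if labels[0] == 1 else 0.1,
]
marginals = gibbs_predict(conditionals, n_labels=2)
print(marginals)
```

Thresholding the estimated marginals (for instance at 0.5) would give one possible label prediction; that decision rule is likewise an assumption of this sketch.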


Continuous Correlated Beta Processes

AAAI Conferences

In this paper we consider a (possibly continuous) space of Bernoulli experiments. We assume that the Bernoulli distributions of the points are correlated. All evidence data comes in the form of successful or failed experiments at different points. Current state-of-the-art methods for expressing a distribution over a continuum of Bernoulli distributions use logistic Gaussian processes or Gaussian copula processes. However, both of these require computationally expensive matrix operations (cubic in the general case). We introduce a more intuitive approach, directly correlating beta distributions by sharing evidence between them according to a kernel function, an approach which has linear time complexity. The approach can easily be extended to multiple outcomes, giving a continuous correlated Dirichlet process. This approach can be used for classification (both binary and multi-class) and learning the actual probabilities of the Bernoulli distributions. We show results for a number of data sets, as well as a case-study where a mixture of continuous beta processes is used as part of an automated stroke rehabilitation system.
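
The idea of sharing evidence between beta distributions according to a kernel can be illustrated with a minimal sketch. The squared-exponential kernel, the uniform Beta(1, 1) prior, and all function names below are assumptions made for the example; the abstract does not specify them. Each query touches every observation once, so the cost is linear in the number of observations.

```python
import math

def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential kernel with values in (0, 1]; an assumed choice."""
    return math.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))

def correlated_beta_posterior(x, observations, length_scale=1.0,
                              prior_alpha=1.0, prior_beta=1.0):
    """Return the (alpha, beta) parameters of the beta distribution at x.

    observations: iterable of (point, success) pairs, success being a bool.
    Each observation adds a fractional pseudo-count weighted by the kernel.
    """
    alpha, beta = prior_alpha, prior_beta
    for point, success in observations:
        weight = rbf_kernel(x, point, length_scale)
        if success:
            alpha += weight
        else:
            beta += weight
    return alpha, beta

def success_probability(x, observations, **kwargs):
    """Posterior mean of the Bernoulli parameter at x."""
    alpha, beta = correlated_beta_posterior(x, observations, **kwargs)
    return alpha / (alpha + beta)

# Two successes near 0 and one failure near 2.
obs = [(0.0, True), (0.1, True), (2.0, False)]
print(success_probability(0.05, obs))  # above 0.5: close to the successes
print(success_probability(2.0, obs))   # below 0.5: close to the failure
```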


Learning Decision Rules from Data Streams

AAAI Conferences

Decision rules, which can provide good interpretability and flexibility for data mining tasks, have received very little attention in the stream mining community so far. In this work we introduce a new algorithm to learn rule sets, designed for open-ended data streams.

However, it has been shown that the antecedents of individual rules may contain irrelevant conditions. C4.5rules (Quinlan, 1993) uses an optimization procedure to simplify conditions. The optimization is done in two phases. First, each rule is generalized by deleting conditions that do not seem to be helpful in discriminating the classes. A greedy search method is ...
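
The generalization step mentioned here, deleting conditions that do not help discriminate the classes, might look like the sketch below. It is a simplified illustration, not the C4.5rules procedure: the deletion criterion (plain precision on the covered examples), the equality-only conditions, and the names `covers`, `precision`, and `generalize_rule` are all assumptions.

```python
def covers(conditions, example):
    """True if the example satisfies every (feature, value) condition."""
    return all(example[feature] == value for feature, value in conditions)

def precision(conditions, label, data):
    """Fraction of covered examples that carry the rule's class label."""
    covered = [y for x, y in data if covers(conditions, x)]
    if not covered:
        return 0.0
    return sum(1 for y in covered if y == label) / len(covered)

def generalize_rule(conditions, label, data):
    """Greedily delete conditions while precision does not get worse."""
    conditions = list(conditions)
    improved = True
    while improved and conditions:
        improved = False
        base = precision(conditions, label, data)
        # Find the single deletion that leaves the highest precision.
        best_idx, best_prec = None, -1.0
        for i in range(len(conditions)):
            candidate = conditions[:i] + conditions[i + 1:]
            p = precision(candidate, label, data)
            if p > best_prec:
                best_idx, best_prec = i, p
        # Assumed criterion: accept the deletion if precision is not reduced.
        if best_prec >= base:
            del conditions[best_idx]
            improved = True
    return conditions

data = [
    ({"outlook": "sunny", "windy": False}, "yes"),
    ({"outlook": "sunny", "windy": True}, "yes"),
    ({"outlook": "rainy", "windy": False}, "no"),
    ({"outlook": "rainy", "windy": True}, "no"),
]
rule = [("outlook", "sunny"), ("windy", False)]
print(generalize_rule(rule, "yes", data))  # [('outlook', 'sunny')]
```

In this toy data the `windy` condition is irrelevant to the class, so it is dropped, while deleting `outlook` would halve the precision and is rejected.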


Improving Performance of Topic Models by Variable Grouping

AAAI Conferences

Topic models have a wide range of applications, including modeling of text documents, images, user preferences, product rankings, and many others. However, learning optimal models may be difficult, especially for large problems. The reason is that inference techniques such as Gibbs sampling often converge to suboptimal models due to the abundance of local minima in large datasets. In this paper, we propose a general method of improving the performance of topic models. The method, called 'grouping transform', works by introducing auxiliary variables which represent assignments of the original model tokens to groups. Using these auxiliary variables, it becomes possible to resample an entire group of tokens at a time. This allows the sampler to make larger state space moves. As a result, better models are learned and performance is improved. The proposed ideas are illustrated on several topic models and several text and image datasets. We show that the grouping transform significantly improves performance over standard models.


A Logic for Causal Inference in Time Series with Discrete and Continuous Variables

AAAI Conferences

Many applications of causal inference, such as finding the relationship between stock prices and news reports, involve both discrete and continuous variables observed over time. Inference with these complex sets of temporal data, though, has remained difficult and required a number of simplifications. We show that recent approaches for inferring temporal relationships (represented as logical formulas) can be adapted for inference with continuous valued effects. Building on advances in logic, PCTLc (an extension of PCTL with numerical constraints) is introduced here to allow representation and inference of relationships with a mixture of discrete and continuous components. Then, finding significant relationships in the continuous case can be done using the conditional expectation of an effect, rather than its conditional probability. We evaluate this approach on both synthetically generated and actual financial market data, demonstrating that it can allow us to answer different questions than the discrete approach can.


A Maximum Likelihood Approach Towards Aggregating Partial Orders

AAAI Conferences

In many of the possible applications as well as the theoretical models of computational social choice, the agents' preferences are represented as partial orders. In this paper, we extend the maximum likelihood approach for defining "optimal" voting rules to this setting. We consider distributions in which the pairwise comparisons / incomparabilities between alternatives are drawn i.i.d. We call such models pairwise-independent models and show that they correspond to a class of voting rules that we call pairwise scoring rules. This generalizes rules such as Kemeny and Borda. Moreover, we show that Borda is the only pairwise scoring rule that satisfies neutrality, when the outcome space is the set of all alternatives. We then study which voting rules defined for linear orders can be extended to partial orders via our MLE model. We show that any weakly neutral outcome scoring rule (including any ranking/candidate scoring rule) based on the weighted majority graph can be represented as the MLE of a weakly neutral pairwise-independent model. Therefore, all such rules admit natural extensions to profiles of partial orders. Finally, we propose a specific MLE model π_k for generating a set of k winning alternatives, and study the computational complexity of winner determination for the MLE of π_k.
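
To make the idea of scoring alternatives from pairwise comparisons concrete, here is a minimal sketch of a Borda-style count that also accepts partial orders. It is one plausible reading, not the paper's definition: each vote is assumed to be a transitively closed set of (winner, loser) pairs, incomparable pairs are assumed to contribute nothing, and the function names are invented for the example. On a linear order the count reduces to the usual Borda score (the number of alternatives ranked below).

```python
from collections import Counter
from itertools import combinations

def linear_order_to_pairs(ranking):
    """Expand a ranking (best first) into its set of (winner, loser) pairs."""
    return {(a, b) for a, b in combinations(ranking, 2)}

def pairwise_borda(profile, alternatives):
    """Score each alternative by the number of pairwise comparisons it wins.

    profile: list of votes, each a set of (winner, loser) pairs.
    Assumption: pairs missing from a vote (incomparabilities) score zero.
    """
    scores = Counter({a: 0 for a in alternatives})
    for vote in profile:
        for winner, _loser in vote:
            scores[winner] += 1
    return dict(scores)

profile = [
    linear_order_to_pairs(["a", "b", "c"]),  # full ranking a > b > c
    {("b", "c")},                            # partial order: only b > c
    {("a", "c")},                            # partial order: only a > c
]
scores = pairwise_borda(profile, ["a", "b", "c"])
print(scores)  # {'a': 3, 'b': 2, 'c': 0}
```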


On Learning Discrete Graphical Models Using Greedy Methods

arXiv.org Machine Learning

In this paper, we address the problem of learning the structure of a pairwise graphical model from samples in a high-dimensional setting. Our first main result studies the sparsistency, or consistency in sparsity pattern recovery, properties of a forward-backward greedy algorithm as applied to general statistical models. As a special case, we then apply this algorithm to learn the structure of a discrete graphical model via neighborhood estimation. As a corollary of our general result, we derive sufficient conditions on the number of samples n, the maximum node-degree d and the problem size p, as well as other conditions on the model parameters, so that the algorithm recovers all the edges with high probability. Our result guarantees graph selection for samples scaling as n = Ω(d^2 log(p)), in contrast to existing convex-optimization based algorithms that require a sample complexity of Ω(d^3 log(p)). Further, the greedy algorithm only requires a restricted strong convexity condition which is typically milder than irrepresentability assumptions. We corroborate these results using numerical simulations at the end.
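
A forward-backward greedy procedure of the kind discussed here can be sketched generically over any loss defined on feature subsets. The stopping threshold `eps`, the backward factor `nu`, and the toy loss are assumptions for illustration; the abstract does not give the exact thresholds, and for neighborhood estimation the loss would instead be a node-conditional likelihood fitted on data.

```python
def forward_backward_greedy(loss, n_features, eps=1e-3, nu=0.5, max_steps=100):
    """Select a feature subset by alternating forward and backward steps.

    loss(support) returns the loss of the best model restricted to `support`.
    Forward: add the feature with the largest loss reduction, stopping once
    the reduction falls below eps. Backward: remove features whose removal
    raises the loss by at most nu times the last forward gain.
    """
    support = frozenset()
    current = loss(support)
    for _ in range(max_steps):
        candidates = [j for j in range(n_features) if j not in support]
        if not candidates:
            break
        best = min(candidates, key=lambda j: loss(support | {j}))
        new_loss = loss(support | {best})
        gain = current - new_loss
        if gain < eps:
            break
        support, current = support | {best}, new_loss
        while len(support) > 1:
            worst = min(support, key=lambda j: loss(support - {j}))
            removal_loss = loss(support - {worst})
            if removal_loss - current > nu * gain:
                break
            support, current = support - {worst}, removal_loss
    return sorted(support)

# Toy loss: with orthonormal features, the least-squares loss of a support
# is the squared magnitude of the true coefficients left outside it.
true_coefficients = [0.0, 2.0, 0.0, 0.0, -3.0, 0.01]

def toy_loss(support):
    return sum(c * c for j, c in enumerate(true_coefficients) if j not in support)

selected = forward_backward_greedy(toy_loss, n_features=len(true_coefficients))
print(selected)  # [1, 4]
```

The tiny coefficient at index 5 is left out because its gain (1e-4) falls below `eps`, which shows how the threshold controls the recovered sparsity pattern.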


Viral Actions: Predicting Video View Counts Using Synchronous Sharing Behaviors

AAAI Conferences

In this article, we present a method for predicting the view count of a YouTube video using a small feature set collected from a synchronous sharing tool. We hypothesize that videos which have a high YouTube view count will exhibit a unique sharing pattern when shared in synchronous environments. Using a one-day sample of 2,188 dyadic sessions from the Yahoo! Zync synchronous sharing tool, we demonstrate how to predict the video's view count on YouTube, specifically if a video has over 10 million views. The prediction model is 95.8% accurate and done with a relatively small training set; only 15% of the videos had more than one session viewing; in effect, the classifier had a precision of 76.4% and a recall of 81%. We describe a prediction model that relies on using implicit social shared viewing behavior such as how many times a video was paused, rewound, or fast-forwarded as well as the duration of the session. Finally, we present some new directions for future virality research and for the design of future social media tools.