 Bayesian Learning


Characterizing predictable classes of processes

arXiv.org Artificial Intelligence

The problem is sequence prediction in the following setting. A sequence $x_1,\dots,x_n,\dots$ of discrete-valued observations is generated according to some unknown probabilistic law (measure) $\mu$. After observing each outcome, it is required to give the conditional probabilities of the next observation. The measure $\mu$ belongs to an arbitrary class $\C$ of stochastic processes. We are interested in predictors $\rho$ whose conditional probabilities converge to the "true" $\mu$-conditional probabilities if any $\mu\in\C$ is chosen to generate the data. We show that if such a predictor exists, then a predictor can also be obtained as a convex combination of countably many elements of $\C$. In other words, it can be obtained as a Bayesian predictor whose prior is concentrated on a countable set. This result is established for two very different measures of prediction performance: one very strong, namely total variation, and one very weak, namely prediction in expected average Kullback-Leibler divergence.
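A minimal sketch of a Bayesian predictor whose prior is concentrated on a countable set, in the sense of the abstract above. Everything specific here is an assumption made for illustration and does not come from the paper: the class is truncated to a finite list of i.i.d. Bernoulli measures, the observations are binary, and the names `mixture_predictor`, `thetas`, and `prior` are hypothetical.

```python
def mixture_predictor(thetas, prior):
    """Build a Bayesian mixture predictor over Bernoulli(theta) measures.

    thetas: success probabilities in (0, 1), one per measure (assumed class).
    prior:  positive prior weights summing to 1 (the convex combination).
    Returns a function mapping a binary history to P(next observation = 1).
    """
    def predict(history):
        # Posterior weight of each measure is prior * likelihood of the history.
        weights = []
        for theta, w in zip(thetas, prior):
            likelihood = 1.0
            for x in history:
                likelihood *= theta if x == 1 else 1.0 - theta
            weights.append(w * likelihood)
        total = sum(weights)
        # Conditional probability of the mixture: posterior-weighted average.
        return sum(w * theta for w, theta in zip(weights, thetas)) / total
    return predict


predict = mixture_predictor([0.2, 0.5, 0.8], [1 / 3, 1 / 3, 1 / 3])
p_empty = predict([])            # prior-weighted average before any data
p_after_ones = predict([1] * 50) # posterior mass moves to theta = 0.8
```

The plain product of likelihoods underflows for very long histories; a log-space version would be needed there. The sketch only shows how the mixture's conditional probabilities follow from the prior weights.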



Multiagent Bayesian Forecasting of Time Series with Graphical Models

AAAI Conferences

Time series are found widely in engineering and science. We study multiagent forecasting in time series, drawing from literature on time series, graphical models, and multiagent systems. Knowledge representation of our agents is based on dynamic multiply sectioned Bayesian networks (DMSBNs), a class of cooperative multiagent graphical models. We propose a method through which agents can perform one-step forecasts with exact probabilistic inference. Superior performance of our agents over agents based on dynamic Bayesian networks (DBNs) is demonstrated through experiment.


Constraint-based Approach to Discovery of Inter Module Dependencies in Modular Bayesian Networks

AAAI Conferences

This paper introduces an information-theoretic approach to verification of modular causal probabilistic models. We assume systems which are gradually extended by adding new functional modules, each having limited domain knowledge captured by a local Bayesian network. Different modules originate from independent design processes. We assume that the local models are correct, which, however, does not guarantee globally coherent inference in composed systems. The introduced method supports discovery of significant inter-module dependencies which are ignored in the assembled Bayesian network.


Join Tree Propagation Utilizing Both Arc Reversal and Variable Elimination

AAAI Conferences

In this paper, we put forth the first join tree propagation algorithm that selectively applies either arc reversal (AR) or variable elimination (VE) to build the propagated messages. Our approach utilizes a recent method for identifying the propagated join tree messages a priori. When it is determined that precisely one message is to be constructed at a join tree node, VE is utilized to build this distribution; otherwise, AR is applied as it is better suited to construct multiple distributions passed between neighboring join tree nodes. Experimental results, involving evidence processing in seven real-world and one benchmark Bayesian network, empirically demonstrate that selectively applying VE and AR is faster than applying one of these methods exclusively on the entire network.


Identifying User Destinations in Virtual Worlds

AAAI Conferences

This paper focuses on the identification of human activity patterns in SecondLife (SL), a user-constructed virtual environment. SecondLife allows the users to create a virtual avatar, explore areas constructed by other users, socialize, and conduct financial transactions just as one would in the real world. However, unlike the real world, new attractions can be constructed within hours and previous ones often fall into disuse rapidly. Without current information about the state of regions in the virtual world, it is difficult to infer the purpose of the user’s actions from location information. In this paper, we present an approach for gathering data on users’ activities and building a map of SecondLife annotated with information about activities that the users were able to perform in each region. Using this map, a recommender agent built into the user’s heads-up display can present suggestions of other areas to visit based on data collected from previous users. We discuss the use of five supervised classifiers and report classification results for the map construction portion of the agent.


Confidence-based Tuning of Nomogram Predictions

AAAI Conferences

Instance classification using machine learning techniques has numerous applications, from automation to medical diagnosis. In many problem domains, such as spam filtering, classification must be performed quickly across large datasets. In this paper we begin with machine learning techniques based on naive Bayes classification and attempt to improve classification performance by taking into account attribute confidence intervals. Our prediction functions operate over nominal datasets and retain the asymptotic complexity of one-pass learning and prediction functions. We present preliminary results indicating a modest, albeit inconsistent, improvement over the naive Bayes classifier alone.


VipBoost: A More Accurate Boosting Algorithm

AAAI Conferences

Boosting is a well-known method for improving the accuracy of many learning algorithms. In this paper, we propose a novel boosting algorithm, VipBoost (voting on boosting classifications from imputed learning sets), which first generates multiple incomplete datasets from the original dataset by randomly removing a small percentage of observed attribute values, then uses an imputer to fill in the missing values. It then applies AdaBoost (using some base learner) to each of the imputed learning sets, producing multiple classifiers. The subsequent prediction on a new test case is the most frequent classification from these classifiers. Our empirical results show that VipBoost produces very effective classifiers that significantly improve accuracy for unstable base learners and some stable learners, especially when the initial dataset is incomplete.
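The pipeline described in this abstract (remove values, impute, boost, vote) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract names neither the imputer nor the base learner, so column-mean imputation and decision stumps are stand-ins, labels are assumed to be -1/+1, and all function names and default parameters are hypothetical.

```python
import math
import random
from collections import Counter

def knock_out(X, rate, rng):
    """Copy X, replacing each observed value with None with probability `rate`."""
    return [[None if (v is not None and rng.random() < rate) else v for v in row]
            for row in X]

def impute_mean(X):
    """Fill None entries with the column mean (assumed imputer, not from the paper)."""
    means = []
    for col in zip(*X):
        seen = [v for v in col if v is not None]
        means.append(sum(seen) / len(seen) if seen else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in X]

def stump_predict(stump, row):
    feature, threshold, sign = stump
    return sign if row[feature] > threshold else -sign

def best_stump(X, y, w):
    """Exhaustively pick the decision stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                stump = (j, t, sign)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(stump, row) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def adaboost(X, y, rounds):
    """Discrete AdaBoost over decision stumps (assumed base learner)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified rows, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def ensemble_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

def vipboost_fit(X, y, n_sets=5, rate=0.1, rounds=5, seed=0):
    """Train one boosted ensemble per knocked-out-then-imputed copy of X."""
    rng = random.Random(seed)
    return [adaboost(impute_mean(knock_out(X, rate, rng)), y, rounds)
            for _ in range(n_sets)]

def vipboost_predict(models, row):
    """Return the most frequent classification among the boosted ensembles."""
    votes = Counter(ensemble_predict(m, row) for m in models)
    return votes.most_common(1)[0][0]
```

A call such as `vipboost_predict(vipboost_fit(X, y), row)` expects `row` to be fully observed; handling missing values at test time is left out of this sketch.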


Multivariate Time Series Classification with Temporal Abstractions

AAAI Conferences

The increase in the number of complex temporal datasets collected today has prompted the development of methods that extend classical machine learning and data mining methods to time-series data. This work focuses on methods for multivariate time-series classification. Time-series classification is a challenging problem, mostly because the number of temporal features that describe the data and are potentially useful for classification is enormous. We study and develop a temporal abstraction framework for generating multivariate time-series features suitable for classification tasks. We propose the STF-Mine algorithm, which automatically mines discriminative temporal abstraction patterns from the time-series data and uses them to learn a classification model. Our experimental evaluations, carried out on both synthetic and real-world medical data, demonstrate the benefit of our approach in learning accurate classifiers for time-series datasets.


A Large Margin Approach to Anaphora Resolution for Neuroscience Knowledge Discovery

AAAI Conferences

A discriminative large margin classifier based approach to anaphora resolution for neuroscience abstracts is presented. The system employs both syntactic and semantic features. A support vector machine based word sense disambiguation method, which combines evidence from three methods that use WordNet and Wikipedia, is also introduced and used for semantic features. The support vector machine anaphora resolution classifier with probabilistic outputs achieved an almost four-fold improvement in accuracy over the baseline method.