 Directed Networks


Learning Structured Outputs from Partial Labels using Forest Ensemble

arXiv.org Machine Learning

Truyen Tran, Dinh Phung, Svetha Venkatesh. Centre for Pattern Recognition and Data Analytics, Deakin University, Australia.

Abstract. Learning structured outputs with general structures is computationally challenging, except for tree-structured models. Thus we propose an efficient boosting-based algorithm AdaBoost.MRF for this task. The idea is based on the realization that a graph is a superimposition of trees. Different from most existing work, our algorithm can handle partial labelling, and thus is particularly attractive in practice where reliable labels are often sparsely observed. In addition, our method works exclusively on trees and thus is guaranteed to converge. We apply the AdaBoost.MRF algorithm to an indoor video surveillance scenario, where activities are modelled at multiple levels.

1 Introduction. There has been a growing research interest in developing probabilistic temporal graphical models for recognising human activities from sensory data. In this paper we address an important aspect of the problem in that there are multiple levels of abstraction, that is, an activity is often composed of several sub-activities. A popular approach to deal with such a hierarchical nature is to build a cascaded model: each level is modelled separately, and the output of the lower levels is subsequently used as the input for the upper levels [20]. This approach is sub-optimal because the information at the higher level is often very discriminative to infer about the lower levels, but it is not modelled. Moreover, the layered approach often suffers from the so-called cascading error problem, as the error introduced from the lower level will propagate to higher tasks. A better and more holistic approach is to build a joint representation at all layers. Emerging methods include generative/directed models such as abstract hidden Markov models (AHMMs) [4], hierarchical HMMs [19], dynamic Bayesian networks [10], and their discriminative/undirected counterparts such as hierarchical conditional random field (HCRF) [17], and dynamic CRF (DCRF) [28].
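
The abstract's remark that a graph is a superimposition of trees can be illustrated with a small sketch. This is not the AdaBoost.MRF procedure itself, which the excerpt does not specify; it only shows one simple way (a greedy, Kruskal-style pass that prefers edges not yet covered) to write a graph's edge set as a union of spanning trees. The function names `spanning_tree` and `cover_with_trees` are hypothetical.

```python
def spanning_tree(n, edges, prefer):
    """Build a spanning tree (or forest) over nodes 0..n-1 with union-find.

    Edges in `prefer` are tried first, so uncovered edges get picked early.
    """
    parent = list(range(n))

    def find(x):
        # Find the component root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # False sorts before True, so preferred (uncovered) edges come first.
    for u, v in sorted(edges, key=lambda e: e not in prefer):
        ru, rv = find(u), find(v)
        if ru != rv:  # adding this edge creates no cycle
            parent[ru] = rv
            tree.append((u, v))
    return tree


def cover_with_trees(n, edges):
    """Return a list of trees whose edge sets together cover `edges`.

    Assumes no self-loops, so every pass covers at least one new edge.
    """
    uncovered = set(edges)
    trees = []
    while uncovered:
        tree = spanning_tree(n, edges, uncovered)
        trees.append(tree)
        uncovered -= set(tree)
    return trees


# A 4-cycle with one diagonal: not a tree, but a union of trees.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
trees = cover_with_trees(4, edges)
```

Each returned tree is acyclic by construction, which is the property that makes exact inference tractable on every member of the ensemble.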


Exact fit of simple finite mixture models

arXiv.org Machine Learning

How to forecast next year's portfolio-wide credit default rate based on last year's default observations and the current score distribution? A classical approach to this problem consists of fitting a mixture of the conditional score distributions observed last year to the current score distribution. This is a special (simple) case of a finite mixture model where the mixture components are fixed and only the weights of the components are estimated. The optimum weights provide a forecast of next year's portfolio-wide default rate. We point out that the maximum-likelihood (ML) approach to fitting the mixture distribution not only gives an optimum but even an exact fit if we allow the mixture components to vary but keep their density ratio fixed. From this observation we can conclude that the standard default rate forecast based on last year's conditional default rates will always be located between last year's portfolio-wide default rate and the ML forecast for next year. As an application example, cost quantification is then discussed. We also discuss how the mixture model based estimation methods can be used to forecast total loss. This involves the reinterpretation of an individual classification problem as a collective quantification problem.
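
The fixed-component mixture fit described above can be sketched in a few lines. The sketch assumes scores are grouped into discrete buckets and uses an EM fixed-point iteration to find the ML weight; the abstract does not say which optimizer the authors use, so EM is a swapped-in standard choice. The function name `fit_default_rate` and the example numbers are hypothetical.

```python
def fit_default_rate(f_default, f_survive, current_counts, iters=500, tol=1e-12):
    """ML weight p of the mixture p * f_default + (1 - p) * f_survive.

    f_default, f_survive: last year's conditional score distributions per
        bucket (kept fixed; assumed strictly positive in every bucket).
    current_counts: this year's number of obligors per score bucket.
    """
    n = sum(current_counts)
    p = 0.5  # arbitrary starting weight
    for _ in range(iters):
        # E-step: posterior probability of default within each bucket.
        post = [p * d / (p * d + (1 - p) * s)
                for d, s in zip(f_default, f_survive)]
        # M-step: the new weight is the count-weighted average posterior.
        p_new = sum(c * q for c, q in zip(current_counts, post)) / n
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p


# Illustrative data: this year's counts follow a 20% / 80% mixture exactly.
forecast = fit_default_rate([0.6, 0.3, 0.1], [0.1, 0.3, 0.6], [200, 300, 500])
```

Because the example counts are an exact mixture of the two fixed components, the fitted weight recovers the mixing proportion used to build them (0.2, up to numerical tolerance).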


A Bayesian Approach to Determine Focus of Attention in Spatial and Time-Sensitive Decision Making Scenarios

AAAI Conferences

Complex decision making scenarios require maintaining a high level of concentration and acquiring knowledge about the context of the task at hand. Focus of attention is not only affected by contextual factors but also by the way operators interact with the information. Conversely, determining optimal ways to interact with this information can augment operators' cognition. However, challenges exist for determining efficient mathematical frameworks and sound metrics to infer, reason and assess the level of attention during spatio-temporal complex problem solving in hybrid human-machine systems. This paper proposes a computational framework based on a Bayesian approach (BAN) to infer users' focus of attention based on physical expression generated from embodied interaction and further support decision-making in an unobtrusive manner. Experiments involving five interaction modalities (vision-based gesture interaction, glove-based gesture interaction, speech, feet, and body balance) were conducted to assess the proposed framework's feasibility, including the likelihood of assessed attention from enhanced BAN and task performance. Results confirm that physical expressions have a determining effect on the quality of the solutions in spatio-navigational problems.


Using Dynamic Bayesian Networks for Incorporating Non-Traditional Data Sources in Public Health Surveillance

AAAI Conferences

It is generally challenging to obtain the exact disease prevalence, as the true cases of a disease at the population level are not easy to identify. Available and relevant data sources such as administrative or clinical health data are used in public health surveillance as a proxy to estimate the disease prevalence. Traditionally, these data sources span healthcare utilization information such as emergency department visits, pharmacy drug sales, or laboratory test orders. In addition to being incomplete, these data sources are not usually available in a timely manner. Timeliness is an important factor for prevalence estimation for some conditions such as infectious diseases, especially at the time of an epidemic. For instance, in an influenza pandemic such estimates must be obtained within a day or two. In recent years several non-clinical and non-traditional data sources have been introduced to public health with the potential to provide signals on a disease rate or to provide feedback on the trends of a disease. Ideally, combining these new sources with the ones routinely used should help to identify disease cases more efficiently. However, building a construct capable of incorporating data from these various sources in a coherent manner is not trivial. In this research, we consider the H1N1 pandemic as the infectious disease of interest and we use media reports of deaths from H1N1 on the web as a non-traditional data source. We propose to use dynamic Bayesian networks from the class of probabilistic graphical models in order to combine this new data source with traditional ones through exploration of the possible probabilistic relationships between these data streams. This is an initial step towards building a framework that can potentially support aggregation of heterogeneous data for real-time estimation of disease prevalence. Our preliminary results show that the proposed model generalizes well.


Representation, Reasoning, and Learning for a Relational Influence Diagram Applied to a Real-Time Geological Domain

AAAI Conferences

Mining companies typically process all the material extracted from a mine site using processes which are extremely consumptive of energy and chemicals. Sorting the good material from the bad would effectively reduce required resources by leaving behind the bad material and only transporting and processing the good material. We use a relational influence diagram with an explicit utility model applied to the scenario in which an unknown number of rocks in unknown positions with unknown mineral compositions pass over 7 sensors toward 7 diverters on a high-throughput rock-sorting machine developed by MineSense Technologies Ltd. After receiving noisy sensor data, the system has 400 ms to decide whether to activate diverters which will divert the rocks into either a keep or discard bin. We learn the model offline and do online inference. Our result improves over the current state-of-the-art.


Reasoning in the Description Logic BEL Using Bayesian Networks

AAAI Conferences

We study the problem of reasoning in the probabilistic Description Logic BEL. Using a novel structure, we show that probabilistic reasoning in this logic can be reduced in polynomial time to standard inferences over a Bayesian network. This reduction provides tight complexity bounds for probabilistic reasoning in BEL.


Resolution-limit-free and local Non-negative Matrix Factorization quality functions for graph clustering

arXiv.org Machine Learning

Many graph clustering quality functions suffer from a resolution limit, the inability to find small clusters in large graphs. So-called resolution-limit-free quality functions do not have this limit. This property was previously introduced for hard clustering, that is, graph partitioning. We investigate the resolution-limit-free property in the context of Non-negative Matrix Factorization (NMF) for hard and soft graph clustering. To use NMF in the hard clustering setting, a common approach is to assign each node to its highest membership cluster. We show that in this case symmetric NMF is not resolution-limit-free, but that it becomes so when hardness constraints are used as part of the optimization. The resulting function is strongly linked to the Constant Potts Model. In soft clustering, nodes can belong to more than one cluster, with varying degrees of membership. In this setting resolution-limit-free turns out to be too strong a property. Therefore we introduce locality, which roughly states that changing one part of the graph does not affect the clustering of other parts of the graph. We argue that this is a desirable property, provide conditions under which NMF quality functions are local, and propose a novel class of local probabilistic NMF quality functions for soft graph clustering.
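
The "assign each node to its highest membership cluster" step mentioned above is a row-wise argmax over the membership matrix produced by NMF. A minimal sketch, assuming the memberships are given as a list of rows (one per node) and that ties go to the lowest cluster index; the name `hard_assignment` is hypothetical, and the factorization itself is not shown.

```python
def hard_assignment(memberships):
    """Map each node to the index of its largest membership value.

    memberships: list of rows, one per node, each a list of non-negative
        membership degrees (one per cluster). Ties go to the lowest index.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


# Three nodes, two clusters; the last node is a tie.
labels = hard_assignment([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
```

This post-processing discards the soft degrees, which is the setting in which the abstract reports that symmetric NMF is not resolution-limit-free.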


Bayesian Nonparametric Crowdsourcing

arXiv.org Machine Learning

Crowdsourcing has been proven to be an effective and efficient tool to annotate large datasets. User annotations are often noisy, so methods to combine the annotations to produce reliable estimates of the ground truth are necessary. We claim that considering the existence of clusters of users in this combination step can improve the performance. This is especially important in early stages of crowdsourcing implementations, where the number of annotations is low. At this stage there is not enough information to accurately estimate the bias introduced by each annotator separately, so we have to resort to models that consider the statistical links among them. In addition, finding these clusters is interesting in itself as knowing the behavior of the pool of annotators allows implementing efficient active learning strategies. Based on this, we propose in this paper two new fully unsupervised models based on a Chinese Restaurant Process (CRP) prior and a hierarchical structure that allows inferring these groups jointly with the ground truth and the properties of the users. Efficient inference algorithms based on Gibbs sampling with auxiliary variables are proposed. Finally, we perform experiments, both on synthetic and real databases, to show the advantages of our models over state-of-the-art algorithms.
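
For readers unfamiliar with the Chinese Restaurant Process prior mentioned above, the following sketch draws one clustering of annotators from a CRP. It shows only the prior, not the paper's hierarchical models or their Gibbs samplers; the function name `sample_crp` and the concentration parameter name `alpha` are assumptions made for illustration.

```python
import random


def sample_crp(n_users, alpha, seed=0):
    """Draw cluster assignments for n_users from a CRP with concentration alpha.

    User i joins existing cluster k with probability size_k / (i + alpha)
    and opens a new cluster with probability alpha / (i + alpha).
    """
    rng = random.Random(seed)
    assignments = []
    sizes = []
    for i in range(n_users):
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for k, size in enumerate(sizes):
            acc += size
            if r < acc:
                break  # joined existing cluster k
        else:
            # r fell in the final alpha-sized slice: open a new cluster.
            k = len(sizes)
            sizes.append(0)
        sizes[k] += 1
        assignments.append(k)
    return assignments, sizes


assignments, sizes = sample_crp(n_users=20, alpha=1.0)
```

Larger values of `alpha` tend to produce more clusters; the number of clusters is not fixed in advance, which is what makes the prior nonparametric.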


Church: a language for generative models

arXiv.org Artificial Intelligence

We introduce Church, a universal language for describing stochastic generative processes. Church is based on the Lisp model of lambda calculus, containing a pure Lisp as its deterministic subset. The semantics of Church is defined in terms of evaluation histories and conditional distributions on such histories. Church also includes a novel language construct, the stochastic memoizer, which enables simple description of many complex non-parametric models. We illustrate language features through several examples, including: a generalized Bayes net in which parameters cluster over trials, infinite PCFGs, planning by inference, and various non-parametric clustering models. Finally, we show how to implement query on any Church program, exactly and approximately, using Monte Carlo techniques.


Labeling Complicated Objects: Multi-View Multi-Instance Multi-Label Learning

AAAI Conferences

Multi-Instance Multi-Label (MIML) is a learning framework where an example is associated with multiple labels and represented by a set of feature vectors (multiple instances). In the formalization of MIML learning, instances come from a single source (single view). To leverage multiple information sources (multi-view), we develop a multi-view MIML framework based on a hierarchical Bayesian network, and derive an effective learning algorithm based on variational inference. The model can naturally deal with examples in which some views could be absent (partial examples). On multi-view datasets, it is shown that our method is better than other multi-view and single-view approaches, particularly in the presence of partial examples. On single-view benchmarks, extensive evaluation shows that our method is highly competitive with or better than other MIML approaches on labeling examples and instances. Moreover, our method can effectively handle datasets with a large number of labels.