Directed Networks
Estimating the Probability of Meeting a Deadline in Hierarchical Plans
Cohen, Liat (Ben Gurion University of the Negev) | Shimony, Solomon Eyal (Ben Gurion University of the Negev) | Weiss, Gera (Ben Gurion University of the Negev)
Given a hierarchical plan (or schedule) with uncertain task times, we may need to determine the probability that a given plan will satisfy a given deadline. This problem is shown to be NP-hard for series-parallel hierarchies. We provide a polynomial-time approximation algorithm for it. Computing the expected makespan of an hierarchical plan is also shown to be NP-hard. We examine the approximation bounds empirically and demonstrate where our scheme is superior to sampling and to exact computation.
On Conceptual Labeling of a Bag of Words
Sun, Xiangyan (Fudan University) | Xiao, Yanghua (Fudan University) | Wang, Haixun (Google Research) | Wang, Wei (Fudan University)
In natural language processing and information retrieval, the bag of words representation is used to implicitly represent the meaning of the text. Implicit semantics, however, are insufficient in supporting text or natural language based interfaces, which are adopted by an increasing number of applications. Indeed, in applications ranging from automatic ontology construction to question answering, explicit representation of semantics is starting to play a more prominent role. In this paper, we introduce the task of conceptual labeling (CL), which aims at generating a minimum set of conceptual labels that best summarize a bag of words. We draw the labels from a data driven semantic network that contains millions of highly connected concepts. The semantic network provides meaning to the concepts, and in turn, it provides meaning to the bag of words through the conceptual labels we generate. To achieve our goal, we use an information theoretic approach to trade-off the semantic coverage of a bag of words against the minimality of the output labels. Specifically, we use Minimum Description Length (MDL) as the criteria in selecting the best concepts. Our extensive experimental results demonstrate the effectiveness of our approach in representing the explicit semantics of a bag of words.
Revisiting Gaussian Process Dynamical Models
Zhao, Jing (East China Normal University) | Sun, Shiliang (East China Normal University)
The recently proposed Gaussian process dynamical models (GPDMs) have been successfully applied to time series modeling. There are four learning algorithms for GPDMs: maximizing a posterior (MAP), fixing the kernel hyperparameters α _ (Fix.α _ ), balanced GPDM (B-GPDM) and two-stage MAP (T.MAP), which are designed for model training with complete data. When data are incomplete, GPDMs reconstruct the missing data using a function of the latent variables before parameter updates, which, however, may cause cumulative errors. In this paper, we present four new algorithms (MAP+, Fix.α + , B-GPDM+ and T.MAP+) for learning GPDMs with incomplete training data and a new conditional model (CM+) for recovering incomplete test data. Our methods adopt the Bayesian framework and can fully and properly use the partially observed data. We conduct experiments on incomplete motion capture data (walk, run, swing and multiple-walker) and make comparisons with the existing four algorithms as well as k-NN, spline interpolation and VGPDS. Our methods perform much better on both training with incomplete data and recovering incomplete test data.
Greedy Structure Search for Sum-Product Networks
Dennis, Aaron (Brigham Young University) | Ventura, Dan (Brigham Young University)
Sum-product networks (SPNs) are rooted, directed acyclic graphs (DAGs) of sum and product nodes with well-defined probabilistic semantics. Moreover, exact inference in the distribution represented by an SPN is guaranteed to take linear time in the size of the DAG. In this paper we introduce an algorithm that learns the structure of an SPN using a greedy search approach. It incorporates methods used in a previous SPN structure-learning algorithm, but, unlike the previous algorithm, is not limited to learning tree-structured SPNs. Several proven ideas from circuit complexity theory along with our experimental results provide evidence for the advantages of SPNs with less-restrictive, non-tree structures.
Biclustering Gene Expressions Using Factor Graphs and the Max-Sum Algorithm
Denitto, Matteo (University of Verona) | Farinelli, Alessandro (University of Verona) | Bicego, Manuele (University of Verona)
Biclustering is an intrinsically challenging and highly complex problem, particularly studied in the biology field, where the goal is to simultaneously cluster genes and samples of an expression data matrix. In this paper we present a novel approach to gene expression biclustering by providing a binary Factor Graph formulation to such problem. In more detail, we reformulate biclustering as a sequential search for single biclusters and use an efficient optimization procedure based on the Max Sum algorithm. Such approach, drastically alleviates the scaling issues of previous approaches for biclustering based on Factor Graphs obtaining significantly more accurate results on synthetic datasets. A further analysis on two real-world datasets confirms the potentials of the proposed methodology when compared to alternative state of the art methods.
The Complexity of MAP Inference in Bayesian Networks Specified Through Logical Languages
Maua, Denis Deratani (Universidade de Sao Paulo) | Campos, Cassio Polpo de (Queen's University Belfast) | Cozman, Fabio Gagliardi (Universidade de Sao Paulo)
We study the computational complexity of finding maximum a posteriori configurations in Bayesian networks whose probabilities are specified by logical formulas. This approach leads to a fine grained study in which local information such as context-sensitive independence and determinism can be considered. It also allows us to characterize more precisely the jump from tractability to NP-hardness and beyond, and to consider the complexity introduced by evidence alone.
A Fast Goal Recognition Technique Based on Interaction Estimates
E-Martin, Yolanda (Universities Space Research Association) | R-Moreno, Maria D. (Universidad de Alcala) | Smith, David E. (NASA Ames Research Center)
Goal Recognition is the task of inferring an actor's goals given some or all of the actor's observed actions. There is considerable interest in Goal Recognition for use in intelligent personal assistants, smart environments, intelligent tutoring systems, and monitoring user's needs. In much of this work, the actor's observed actions are compared against a generated library of plans. Recent work by Ramirez and Geffner makes use of AI planning to determine how closely a sequence of observed actions matches plans for each possible goal. For each goal, this is done by comparing the cost of a plan for that goal with the cost of a plan for that goal that includes the observed actions. This approach yields useful rankings, but is impractical for real-time goal recognition in large domains because of the computational expense of constructing plans for each possible goal. In this paper, we introduce an approach that propagates cost and interaction information in a plan graph, and uses this information to estimate goal probabilities. We show that this approach is much faster, but still yields high quality results.
Bayesian Modelling of Community-Based Multidimensional Trust in Participatory Sensing under Data Sparsity
Venanzi, Matteo (University of Southampton) | Teacy, Luke (University of Southampton) | Rogers, Alex (University of Southampton) | Jennings, Nick (University of Southampton)
We propose a new Bayesian model for reliable aggregatio of crowdsourced estimates of real-valued quantities in participatory sensing applications. Existing approaches focus on probabilistic modelling of user’s reliability as the key to accurate aggregation. However, these are either limited to estimating discrete quantities, or require a significant number of reports from each user to accurately model their reliability. To mitigate these issues, we adopt a community-based approach, which reduces the data required to reliably aggregate real-valued estimates, by leveraging correlations between the reporting behaviour of users belonging to different communities. As a result, our method is up to 16.6% more accurate than existing state-of-the-art methods and is up to 49% more effective under data sparsity when used to estimate Wi-Fi hotspot locations in a real-world crowdsourcing application.
Differential Semantics of Intervention in Bayesian Networks
Qin, Biao (Renmin University of China)
Differentiation is an important inference method in Bayesian networks and intervention is a basic notion in causal Bayesian networks. In this paper, we reveal the connection between differentiation and intervention in Bayesian networks. We first encode an intervention as changing a conditional probabilistic table into a partial intervention table. We next introduce a jointree algorithm to compute the full atomic interventions of all nodes with respect to evidence in a Bayesian network. We further discover that an intervention has differential semantics if the intervention variables can reach the evidence in Bayesian networks and the output of the state-of-the-art algorithm is not the differentiation but the intervention of a Bayesian network if the differential nodes cannot reach any one of the evidence nodes. Finally, we present experimental results to demonstrate the efficiency of our algorithm to infer the causal effect in Bayesian networks.
Indirect Causes in Dynamic Bayesian Networks Revisited
Motzek, Alexander (Universität zu Lübeck) | Möller, Ralf (Universität zu Lübeck)
Modeling causal dependencies often demands cycles at a coarse-grained temporal scale. If Bayesian networks are to be used for modeling uncertainties, cycles are eliminated with dynamic Bayesian networks, spreading indirect dependencies over time and enforcing an infinitesimal resolution of time. Without a "causal design," i.e., without anticipating indirect influences appropriately in time, we argue that such networks return spurious results. By introducing activator random variables, we propose template fragments for modeling dynamic Bayesian networks under a causal use of time, anticipating indirect influences on a solid mathematical basis, obeying the laws of Bayesian networks.