AITopics

Stochastic partition processes for exchangeable graphs produce axis-aligned blocks on a product space. In relational modeling, the resulting blocks uncover the underlying interactions between two sets of entities of the relational data. Although some flexible axis-aligned partition processes, such as the Mondrian process, have been able to capture complex interacting patterns in a hierarchical fashion, they are still in short of capturing dependence between dimensions. To overcome this limitation, we propose the Ostomachion process (OP), which relaxes the cutting direction by allowing for oblique cuts. The partitions generated by an OP are convex polygons that can capture inter-dimensional dependence. The OP also exhibits interesting properties: 1) Along the time line the cutting times can be characterized by a homogeneous Poisson process, and 2) on the partition space the areas of the resulting components comply with a Dirichlet distribution. We can thus control the expected number of cuts and the expected areas of components through hyper-parameters. We adapt the reversible-jump MCMC algorithm for inferring OP partition structures. The experimental results on relational modeling and decision tree classification have validated the merit of the OP.

artificial intelligence, machine learning, partition structure, (16 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Cuong, Nguyen Viet (National University of Singapore) | Ye, Nan (Queensland University of Technology) | Lee, Wee Sun (National University of Singapore)

Robustness of Bayesian Pool-Based Active Learning Against Prior Misspecification

We study the robustness of active learning (AL) algorithms against prior misspecification: whether an algorithm achieves similar performance using a perturbed prior as compared to using the true prior. In both the average and worst cases of the maximum coverage setting, we prove that all alpha-approximate algorithms are robust (i.e., near alpha-approximate) if the utility is Lipschitz continuous in the prior. We further show that robustness may not be achieved if the utility is non-Lipschitz. This suggests we should use a Lipschitz utility for AL if robustness is required. For the minimum cost setting, we can also obtain a robustness result for approximate AL algorithms. Our results imply that many commonly used AL algorithms are robust against perturbed priors. We then propose the use of a mixture prior to alleviate the problem of prior misspecification. We analyze the robustness of the uniform mixture prior and show experimentally that it performs reasonably well in practice.

artificial intelligence, inductive learning, machine learning, (17 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
(2 more...)

Chen, Peixian (The Hong Kong University of Science and Technology) | Zhang, Nevin L. (The Hong Kong University of Science and Technology) | Poon, Leonard K. M. (The Hong Kong Institute of Education) | Chen, Zhourong (The Hong Kong University of Science and Technology)

Progressive EM for Latent Tree Models and Hierarchical Topic Detection

Hierarchical latent tree analysis (HLTA) is recently proposed as a new method for topic detection. It differs fundamentally from the LDA-based methods in terms of topic definition, topic-document relationship, and learning method. It has been shown to discover significantly more coherent topics and better topic hierarchies. However, HLTA relies on the Expectation-Maximization (EM) algorithm for parameter estimation and hence is not efficient enough to deal with large datasets. In this paper, we propose a method to drastically speed up HLTA using a technique inspired by the advances in the method of moments. Empirical experiments show that our method greatly improves the efficiency of HLTA. It is as efficient as the state-of-the-art LDA-based method for hierarchical topic detection and finds substantially better topics and topic hierarchies.

information retrieval, machine learning, natural language, (18 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.82)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.70)
(2 more...)

Maximum Margin Dirichlet Process Mixtures for Clustering

Chen, Gang (State University of New York at Buffalo) | Zhang, Haiying (Chinese Academy of Sciences) | Xiong, Caiming (Metamind Inc.)

The Dirichlet process mixtures (DPM) can automatically infer the model complexity from data. Hence it has attracted significant attention recently, and is widely used for model selection and clustering. As a generative model, it generally requires prior base distribution to learn component parameters by maximizing posterior probability. In contrast, discriminative classifiers model the conditional probability directly, and have yielded better results than generative classifiers.In this paper, we propose a maximum margin Dirichlet process mixture for clustering, which is different from the traditional DPM for parameter modeling. Our model takes a discriminative clustering approach, by maximizing a conditional likelihood to estimate parameters. In particular, we take a EM-like algorithm by leveraging Gibbs sampling algorithm for inference, which in turn can be perfectly embedded in the online maximum margin learning procedure to update model parameters. We test our model and show comparative results over the traditional DPM and other nonparametric clustering approaches.

artificial intelligence, bayesian inference, machine learning, (18 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Decoding Hidden Markov Models Faster Than Viterbi Via Online Matrix-Vector (max, +)-Multiplication

Cairo, Massimo (University of Trento) | Farina, Gabriele (Polytechnic University of Milan) | Rizzi, Romeo (University of Verona)

In this paper, we present a novel algorithm for the maximum a posteriori decoding (MAPD) of time-homogeneous Hidden Markov Models (HMM), improving the worst-case running time of the classical Viterbi algorithm by a logarithmic factor. In our approach, we interpret the Viterbi algorithm as a repeated computation of matrix-vector (max, +)-multiplications. On time-homogeneous HMMs, this computation is online: a matrix, known in advance, has to be multiplied with several vectors revealed one at a time. Our main contribution is an algorithm solving this version of matrix-vector (max,+)-multiplication in subquadratic time, by performing a polynomial preprocessing of the matrix. Employing this fast multiplication algorithm, we solve the MAPD problem in O(mn 2 /log n) time for any time-homogeneous HMM of size n and observation sequence of length m, with an extra polynomial preprocessing cost negligible for m > n . To the best of our knowledge, this is the first algorithm for the MAPD problem requiring subquadratic time per observation, under the assumption — usually verified in practice — that the transition probability matrix does not change with time.

algorithm, artificial intelligence, machine learning, (18 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: Europe > Italy (0.28)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Learning Adaptive Forecasting Models from Irregularly Sampled Multivariate Clinical Data

Liu, Zitao (University of Pittsburgh) | Hauskrecht, Milos (University of Pittsburgh)

Building accurate predictive models of clinical multivariate time series is crucial for understanding of the patient condition, the dynamics of a disease, and clinical decision making. A challenging aspect of this process is that the model should be flexible and adaptive to reflect well patient-specific temporal behaviors and this also in the case when the available patient-specific data are sparse and short span. To address this problem we propose and develop an adaptive two-stage forecasting approach for modeling multivariate, irregularly sampled clinical time series of varying lengths. The proposed model (1) learns the population trend from a collection of time series for past patients; (2) captures individual-specific short-term multivariate variability; and (3) adapts by automatically adjusting its predictions based on new observations. The proposed forecasting model is evaluated on a real-world clinical time series dataset. The results demonstrate that our approach is superior on the prediction tasks for multivariate, irregularly sampled clinical time series, and it outperforms both the population based and patient-specific time series prediction models in terms of prediction accuracy.

artificial intelligence, data mining, machine learning, (17 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.68)

Genre: Research Report (0.35)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Diagnostic Medicine (0.41)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Recognizing Complex Activities by a Probabilistic Interval-Based Model

Liu, Li (National University of Singapore) | Cheng, Li (A*STAR, Singapore) | Liu, Ye (National University of Singapore) | Jia, Yongpo (National University of Singapore) | Rosenblum, David S. (National University of Singapore)

A key challenge in complex activity recognition is the fact that a complex activity can often be performed in several different ways, with each consisting of its own configuration of atomic actions and their temporal dependencies. This leads us to define an atomic activity-based probabilistic framework that employs Allen's interval relations to represent local temporal dependencies. The framework introduces a latent variable from the Chinese Restaurant Process to explicitly characterize these unique internal configurations of a particular complex activity as a variable number of tables.It can be analytically shown that the resulting interval network satisfies the transitivity property, and as a result, all local temporal dependencies can be retained and are globally consistent.Empirical evaluations on benchmark datasets suggest our approach significantly outperforms the state-of-the-art methods.

artificial intelligence, machine learning, relation, (17 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > Singapore (0.14)

Genre: Research Report (0.34)

Industry: Consumer Products & Services > Restaurants (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Discriminative Nonparametric Latent Feature Relational Models with Data Augmentation

Chen, Bei (Tsinghua University) | Chen, Ning (Tsinghua University) | Zhu, Jun ( Tsinghua University ) | Song, Jiaming ( Tsinghua University ) | Zhang, Bo ( Tsinghua University )

We present a discriminative nonparametric latent feature relational model (LFRM) for link prediction to automatically infer the dimensionality of latent features. Under the generic RegBayes (regularized Bayesian inference) framework, we handily incorporate the prediction loss with probabilistic inference of a Bayesian model; set distinct regularization parameters for different types of links to handle the imbalance issue in real networks; and unify the analysis of both the smooth logistic log-loss and the piecewise linear hinge loss. For the nonconjugate posterior inference, we present a simple Gibbs sampler via data augmentation, without making restricting assumptions as done in variational methods. We further develop an approximate sampler using stochastic gradient Langevin dynamics to handle large networks with hundreds of thousands of entities and millions of links, orders of magnitude larger than what existing LFRM models can process. Extensive studies on various real networks show promising performance.

artificial intelligence, machine learning, prediction, (16 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Bayesian Inference of Recursive Sequences of Group Activities from Tracks

Brau, Ernesto (Boston College) | Dawson, Colin (Oberlin College) | Carrillo, Alfredo (University of Arizona) | Sidi, David (University of Arizona) | Morrison, Clayton T. (University of Arizona)

We present a probabilistic generative model for inferring a description of coordinated, recursively structured group activities at multiple levels of temporal granularity based on observations of individuals’ trajectories. The model accommodates: (1) hierarchically structured groups, (2) activities that are temporally and compositionally recursive, (3) component roles assigning different subactivity dynamics to subgroups of participants, and (4) a nonparametric Gaussian Process model of trajectories. We present an MCMC sampling framework for performing joint inference over recursive activity descriptions and assignment of trajectories to groups, integrating out continuous parameters. We demonstrate the model’s expressive power in several simulated and complex real-world scenarios from the VIRAT and UCLA Aerial Event video data sets.

artificial intelligence, machine learning, trajectory, (20 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Genre: Workflow (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Scalable Training of Markov Logic Networks Using Approximate Counting

Sarkhel, Somdeb (The University of Texas at Dallas) | Venugopal, Deepak ( The University of Memphis ) | Pham, Tuan Anh (The University of Texas at Dallas) | Singla, Parag ( Indian Institute of Technology Delhi ) | Gogate, Vibhav (The University of Texas at Dallas)

In this paper, we propose principled weight learning algorithms for Markov logic networks that can easily scale to much larger datasets and application domains than existing algorithms. The main idea in our approach is to use approximate counting techniques to substantially reduce the complexity of the most computation intensive sub-step in weight learning: computing the number of groundings of a first-order formula that evaluate to true given a truth assignment to all the random variables. We derive theoretical bounds on the performance of our new algorithms and demonstrate experimentally that they are orders of magnitude faster and achieve the same accuracy or better than existing approaches.

algorithm, artificial intelligence, machine learning, (15 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.47)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)