AITopics | Directed Networks

Directed Networks

Scalable Bayesian Non-Negative Tensor Factorization for Massive Count Data

Hu, Changwei, Rai, Piyush, Chen, Changyou, Harding, Matthew, Carin, Lawrence

arXiv.org Machine Learning, Aug-18-2015

We present a Bayesian non-negative tensor factorization model for count-valued tensor data, and develop scalable inference algorithms (both batch and online) for dealing with massive tensors. Our generative model can handle overdispersed counts as well as infer the rank of the decomposition. Moreover, leveraging a reparameterization of the Poisson distribution as a multinomial facilitates conjugacy in the model and enables simple and efficient Gibbs sampling and variational Bayes (VB) inference updates, with a computational cost that depends only on the number of nonzeros in the tensor. The factors are also easy to interpret: in our model, each factor corresponds to a "topic". We develop a set of online inference algorithms that allow further scaling up the model to massive tensors, for which batch inference methods may be infeasible. We apply our framework to diverse real-world applications, such as multiway topic modeling on a scientific publications database, analyzing a political science data set, and analyzing a massive household transactions data set.
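The Poisson-to-multinomial reparameterization mentioned in the abstract rests on a standard fact: if an observed count is the sum of independent Poisson variables, then, conditioned on that sum, the latent counts are multinomial with probabilities proportional to the Poisson rates. The following is a minimal sketch of that one step, not the authors' sampler; the function name `allocate_count` and the example rates are hypothetical.

```python
import random

def allocate_count(y, rates, rng):
    """Split an observed count y across latent factors.

    Assumes y = sum_r x_r with independent x_r ~ Poisson(rates[r]).
    Conditioned on y, (x_1, ..., x_R) is multinomial with
    probabilities rates[r] / sum(rates).
    """
    counts = [0] * len(rates)
    if y == 0:
        return counts  # zero entries need no sampling at all
    # Draw y categorical samples with weights proportional to the rates.
    for r in rng.choices(range(len(rates)), weights=rates, k=y):
        counts[r] += 1
    return counts

rng = random.Random(0)
latent = allocate_count(7, [0.5, 1.5, 3.0], rng)  # hypothetical rates
empty = allocate_count(0, [0.5, 1.5, 3.0], rng)
```

Because a zero entry always yields all-zero latent counts, a sampler built on this step only has to visit the nonzero entries, which is consistent with the cost claim in the abstract.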

artificial intelligence, machine learning, tensor, (12 more...)

arXiv: 1508.04211

Country:

North America > United States (1.00)
Africa > Middle East > Libya (0.14)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Food & Agriculture (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Zero-Truncated Poisson Tensor Factorization for Massive Binary Tensors

Hu, Changwei, Rai, Piyush, Carin, Lawrence

arXiv.org Machine Learning, Aug-18-2015

We present a scalable Bayesian model for low-rank factorization of massive tensors with binary observations. The proposed model has the following key properties: (1) in contrast to the models based on the logistic or probit likelihood, using a zero-truncated Poisson likelihood for binary data allows our model to scale up in the number of ones in the tensor, which is especially appealing for massive but sparse binary tensors; (2) side-information in the form of binary pairwise relationships (e.g., an adjacency network) between objects in any tensor mode can also be leveraged, which can be especially useful in "cold-start" settings; and (3) the model admits simple Bayesian inference via batch, as well as online, MCMC; the latter allows scaling up even for dense binary data (i.e., when the number of ones in the tensor/network is also massive). In addition, non-negative factor matrices in our model provide easy interpretability, and the tensor rank can be inferred from the data. We evaluate our model on several large-scale real-world binary tensors, achieving excellent computational scalability, and also demonstrate its usefulness in leveraging side-information provided in the form of mode-network(s).
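One common way to obtain a likelihood of this kind is to treat each binary entry as the indicator that a latent Poisson count is at least one, so that P(b = 1) = 1 - exp(-rate) and the latent count behind a one follows a zero-truncated Poisson. The sketch below illustrates only that link and a simple inverse-CDF sampler under this assumption; it is not the paper's inference code, and the function names are hypothetical.

```python
import math
import random

def prob_one(rate):
    """P(b = 1) when b = 1 iff a Poisson(rate) count is >= 1."""
    return -math.expm1(-rate)  # 1 - exp(-rate), stable for small rates

def sample_zero_truncated_poisson(rate, rng):
    """Draw n >= 1 from a Poisson(rate) conditioned on n >= 1."""
    # Scale a uniform draw into the probability mass on n >= 1.
    u = rng.random() * prob_one(rate)
    n = 1
    pmf = rate * math.exp(-rate)  # untruncated P(N = 1)
    cdf = pmf
    # Walk up the CDF; the pmf check guards against float underflow.
    while u > cdf and pmf > 0.0:
        n += 1
        pmf *= rate / n
        cdf += pmf
    return n

rng = random.Random(0)
draws = [sample_zero_truncated_poisson(0.3, rng) for _ in range(5)]
```

Under this construction a zero entry implies a latent count of exactly zero, so only the ones need a latent draw, which matches the abstract's point about scaling in the number of ones.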

artificial intelligence, machine learning, tensor, (17 more...)

arXiv: 1508.0421

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Tree-Width and the Computational Complexity of MAP Approximations in Bayesian Networks

Kwisthout, Johan

Journal of Artificial Intelligence Research, Aug-17-2015

The problem of finding the most probable explanation to a designated set of variables given partial evidence (the MAP problem) is a notoriously intractable problem in Bayesian networks, both to compute exactly and to approximate. It is known, both from theoretical considerations and from practical experience, that low tree-width is typically an essential prerequisite to efficient exact computations in Bayesian networks. In this paper we investigate whether the same holds for approximating MAP. We define four notions of approximating MAP (by value, structure, rank, and expectation) and argue that all of them are intractable in general. We prove that efficient value-approximations, structure-approximations, and rank-approximations of MAP instances with high tree-width will violate the Exponential Time Hypothesis. In contrast, we show that MAP can sometimes be efficiently expectation-approximated, even in instances with high tree-width, if the most probable explanation has a high probability. We introduce the complexity class FERT, analogous to the class FPT, to capture this notion of fixed-parameter expectation-approximability. We suggest a road-map to future research that yields fixed-parameter tractable results for expectation-approximate MAP, even in graphs with high tree-width.

artificial intelligence, bayesian inference, machine learning, (19 more...)

doi: 10.1613/jair.4794

AI Access Foundation

10953

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Causal Decision Trees

Li, Jiuyong, Ma, Saisai, Le, Thuc Duy, Liu, Lin, Liu, Jixue

arXiv.org Artificial Intelligence, Aug-16-2015

Uncovering causal relationships in data is a major objective of data analytics. Causal relationships are normally discovered with designed experiments, e.g. randomised controlled trials, which, however, are expensive or infeasible to conduct in many cases. Causal relationships can also be found using some well-designed observational studies, but they require domain experts' knowledge and the process is normally time-consuming. Hence there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and they could be practical substitutes for finding causal signals in data. However, classification methods are not designed for causal discovery, and a classification method may find false causal signals and miss the true ones. In this paper, we develop a causal decision tree where nodes have causal interpretations. Our method follows a well-established causal inference framework and makes use of a classic statistical test. The method is practical for finding causal signals in large data sets.

artificial intelligence, causal relationship, machine learning, (17 more...)

doi: 10.1109/TKDE.2016.2619350

arXiv: 1508.03812

Country: North America > United States (0.28)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.75)

Unbounded Bayesian Optimization via Regularization

Shahriari, Bobak, Bouchard-Côté, Alexandre, de Freitas, Nando

arXiv.org Machine Learning, Aug-14-2015

Bayesian optimization has recently emerged as a popular and efficient tool for global optimization and hyperparameter tuning. Currently, the established Bayesian optimization practice requires a user-defined bounding box which is assumed to contain the optimizer. However, when little is known about the probed objective function, it can be difficult to prescribe such bounds. In this work we modify the standard Bayesian optimization framework in a principled way to allow automatic resizing of the search space. We introduce two alternative methods and compare them on two common synthetic benchmarking test functions as well as the tasks of tuning the stochastic gradient descent optimizer of a multi-layered perceptron and a convolutional neural network on MNIST.

artificial intelligence, machine learning, optimization, (18 more...)

arXiv: 1508.03666

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
(2 more...)

Neyman-Pearson Classification under High-Dimensional Settings

Zhao, Anqi, Feng, Yang, Wang, Lie, Tong, Xin

arXiv.org Machine Learning, Aug-14-2015

Most existing binary classification methods target the optimization of the overall classification risk and may fail to serve some real-world applications such as cancer diagnosis, where users are more concerned with the risk of misclassifying one specific class than the other. The Neyman-Pearson (NP) paradigm was introduced in this context as a novel statistical framework for handling asymmetric type I/II error priorities. It seeks classifiers with a minimal type II error and a constrained type I error under a user-specified level. This article is the first attempt to construct classifiers with guaranteed theoretical performance under the NP paradigm in high-dimensional settings. Based on the fundamental Neyman-Pearson Lemma, we used a plug-in approach to construct NP-type classifiers for Naive Bayes models. The proposed classifiers satisfy the NP oracle inequalities, which are natural NP paradigm counterparts of the oracle inequalities in classical binary classification. Besides their desirable theoretical properties, we also demonstrated their numerical advantages in prioritized error control via both simulation and real data studies.
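To make the NP constraint concrete, the sketch below picks a score threshold so that the empirical type I error on a held-out class-0 sample stays at or below a user-specified level. This is a plain empirical-quantile rule offered for illustration only; it is not the paper's plug-in construction and carries none of its oracle guarantees. The name `np_threshold` and the toy scores are hypothetical.

```python
import math

def np_threshold(class0_scores, alpha):
    """Pick an observed score threshold with empirical type I error <= alpha.

    A point is labeled class 1 when its score is strictly greater than
    the threshold, so the empirical type I error is the fraction of
    class-0 scores above it. Assumes a non-empty list of scores.
    """
    ordered = sorted(class0_scores)
    n = len(ordered)
    # At most floor(alpha * n) class-0 scores may exceed the threshold.
    allowed = min(math.floor(alpha * n), n - 1)
    return ordered[n - allowed - 1]

def type_one_error(class0_scores, threshold):
    """Fraction of class-0 scores that would be labeled class 1."""
    return sum(s > threshold for s in class0_scores) / len(class0_scores)

scores = list(range(10))            # toy class-0 scores
cutoff = np_threshold(scores, 0.2)
error = type_one_error(scores, cutoff)
```

Raising the threshold lowers the type I error at the cost of a higher type II error, which is the trade-off the NP paradigm resolves by fixing the type I level first.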

artificial intelligence, classifier, machine learning, (18 more...)

arXiv: 1508.03106

Country: North America > United States (0.92)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

A model selection approach for clustering a multinomial sequence with non-negative factorization

Lee, Nam H., Tang, Runze, Priebe, Carey E., Rosen, Michael

arXiv.org Machine Learning, Aug-14-2015

We consider the problem of clustering a sequence of multinomial observations by way of a model selection criterion. We propose a form of penalty term for the model selection procedure. Our approach subsumes both the conventional AIC and BIC criteria and extends them so that they also apply to a sequence of sparse multinomial observations, where even within the same cluster, the number of multinomial trials may be different for different observations. In addition, as a preliminary estimation step to maximum likelihood estimation, and more generally, to maximum $L_{q}$ estimation, we propose to use reduced rank projection in combination with non-negative factorization. We motivate our approach by showing that our model selection criterion and preliminary estimation step yield consistent estimates under simplifying assumptions. We also illustrate our approach through numerical experiments using real and simulated data.
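For reference, the two conventional criteria that the abstract says are subsumed can be written in a few lines. The sketch below shows only the textbook AIC and BIC, since the abstract does not give the form of the proposed penalty; the candidate log-likelihoods and parameter counts are made-up numbers for illustration.

```python
import math

def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * num_params - 2 * log_likelihood

def bic(log_likelihood, num_params, num_obs):
    """Bayesian information criterion: k log n - 2 log L."""
    return num_params * math.log(num_obs) - 2 * log_likelihood

# Hypothetical fits: (maximized log-likelihood, number of free parameters).
candidates = {"2 clusters": (-120.0, 6), "3 clusters": (-118.5, 9)}
num_obs = 50
best_by_bic = min(candidates, key=lambda name: bic(*candidates[name], num_obs))
```

Both criteria trade goodness of fit against model size; BIC penalizes extra parameters more heavily once the number of observations exceeds about seven.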

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv: 1312.7559

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

RCR: Robust Compound Regression for Robust Estimation of Errors-in-Variables Model

Han, Hao, Zhu, Wei

arXiv.org Machine Learning, Aug-12-2015

The errors-in-variables (EIV) regression model, being more realistic by accounting for measurement errors in both the dependent and the independent variables, is widely adopted in applied sciences. The traditional EIV model estimators, however, can be highly biased by outliers and other departures from the underlying assumptions. In this paper, we develop a novel nonparametric regression approach, the robust compound regression (RCR) analysis method, for the robust estimation of EIV models. We first introduce a robust and efficient estimator called least sine squares (LSS). Taking full advantage of both the new LSS method and the compound regression analysis method developed in our own group, we subsequently propose the RCR approach as a generalization of those two, which provides a robust counterpart of the entire class of the maximum likelihood estimation (MLE) solutions of the EIV model, in a 1-1 mapping. Technically, our approach gives users the flexibility to select, from a class of RCR estimates, the optimal one that satisfies a predefined regression efficiency criterion. Simulation studies and real-life examples are provided to illustrate the effectiveness of the RCR approach.

artificial intelligence, estimator, machine learning, (14 more...)

arXiv: 1508.02925

Country: North America > United States > New York (0.15)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Bayesian Dropout

Herlau, Tue, Mørup, Morten, Schmidt, Mikkel N.

arXiv.org Machine Learning, Aug-12-2015

Dropout has recently emerged as a powerful and simple method for training neural networks, preventing co-adaptation by stochastically omitting neurons. Dropout is currently not grounded in explicit modelling assumptions, which so far has precluded its adoption in Bayesian modelling. Using Bayesian entropic reasoning we show that dropout can be interpreted as optimal inference under constraints. We demonstrate this on an analytically tractable regression model, providing a Bayesian interpretation of its mechanism for regularizing and preventing co-adaptation as well as its connection to other Bayesian techniques. We also discuss two general approximate techniques for applying Bayesian dropout to general models, one based on an analytical approximation and the other on stochastic variational techniques. These techniques are then applied to a Bayesian logistic regression problem and are shown to improve performance as the model becomes more misspecified. Our framework roots dropout as a theoretically justified and practical tool for statistical modelling, allowing Bayesians to tap into the benefits of dropout training.

artificial intelligence, dropout, machine learning, (14 more...)

arXiv: 1508.02905

Country: Europe (0.68)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Alternating Minimization Algorithm with Automatic Relevance Determination for Transmission Tomography under Poisson Noise

Kaganovsky, Yan, Han, Shaobo, Degirmenci, Soysal, Politte, David G., Brady, David J., O'Sullivan, Joseph A., Carin, Lawrence

arXiv.org Machine Learning, Aug-11-2015

We propose a globally convergent alternating minimization (AM) algorithm for image reconstruction in transmission tomography, which extends automatic relevance determination (ARD) to Poisson noise models with Beer's law. The algorithm promotes solutions that are sparse in the pixel/voxel-differences domain by introducing additional latent variables, one for each pixel/voxel, and then learning these variables from the data using a hierarchical Bayesian model. Importantly, the proposed AM algorithm is free of any tuning parameters and yields image quality comparable to standard penalized likelihood methods. Our algorithm exploits optimization transfer principles which reduce the problem into parallel 1D optimization tasks (one for each pixel/voxel), making the algorithm feasible for large-scale problems. This approach considerably reduces the computational bottleneck of ARD associated with the posterior variances. Positivity constraints inherent in transmission tomography problems are also enforced. We demonstrate the performance of the proposed algorithm for x-ray computed tomography using synthetic and real-world datasets. The algorithm is shown to have much better performance than prior ARD algorithms based on approximate Gaussian noise models, even for high photon flux.
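The phrase "Poisson noise models with Beer's law" refers to the usual transmission measurement model: the expected photon count along a ray is the source intensity attenuated by the exponential of the line integral of the image, and the measured count is Poisson around that mean. The sketch below writes out that forward model and its negative log-likelihood only; it does not implement the AM or ARD steps, and the tiny system matrix and variable names are hypothetical.

```python
import math

def expected_counts(system_rows, attenuation, source_intensity):
    """Beer's law: mean_i = I0 * exp(-sum_j a_ij * x_j) for each ray i."""
    means = []
    for row in system_rows:
        line_integral = sum(a * x for a, x in zip(row, attenuation))
        means.append(source_intensity * math.exp(-line_integral))
    return means

def poisson_nll(measured, means):
    """Poisson negative log-likelihood, dropping the constant sum(log(y!))."""
    return sum(m - y * math.log(m) for y, m in zip(measured, means))

# Two rays through a two-pixel image (hypothetical geometry and values).
rows = [[1.0, 0.0], [1.0, 1.0]]
image = [0.2, 0.1]
means = expected_counts(rows, image, source_intensity=1000.0)
nll_at_truth = poisson_nll(means, means)
nll_off = poisson_nll(means, expected_counts(rows, [0.3, 0.1], 1000.0))
```

Treating the noiseless means as the measurements, the likelihood term is smallest at the true image, which is the data-fit part that a reconstruction method of this kind would combine with its prior.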

algorithm, optimization problem, upstream oil & gas, (19 more...)

arXiv: 1412.8464

Country:

North America > United States (0.92)
Asia > Middle East (0.14)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas > Upstream (0.91)
Health & Medicine > Diagnostic Medicine > Imaging (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
