AITopics | Directed Networks

Collaborating Authors

Directed Networks

News Overviews Instructional Materials AI-Alerts Classics

Exploring Algorithmic Limits of Matrix Rank Minimization under Affine Constraints

arXiv.org Machine LearningJul-1-2015

Many applications require recovering a matrix of minimal rank within an affine constraint set, with matrix completion a notable special case. Because the problem is NP-hard in general, it is common to replace the matrix rank with the nuclear norm, which acts as a convenient convex surrogate. While elegant theoretical conditions elucidate when this replacement is likely to be successful, they are highly restrictive and convex algorithms fail when the ambient rank is too high or when the constraint set is poorly structured. Non-convex alternatives fare somewhat better when carefully tuned; however, convergence to locally optimal solutions remains a continuing source of failure. Against this backdrop we derive a deceptively simple and parameter-free probabilistic PCA-like algorithm that is capable, over a wide battery of empirical tests, of successful recovery even at the theoretical limit where the number of measurements equal the degrees of freedom in the unknown low-rank matrix. Somewhat surprisingly, this is possible even when the affine constraint set is highly ill-conditioned. While proving general recovery guarantees remains evasive for non-convex algorithms, Bayesian-inspired or otherwise, we nonetheless show conditions whereby the underlying cost function has a unique stationary point located at the global optimum; no existing cost function we are aware of satisfies this same property. We conclude with a simple computer vision application involving image rectification and a standard collaborative filtering benchmark.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1406.2504

Country: North America (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)

Add feedback

Selective Inference and Learning Mixed Graphical Models

Lee, Jason D.

arXiv.org Machine LearningJun-30-2015

This thesis studies two problems in modern statistics. First, we study selective inference, or inference for hypothesis that are chosen after looking at the data. The motiving application is inference for regression coefficients selected by the lasso. We present the Condition-on-Selection method that allows for valid selective inference, and study its application to the lasso, and several other selection algorithms. In the second part, we consider the problem of learning the structure of a pairwise graphical model over continuous and discrete variables. We present a new pairwise model for graphical models with both continuous and discrete variables that is amenable to structure learning. In previous work, authors have considered structure learning of Gaussian graphical models and structure learning of discrete models. Our approach is a natural generalization of these two lines of work to the mixed case. The penalization scheme involves a novel symmetric use of the group-lasso norm and follows naturally from a particular parametrization of the model. We provide conditions under which our estimator is model selection consistent in the high-dimensional regime.

artificial intelligence, confidence interval, machine learning, (16 more...)

arXiv.org Machine Learning

1507.00039

Country: North America > United States (0.45)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

On the Equivalence of Factorized Information Criterion Regularization and the Chinese Restaurant Process Prior

Li, Shaohua

arXiv.org Machine LearningJun-30-2015

Factorized Information Criterion (FIC) is a recently developed information criterion, based on which a novel model selection methodology, namely Factorized Asymptotic Bayesian (FAB) Inference, has been developed and successfully applied to various hierarchical Bayesian models. The Dirichlet Process (DP) prior, and one of its well known representations, the Chinese Restaurant Process (CRP), derive another line of model selection methods. FIC can be viewed as a prior distribution over the latent variable configurations. Under this view, we prove that when the parameter dimensionality $D_{c}=2$, FIC is equivalent to CRP. We argue that when $D_{c}>2$, FIC avoids an inherent problem of DP/CRP, i.e. the data likelihood will dominate the impact of the prior, and thus the model selection capability will weaken as $D_{c}$ increases. However, FIC overestimates the data likelihood. As a result, FIC may be overly biased towards models with less components. We propose a natural generalization of FIC, which finds a middle ground between CRP and FIC, and may yield more accurate model selection results than FIC.

artificial intelligence, information criterion regularization, machine learning, (15 more...)

arXiv.org Machine Learning

1506.09068

Genre: Research Report (0.71)

Industry: Consumer Products & Services > Restaurants (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Add feedback

Divide-and-Conquer with Sequential Monte Carlo

Lindsten, Fredrik, Johansen, Adam M., Naesseth, Christian A., Kirkpatrick, Bonnie, Schön, Thomas B., Aston, John, Bouchard-Côté, Alexandre

arXiv.org Machine LearningJun-30-2015

We propose a novel class of Sequential Monte Carlo (SMC) algorithms, appropriate for inference in probabilistic graphical models. This class of algorithms adopts a divide-and-conquer approach based upon an auxiliary tree-structured decomposition of the model of interest, turning the overall inferential task into a collection of recursively solved sub-problems. The proposed method is applicable to a broad class of probabilistic graphical models, including models with loops. Unlike a standard SMC sampler, the proposed Divide-and-Conquer SMC employs multiple independent populations of weighted particles, which are resampled, merged, and propagated as the method progresses. We illustrate empirically that this approach can outperform standard methods in terms of the accuracy of the posterior expectation and marginal likelihood approximations. Divide-and-Conquer SMC also opens up novel parallel implementation options and the possibility of concentrating the computational effort on the most challenging sub-problems. We demonstrate its performance on a Markov random field and on a hierarchical logistic regression problem.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

doi: 10.1080/10618600.2016.1237363

1406.4993

Country:

Europe (0.67)
North America > United States > New York (0.46)

Genre: Research Report > New Finding (0.66)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)
(2 more...)

Add feedback

Probabilistic Inference Techniques for Scalable Multiagent Decision Making

Kumar, Akshat, Zilberstein, Shlomo, Toussaint, Marc

Journal of Artificial Intelligence ResearchJun-29-2015

Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. However, the complexity of these models---NEXP-Complete even for two agents---has limited their scalability. We present a promising new class of approximation algorithms by developing novel connections between multiagent planning and machine learning. We show how the multiagent planning problem can be reformulated as inference in a mixture of dynamic Bayesian networks (DBNs). This planning-as-inference approach paves the way for the application of efficient inference techniques in DBNs to multiagent decision making. To further improve scalability, we identify certain conditions that are sufficient to extend the approach to multiagent systems with dozens of agents. Specifically, we show that the necessary inference within the expectation-maximization framework can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We further show that a number of existing multiagent planning models satisfy these conditions. Experiments on large planning benchmarks confirm the benefits of our approach in terms of runtime and scalability with respect to existing techniques.

agent, algorithm, zilberstein, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4649

AI Access Foundation

10944

Journal of Artificial Intelligence Research

Country:

Asia > Middle East > Jordan (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Asia > Singapore (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Non-Normal Mixtures of Experts

Chamroukhi, Faicel

arXiv.org Machine LearningJun-28-2015

Mixture of Experts (MoE) is a popular framework for modeling heterogeneity in data for regression, classification and clustering. For continuous data which we consider here in the context of regression and cluster analysis, MoE usually use normal experts, that is, expert components following the Gaussian distribution. However, for a set of data containing a group or groups of observations with asymmetric behavior, heavy tails or atypical observations, the use of normal experts may be unsuitable and can unduly affect the fit of the MoE model. In this paper, we introduce new non-normal mixture of experts (NNMoE) which can deal with these issues regarding possibly skewed, heavy-tailed data and with outliers. The proposed models are the skew-normal MoE and the robust $t$ MoE and skew $t$ MoE, respectively named SNMoE, TMoE and STMoE. We develop dedicated expectation-maximization (EM) and expectation conditional maximization (ECM) algorithms to estimate the parameters of the proposed models by monotonically maximizing the observed data log-likelihood. We describe how the presented models can be used in prediction and in model-based clustering of regression data. Numerical experiments carried out on simulated data show the effectiveness and the robustness of the proposed models in terms modeling non-linear regression functions as well as in model-based clustering. Then, to show their usefulness for practical applications, the proposed models are applied to the real-world data of tone perception for musical data analysis, and the one of temperature anomalies for the analysis of climate change data.

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv.org Machine Learning

1506.06707

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.93)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Nonparametric Estimation of Band-limited Probability Density Functions

Agarwal, Rahul, Chen, Zhe, Sarma, Sridevi V.

arXiv.org Machine LearningJun-28-2015

In this paper, a nonparametric maximum likelihood (ML) estimator for band-limited (BL) probability density functions (pdfs) is proposed. The BLML estimator is consistent and computationally efficient. To compute the BLML estimator, three approximate algorithms are presented: a binary quadratic programming (BQP) algorithm for medium scale problems, a Trivial algorithm for large-scale problems that yields a consistent estimate if the underlying pdf is strictly positive and BL, and a fast implementation of the Trivial algorithm that exploits the band-limited assumption and the Nyquist sampling theorem ("BLMLQuick"). All three BLML estimators outperform kernel density estimation (KDE) algorithms (adaptive and higher order KDEs) with respect to the mean integrated squared error for data generated from both BL and infinite-band pdfs. Further, the BLMLQuick estimate is remarkably faster than the KD algorithms. Finally, the BLML method is applied to estimate the conditional intensity function of a neuronal spike train (point process) recorded from a rat's entorhinal cortex grid cell, for which it outperforms state-of-the-art estimators used in neuroscience.

artificial intelligence, estimator, machine learning, (17 more...)

arXiv.org Machine Learning

1503.06236

Country: North America > United States (0.46)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Modelling of directional data using Kent distributions

Kasarapu, Parthan

arXiv.org Machine LearningJun-26-2015

The modelling of data on a spherical surface requires the consideration of directional probability distributions. To model asymmetrically distributed data on a three-dimensional sphere, Kent distributions are often used. The moment estimates of the parameters are typically used in modelling tasks involving Kent distributions. However, these lack a rigorous statistical treatment. The focus of the paper is to introduce a Bayesian estimation of the parameters of the Kent distribution which has not been carried out in the literature, partly because of its complex mathematical form. We employ the Bayesian information-theoretic paradigm of Minimum Message Length (MML) to bridge this gap and derive reliable estimators. The inferred parameters are subsequently used in mixture modelling of Kent distributions. The problem of inferring the suitable number of mixture components is also addressed using the MML criterion. We demonstrate the superior performance of the derived MML-based parameter estimates against the traditional estimators. We apply the MML principle to infer mixtures of Kent distributions to model empirical data corresponding to protein conformations. We demonstrate the effectiveness of Kent models to act as improved descriptors of protein structural data as compared to commonly used von Mises-Fisher distributions.

artificial intelligence, fb 5, machine learning, (18 more...)

arXiv.org Machine Learning

1506.08105

Country: North America > United States (0.93)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

An Empirical Study of Stochastic Variational Algorithms for the Beta Bernoulli Process

Shah, Amar, Knowles, David A., Ghahramani, Zoubin

arXiv.org Machine LearningJun-26-2015

Stochastic variational inference (SVI) is emerging as the most promising candidate for scaling inference in Bayesian probabilistic models to large datasets. However, the performance of these methods has been assessed primarily in the context of Bayesian topic models, particularly latent Dirichlet allocation (LDA). Deriving several new algorithms, and using synthetic, image and genomic datasets, we investigate whether the understanding gleaned from LDA applies in the setting of sparse latent factor models, specifically beta process factor analysis (BPFA). We demonstrate that the big picture is consistent: using Gibbs sampling within SVI to maintain certain posterior dependencies is extremely effective. However, we find that different posterior dependencies are important in BPFA relative to LDA. Particularly, approximations able to model intra-local variable dependence perform best.

approximation, machine learning, natural language, (15 more...)

arXiv.org Machine Learning

1506.0818

Country:

North America > United States (0.46)
Europe (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Fairness-Aware Learning with Restriction of Universal Dependency using f-Divergences

Fukuchi, Kazuto, Sakuma, Jun

arXiv.org Machine LearningJun-25-2015

Fairness-aware learning is a novel framework for classification tasks. Like regular empirical risk minimization (ERM), it aims to learn a classifier with a low error rate, and at the same time, for the predictions of the classifier to be independent of sensitive features, such as gender, religion, race, and ethnicity. Existing methods can achieve low dependencies on given samples, but this is not guaranteed on unseen samples. The existing fairness-aware learning algorithms employ different dependency measures, and each algorithm is specifically designed for a particular one. Such diversity makes it difficult to theoretically analyze and compare them. In this paper, we propose a general framework for fairness-aware learning that uses f-divergences and that covers most of the dependency measures employed in the existing methods. We introduce a way to estimate the f-divergences that allows us to give a unified analysis for the upper bound of the estimation error; this bound is tighter than that of the existing convergence rate analysis of the divergence estimation. With our divergence estimate, we propose a fairness-aware learning algorithm, and perform a theoretical analysis of its generalization error. Our analysis reveals that, under mild assumptions and even with enforcement of fairness, the generalization error of our method is $O(\sqrt{1/n})$, which is the same as that of the regular ERM. In addition, and more importantly, we show that, for any f-divergence, the upper bound of the estimation error of the divergence is $O(\sqrt{1/n})$. This indicates that our fairness-aware learning algorithm guarantees low dependencies on unseen samples for any dependency measure represented by an f-divergence.

artificial intelligence, dependency, machine learning, (15 more...)

arXiv.org Machine Learning

1506.07721

Country: Asia (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Add feedback