AITopics | Undirected Networks

Collaborating Authors

Undirected Networks

News Overviews Instructional Materials AI-Alerts Classics

On Graphical Models via Univariate Exponential Family Distributions

Yang, Eunho, Ravikumar, Pradeep, Allen, Genevera I., Liu, Zhandong

arXiv.org Machine LearningSep-5-2015

Undirected graphical models, or Markov networks, are a popular class of statistical models, used in a wide variety of applications. Popular instances of this class include Gaussian graphical models and Ising models. In many settings, however, it might not be clear which subclass of graphical models to use, particularly for non-Gaussian and non-categorical data. In this paper, we consider a general sub-class of graphical models where the node-wise conditional distributions arise from exponential families. This allows us to derive multivariate graphical model distributions from univariate exponential family distributions, such as the Poisson, negative binomial, and exponential distributions. Our key contributions include a class of M-estimators to fit these graphical model distributions; and rigorous statistical analysis showing that these M-estimators recover the true graphical model structure exactly, with high probability. We provide examples of genomic and proteomic networks learned via instances of our class of graphical models derived from Poisson and exponential distributions.

artificial intelligence, graphical model, machine learning, (14 more...)

arXiv.org Machine Learning

1301.4183

Country: North America > United States > Texas (0.46)

Genre:

Research Report > New Finding (0.45)
Research Report > Experimental Study (0.45)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Particle approximations of the score and observed information matrix for parameter estimation in state space models with linear computational cost

Nemeth, Christopher, Fearnhead, Paul, Mihaylova, Lyudmila

arXiv.org Machine LearningSep-4-2015

Poyiadjis et al. (2011) show how particle methods can be used to estimate both the score and the observed information matrix for state space models. These methods either suffer from a computational cost that is quadratic in the number of particles, or produce estimates whose variance increases quadratically with the amount of data. This paper introduces an alternative approach for estimating these terms at a computational cost that is linear in the number of particles. The method is derived using a combination of kernel density estimation, to avoid the particle degeneracy that causes the quadratically increasing variance, and Rao-Blackwellisation. Crucially, we show the method is robust to the choice of bandwidth within the kernel density estimation, as it has good asymptotic properties regardless of this choice. Our estimates of the score and observed information matrix can be used within both online and batch procedures for estimating parameters for state space models. Empirical results show improved parameter estimates compared to existing methods at a significantly reduced computational cost. Supplementary materials including code are available.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1306.0735

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

Quasi-Newton particle Metropolis-Hastings

Dahlin, Johan, Lindsten, Fredrik, Schön, Thomas B.

arXiv.org Machine LearningSep-2-2015

Particle Metropolis-Hastings enables Bayesian parameter inference in general nonlinear state space models (SSMs). However, in many implementations a random walk proposal is used and this can result in poor mixing if not tuned correctly using tedious pilot runs. Therefore, we consider a new proposal inspired by quasi-Newton algorithms that may achieve similar (or better) mixing with less tuning. An advantage compared to other Hessian based proposals, is that it only requires estimates of the gradient of the log-posterior. A possible application is parameter inference in the challenging class of SSMs with intractable likelihoods. We exemplify this application and the benefits of the new proposal by modelling log-returns of future contracts on coffee by a stochastic volatility model with $\alpha$-stable observations.

artificial intelligence, machine learning, proposal, (16 more...)

arXiv.org Machine Learning

doi: 10.1016/j.ifacol.2015.12.258

1502.03656

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

Add feedback

On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence

Korda, Nathaniel, Prashanth, L. A.

arXiv.org Machine LearningSep-1-2015

We provide non-asymptotic bounds for the well-known temporal difference learning algorithm TD(0) with linear function approximators. These include high-probability bounds as well as bounds in expectation. Our analysis suggests that a step-size inversely proportional to the number of iterations cannot guarantee optimal rate of convergence unless we assume (partial) knowledge of the stationary distribution for the Markov chain underlying the policy considered. We also provide bounds for the iterate averaged TD(0) variant, which gets rid of the step-size dependency while exhibiting the optimal rate of convergence. Furthermore, we propose a variant of TD(0) with linear approximators that incorporates a centering sequence, and establish that it exhibits an exponential rate of convergence in expectation. We demonstrate the usefulness of our bounds on two synthetic experimental settings.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Machine Learning

1411.3224

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

Relax but stay in control: from value to algorithms for online Markov decision processes

Guan, Peng, Raginsky, Maxim, Willett, Rebecca

arXiv.org Machine LearningAug-31-2015

Online learning algorithms are designed to perform in non-stationary environments, but generally there is no notion of a dynamic state to model constraints on current and future actions as a function of past actions. State-based models are common in stochastic control settings, but commonly used frameworks such as Markov Decision Processes (MDPs) assume a known stationary environment. In recent years, there has been a growing interest in combining the above two frameworks and considering an MDP setting in which the cost function is allowed to change arbitrarily after each time step. However, most of the work in this area has been algorithmic: given a problem, one would develop an algorithm almost from scratch. Moreover, the presence of the state and the assumption of an arbitrarily varying environment complicate both the theoretical analysis and the development of computationally efficient methods. This paper describes a broad extension of the ideas proposed by Rakhlin et al. to give a general framework for deriving algorithms in an MDP setting with arbitrarily changing costs. This framework leads to a unifying view of existing methods and provides a general procedure for constructing new ones. Several new methods are presented, and one of them is shown to have important advantages over a similar method developed from scratch via an online version of approximate dynamic programming.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Machine Learning

1310.73

Country: North America > United States > Wisconsin (0.27)

Genre:

Research Report (0.50)
Workflow (0.45)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Add feedback

Bayesian Hypothesis Testing for Block Sparse Signal Recovery

Korki, Mehdi, Zayyani, Hadi, Zhang, Jingxin

arXiv.org Machine LearningAug-22-2015

This letter presents a novel Block Bayesian Hypothesis Testing Algorithm (Block-BHTA) for reconstructing block sparse signals with unknown block structures. The Block-BHTA comprises the detection and recovery of the supports, and the estimation of the amplitudes of the block sparse signal. The support detection and recovery is performed using a Bayesian hypothesis testing. Then, based on the detected and reconstructed supports, the nonzero amplitudes are estimated by linear MMSE. The effectiveness of Block-BHTA is demonstrated by numerical experiments.

artificial intelligence, block-bhta, machine learning, (14 more...)

arXiv.org Machine Learning

1508.05495

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Add feedback

On Monotonicity of the Optimal Transmission Policy in Cross-layer Adaptive m-QAM Modulation

Ding, Ni, Sadeghi, Parastoo, Kennedy, Rodney A.

arXiv.org Machine LearningAug-20-2015

This paper considers a cross-layer adaptive modulation system that is modeled as a Markov decision process (MDP). We study how to utilize the monotonicity of the optimal transmission policy to relieve the computational complexity of dynamic programming (DP). In this system, a scheduler controls the bit rate of the m-quadrature amplitude modulation (m-QAM) in order to minimize the long-term losses incurred by the queue overflow in the data link layer and the transmission power consumption in the physical layer. The work is done in two steps. Firstly, we observe the L-natural-convexity and submodularity of DP to prove that the optimal policy is always nondecreasing in queue occupancy/state and derive the sufficient condition for it to be nondecreasing in both queue and channel states. We also show that, due to the L-natural-convexity of DP, the variation of the optimal policy in queue state is restricted by a bounded marginal effect: The increment of the optimal policy between adjacent queue states is no greater than one. Secondly, we use the monotonicity results to present two low complexity algorithms: monotonic policy iteration (MPI) based on L-natural-convexity and discrete simultaneous perturbation stochastic approximation (DSPSA). We run experiments to show that the time complexity of MPI based on L-natural-convexity is much lower than that of DP and the conventional MPI that is based on submodularity and DSPSA is able to adaptively track the optimal policy when the system parameters change.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1508.05383

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Listen, Attend and Spell

Chan, William, Jaitly, Navdeep, Le, Quoc V., Vinyals, Oriol

arXiv.org Machine LearningAug-19-2015

We present Listen, Attend and Spell (LAS), a neural network that learns to transcribe speech utterances to characters. Unlike traditional DNN-HMM models, this model learns all the components of a speech recognizer jointly. Our system has two components: a listener and a speller. The listener is a pyramidal recurrent network encoder that accepts filter bank spectra as inputs. The speller is an attention-based recurrent network decoder that emits characters as outputs. The network produces character sequences without making any independence assumptions between the characters. This is the key improvement of LAS over previous end-to-end CTC models. On a subset of the Google voice search task, LAS achieves a word error rate (WER) of 14.1% without a dictionary or a language model, and 10.3% with language model rescoring over the top 32 beams. By comparison, the state-of-the-art CLDNN-HMM model achieves a WER of 8.0%.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Machine Learning

1508.01211

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

A Dictionary Learning Approach for Factorial Gaussian Models

Subakan, Y. Cem, Traa, Johannes, Smaragdis, Paris, Stein, Noah

arXiv.org Machine LearningAug-18-2015

In this paper, we develop a parameter estimation method for factorially parametrized models such as Factorial Gaussian Mixture Model and Factorial Hidden Markov Model. Our contributions are two-fold. First, we show that the emission matrix of the standard Factorial Model is unidentifiable even if the true assignment matrix is known. Secondly, we address the issue of identifiability by making a one component sharing assumption and derive a parameter learning algorithm for this case. Our approach is based on a dictionary learning problem of the form $X = O R$, where the goal is to learn the dictionary $O$ given the data matrix $X$. We argue that due to the specific structure of the activation matrix $R$ in the shared component factorial mixture model, and an incoherence assumption on the shared component, it is possible to extract the columns of the $O$ matrix without the need for alternating between the estimation of $O$ and $R$.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Machine Learning

1508.04486

Country: North America > United States (0.69)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Partial Optimality by Pruning for MAP-Inference with General Graphical Models

Swoboda, Paul, Shekhovtsov, Alexander, Kappes, Jörg Hendrik, Schnörr, Christoph, Savchynskyy, Bogdan

arXiv.org Artificial IntelligenceAug-18-2015

We consider the energy minimization problem for undirected graphical models, also known as MAP-inference problem for Markov random fields which is NP-hard in general. We propose a novel polynomial time algorithm to obtain a part of its optimal non-relaxed integral solution. Our algorithm is initialized with variables taking integral values in the solution of a convex relaxation of the MAP-inference problem and iteratively prunes those, which do not satisfy our criterion for partial optimality. We show that our pruning strategy is in a certain sense theoretically optimal. Also empirically our method outperforms previous approaches in terms of the number of persistently labelled variables. The method is very general, as it is applicable to models with arbitrary factors of an arbitrary order and can employ any solver for the considered relaxed problem. Our method's runtime is determined by the runtime of the convex relaxation solver for the MAP-inference problem.

algorithm 1, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1410.6641

Country:

North America > United States (0.46)
Europe (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback