AITopics | Learning Graphical Models


Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)


Pseudo-likelihood methods for community detection in large sparse networks

Amini, Arash A., Chen, Aiyou, Bickel, Peter J., Levina, Elizaveta

arXiv.org Machine Learning, Nov-5-2013

Many algorithms have been proposed for fitting network models with communities, but most of them do not scale well to large networks, and often fail on sparse networks. Here we propose a new fast pseudo-likelihood method for fitting the stochastic block model for networks, as well as a variant that allows for an arbitrary degree distribution by conditioning on degrees. We show that the algorithms perform well under a range of settings, including on very sparse networks, and illustrate on the example of a network of political blogs. We also propose spectral clustering with perturbations, a method of independent interest, which works well on sparse networks where regular spectral clustering fails, and use it to provide an initial value for pseudo-likelihood. We prove that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two communities.

block model, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

doi: 10.1214/13-AOS1138

1207.234

Country:

North America > United States > California (0.46)
North America > United States > Michigan (0.28)

Genre: Research Report (0.82)

Industry: Government (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)
(2 more...)


Statistical Inference in Hidden Markov Models using $k$-segment Constraints

Titsias, Michalis K., Yau, Christopher, Holmes, Christopher C.

arXiv.org Machine Learning, Nov-5-2013

Fundamentally, the HMM is a mixture model whose mixing distribution is a finite state Markov chain (Rabiner, 1989; Cappé et al., 2005). Whilst the Markov assumptions rarely correspond to the true physical generative process, the model often adequately captures first-order properties that make it a useful approximating model for sequence data in many instances, whilst remaining tractable even for very large datasets. As a consequence, HMM-based algorithms can give highly competitive performance in many applications. Central to the tractability of HMMs is the availability of recursive algorithms that allow fundamental quantities to be computed efficiently (Baum and Petrie, 1966; Viterbi, 1967). These include the Viterbi algorithm, which computes the most probable hidden state sequence, and the forward-backward algorithm, which computes the marginal probability of a given state at a point in the sequence. Computation for the HMM has been well summarized in the comprehensive and widely read tutorial by Rabiner (1989), with a Bayesian treatment given more recently by Scott (2002). It is a testament to the completeness of these recursive methods that there have been few generic additions to the HMM toolbox since these were first described in the 1960s. However, as HMM approaches continue to be applied in increasingly diverse scientific domains and ever larger data sets, there is interest in expanding the generic toolbox available for HMM inference to encompass unmet needs. The motivation for our work is to develop mechanisms to allow the exploration of the posterior sequence space.
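The two recursions named in the abstract above can be sketched for a small discrete HMM. This is a minimal illustration of the standard Viterbi and forward-backward algorithms, not the k-segment method the paper proposes. The function names, the argument layout (`init`, `trans`, `emit`) and the toy numbers are assumptions made for this example, and the probabilities are left unscaled, so the sketch is only suitable for short sequences.

```python
def viterbi(init, trans, emit, obs):
    """Most probable hidden state path for a discrete HMM.

    init[i]     : P(z_1 = i)
    trans[i][j] : P(z_{t+1} = j | z_t = i)
    emit[i][o]  : P(x_t = o | z_t = i)
    obs         : list of observed symbol indices
    Returns (path, joint probability of that path and the observations).
    """
    n = len(init)
    # delta[i]: probability of the best path that ends in state i at this step
    delta = [init[i] * emit[i][obs[0]] for i in range(n)]
    backptrs = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: prev[i] * trans[i][j])
            ptr.append(best_i)
            delta.append(prev[best_i] * trans[best_i][j] * emit[j][o])
        backptrs.append(ptr)
    # Trace the best path backwards from the best final state
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, max(delta)


def posterior_marginals(init, trans, emit, obs):
    """Forward-backward: P(z_t = i | all observations) for every t and i."""
    n, T = len(init), len(obs)
    # Forward pass: alpha[t][i] = P(x_1..x_t, z_t = i)
    alpha = [[init[i] * emit[i][obs[0]] for i in range(n)]]
    for t in range(1, T):
        alpha.append([
            sum(alpha[t - 1][i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
            for j in range(n)
        ])
    # Backward pass: beta[t][i] = P(x_{t+1}..x_T | z_t = i)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [
            sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
            for i in range(n)
        ]
    evidence = sum(alpha[-1])  # P(x_1..x_T)
    return [[alpha[t][i] * beta[t][i] / evidence for i in range(n)] for t in range(T)]


# Toy two-state model (hypothetical numbers chosen for illustration)
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
path, prob = viterbi(init, trans, emit, [0, 1, 2])
marginals = posterior_marginals(init, trans, emit, [0, 1, 2])
```

Viterbi returns a single best sequence and forward-backward returns per-position marginals. Neither describes the rest of the posterior over sequences, which is the gap the abstract points to.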

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1080/01621459.2014.998762

1311.1189

Country:

North America > United States (0.46)
Europe > United Kingdom (0.28)

Genre: Research Report (0.63)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)


Thompson Sampling for Complex Bandit Problems

Gopalan, Aditya, Mannor, Shie, Mansour, Yishay

arXiv.org Machine Learning, Nov-3-2013

We consider stochastic multi-armed bandit problems with complex actions over a set of basic arms, where the decision maker plays a complex action rather than a basic arm in each round. The reward of the complex action is some function of the basic arms' rewards, and the feedback observed may not necessarily be the reward per-arm. For instance, when the complex actions are subsets of the arms, we may only observe the maximum reward over the chosen subset. Thus, feedback across complex actions may be coupled due to the nature of the reward function. We prove a frequentist regret bound for Thompson sampling in a very general setting involving parameter, action and observation spaces and a likelihood function over them. The bound holds for discretely-supported priors over the parameter space and without additional structural properties such as closed-form posteriors, conjugate prior structure or independence across arms. The regret bound scales logarithmically with time but, more importantly, with an improved constant that non-trivially captures the coupling across complex actions due to the structure of the rewards. As applications, we derive improved regret bounds for classes of complex bandit problems involving selecting subsets of arms, including the first nontrivial regret bounds for nonlinear MAX reward feedback from subsets.
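The setting described above (a discretely supported prior, subsets of arms as complex actions, and only the maximum reward observed) can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the basic arms are assumed to be independent Bernoulli arms, the prior is uniform over a hand-picked finite list of candidate parameter vectors whose entries lie strictly between 0 and 1, and the names `max_reward_prob` and `thompson_sampling` are hypothetical.

```python
import itertools
import random


def max_reward_prob(theta, action):
    """P(max reward over the chosen subset is 1) for independent Bernoulli arms."""
    p_all_zero = 1.0
    for arm in action:
        p_all_zero *= 1.0 - theta[arm]
    return 1.0 - p_all_zero


def thompson_sampling(candidates, true_theta, subset_size, rounds, seed=0):
    """Thompson sampling with a discrete prior and MAX feedback from subsets.

    candidates : finite list of parameter vectors (the support of the prior)
    true_theta : parameter vector used only to simulate the environment
    Returns the final posterior over candidates and the list of actions played.
    """
    rng = random.Random(seed)
    actions = list(itertools.combinations(range(len(true_theta)), subset_size))
    posterior = [1.0 / len(candidates)] * len(candidates)  # uniform prior (assumption)
    plays = []
    for _ in range(rounds):
        # 1. Sample a parameter vector from the current posterior
        theta = rng.choices(candidates, weights=posterior, k=1)[0]
        # 2. Play the subset that is best if the sampled parameter were true
        action = max(actions, key=lambda a: max_reward_prob(theta, a))
        # 3. Observe only the maximum reward over the chosen subset
        rewards = [rng.random() < true_theta[arm] for arm in action]
        y = 1 if any(rewards) else 0
        # 4. Bayes update using the likelihood of the observed maximum
        unnorm = []
        for p, c in zip(posterior, candidates):
            q = max_reward_prob(c, action)
            unnorm.append(p * (q if y == 1 else 1.0 - q))
        total = sum(unnorm)
        posterior = [u / total for u in unnorm]
        plays.append(action)
    return posterior, plays


# Three hypothetical candidate parameter vectors; the first one is the truth
candidates = [(0.2, 0.5, 0.8), (0.8, 0.5, 0.2), (0.5, 0.5, 0.5)]
posterior, plays = thompson_sampling(candidates, candidates[0], subset_size=2, rounds=200)
```

The update in step 4 needs only the likelihood of the observed maximum. It uses no conjugate prior and no per-arm feedback, which matches the generality claimed in the abstract.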

data mining, machine learning, thompson, (21 more...)

arXiv.org Machine Learning

1311.0466

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)


Parsimonious Shifted Asymmetric Laplace Mixtures

Franczak, Brian C., McNicholas, Paul D., Browne, Ryan P., Murray, Paula M.

arXiv.org Machine Learning, Nov-1-2013

A family of parsimonious shifted asymmetric Laplace mixture models is introduced. We extend the mixture of factor analyzers model to the shifted asymmetric Laplace distribution. Imposing constraints on the constituent parts of the resulting decomposed component scale matrices leads to a family of parsimonious models. An explicit two-stage parameter estimation procedure is described, and the Bayesian information criterion and the integrated completed likelihood are compared for model selection. This novel family of models is applied to real data, where it is compared to its Gaussian analogue within clustering and classification paradigms.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1311.0317

Country:

Europe (0.67)
North America > Canada > Ontario (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)


Dynamic Clustering via Asymptotics of the Dependent Dirichlet Process Mixture

Campbell, Trevor, Liu, Miao, Kulis, Brian, How, Jonathan P., Carin, Lawrence

arXiv.org Machine Learning, Nov-1-2013

This paper presents a novel algorithm, based upon the dependent Dirichlet process mixture model (DDPMM), for clustering batch-sequential data containing an unknown number of evolving clusters. The algorithm is derived via a low-variance asymptotic analysis of the Gibbs sampling algorithm for the DDPMM, and provides a hard clustering with convergence guarantees similar to those of the k-means algorithm. Empirical results from a synthetic test with moving Gaussian clusters and a test with real ADS-B aircraft trajectory data demonstrate that the algorithm requires orders of magnitude less computational time than contemporary probabilistic and hard clustering algorithms, while providing higher accuracy on the examined datasets.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1305.6659

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.37)


A dependent partition-valued process for multitask clustering and time evolving network modelling

Palla, Konstantina, Knowles, David A., Ghahramani, Zoubin

arXiv.org Machine Learning, Oct-31-2013

The fundamental aim of clustering algorithms is to partition data points. We consider tasks where the discovered partition is allowed to vary with some covariate such as space or time. One approach would be to use fragmentation-coagulation processes, but these, being Markov processes, are restricted to linear or tree structured covariate spaces. We define a partition-valued process on an arbitrary covariate space using Gaussian processes. We use the process to construct a multitask clustering model which partitions datapoints in a similar way across multiple data sources, and a time series model of network data which allows cluster assignments to vary over time. We describe sampling algorithms for inference and apply our method to defining cancer subtypes based on different types of cellular characteristics, finding regulatory modules from gene expression data from multiple human populations, and discovering time varying community structure in a social network.

artificial intelligence, data source, machine learning, (19 more...)

arXiv.org Machine Learning

1303.3265

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)


A Global Model for Concept-to-Text Generation

Konstas, I., Lapata, M.

Journal of Artificial Intelligence Research, Oct-30-2013

Concept-to-text generation refers to the task of automatically producing textual output from non-linguistic input. We present a joint model that captures content selection ("what to say") and surface realization ("how to say") in an unsupervised domain-independent fashion. Rather than breaking up the generation process into a sequence of local decisions, we define a probabilistic context-free grammar that globally describes the inherent structure of the input (a corpus of database records and text describing some of them). We recast generation as the task of finding the best derivation tree for a set of database records and describe a decoding algorithm for this framework that allows us to intersect the grammar with additional information capturing fluency and syntactic well-formedness constraints. Experimental evaluation on several domains achieves results competitive with state-of-the-art systems that use domain-specific constraints, explicit feature engineering or labeled data.

grammar, language model, proceedings, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4025

AI Access Foundation

10841


Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(22 more...)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)


Automatic Classification of Variable Stars in Catalogs with missing data

Pichara, Karim, Protopapas, Pavlos

arXiv.org Machine Learning, Oct-29-2013

We present an automatic classification method for astronomical catalogs with missing data. We use Bayesian networks, a type of probabilistic graphical model, which allow us to perform inference to predict missing values given observed data and dependency relationships between variables. To learn a Bayesian network from incomplete data, we use an iterative algorithm that utilises sampling methods and expectation maximization to estimate the distributions and probabilistic dependencies of variables from data with missing values. To test our model we use three catalogs with missing data (SAGE, 2MASS and UBVI) and one complete catalog (MACHO). We examine how classification accuracy changes when information from missing data catalogs is included, how our method compares to traditional missing data approaches, and at what computational cost. Integrating these catalogs with missing data, we find that classification of variable objects improves by a few percent and by 15% for quasar detection, while keeping the computational cost the same.

artificial intelligence, catalog, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1088/0004-637X/777/2/83

1310.7868

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)


Structured Optimal Transmission Control in Network-coded Two-way Relay Channels

Ding, Ni, Sadeghi, Parastoo, Kennedy, Rodney A.

arXiv.org Machine Learning, Oct-29-2013

This paper considers a transmission control problem in network-coded two-way relay channels (NC-TWRC), where the relay buffers random symbol arrivals from two users, and the channels are assumed to be fading. The problem is modeled by a discounted infinite horizon Markov decision process (MDP). The objective is to find a transmission control policy that simultaneously minimizes symbol delay, buffer overflow, transmission power consumption and error rate in the long run. By using the concepts of submodularity, multimodularity and L-natural convexity, we study the structure of the optimal policy found by the dynamic programming (DP) algorithm. We show that the optimal transmission policy is nondecreasing in queue occupancies and/or channel states under certain conditions, such as the chosen values of parameters in the MDP model, the channel modeling method, the modulation scheme and the preservation of stochastic dominance in the transitions of system states. The results derived in this paper can be used to relieve the high complexity of DP and facilitate real-time control.
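For readers unfamiliar with the dynamic programming step mentioned above, the following is a minimal sketch of value iteration for a generic discounted infinite horizon MDP with costs to be minimized. It does not model the relay channel from the paper. The function name `value_iteration`, the data layout and the two-state toy problem are all assumptions made for this example.

```python
def value_iteration(transitions, costs, discount, tol=1e-10, max_iter=10000):
    """Value iteration for a discounted, cost-minimizing MDP.

    transitions[s][a] : list of (next_state, probability) pairs
    costs[s][a]       : immediate cost of taking action a in state s
    discount          : discount factor in [0, 1)
    Returns (optimal values per state, greedy policy per state).
    """
    n = len(transitions)

    def q_value(s, a, values):
        # Immediate cost plus discounted expected cost-to-go
        return costs[s][a] + discount * sum(p * values[s2] for s2, p in transitions[s][a])

    values = [0.0] * n
    for _ in range(max_iter):
        new_values = [
            min(q_value(s, a, values) for a in range(len(transitions[s])))
            for s in range(n)
        ]
        diff = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if diff < tol:  # stop once the Bellman update no longer changes the values
            break
    policy = [
        min(range(len(transitions[s])), key=lambda a: q_value(s, a, values))
        for s in range(n)
    ]
    return values, policy


# Hypothetical two-state, two-action problem with deterministic transitions
transitions = [
    [[(0, 1.0)], [(1, 1.0)]],  # state 0: action 0 stays, action 1 moves to state 1
    [[(1, 1.0)], [(0, 1.0)]],  # state 1: action 0 stays, action 1 moves to state 0
]
costs = [[1.0, 1.5], [0.0, 0.5]]
values, policy = value_iteration(transitions, costs, discount=0.5)
```

Each sweep evaluates every state-action pair, so the cost grows quickly with the size of the state space. Structural results such as a monotone optimal policy are one way to narrow that search, which is the use the abstract suggests.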

artificial intelligence, machine learning, optimization problem, (15 more...)

arXiv.org Machine Learning

1310.7679

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)


Trading USDCHF filtered by Gold dynamics via HMM coupling

Lee, Donny

arXiv.org Machine Learning, Oct-29-2013

We devise a USDCHF trading strategy using the dynamics of gold as a filter. Our strategy involves modelling both USDCHF and gold using a coupled hidden Markov model (CHMM). The observations will be indicators, RSI and CCI, which will be used as triggers for our trading signals. Upon decoding the model in each iteration, we can get the next most probable state and the next most probable observation. Hopefully by taking advantage of intermarket analysis and the Markov property implicit in the model, trading with these most probable values will produce profitable results.

artificial intelligence, chmm, machine learning, (16 more...)

arXiv.org Machine Learning

1308.09

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
