AITopics | Learning Graphical Models

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)
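As a brief illustration of what it means for a graph to express the conditional dependence structure, the standard factorisation for a directed acyclic graph (a Bayesian network) can be sketched as follows. The symbols x_i and pa(x_i), the parents of x_i in the graph, are notation introduced here and are not part of the quoted definition; undirected models factorise differently.

```latex
p(x_1, \dots, x_n) = \prod_{i=1}^{n} p\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)
```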

Learning in Unlabeled Networks - An Active Learning and Inference Approach

Kajdanowicz, Tomasz, Michalski, Radosław, Musiał, Katarzyna, Kazienko, Przemysław

arXiv.org Machine Learning, Oct-5-2015

The task of determining the labels of all network nodes based on knowledge of the network structure and the labels of some training subset of nodes is called within-network classification. It may happen that none of the node labels is known and, additionally, that there is no information about the number of classes to which nodes can be assigned. In such a case a subset of nodes has to be selected for initial label acquisition. The question that arises is: "labels of which nodes should be collected and used for learning in order to provide the best classification accuracy for the whole network?". Active learning and inference is a practical framework to study this problem. A set of methods for active learning and inference for within-network classification is proposed and validated. The utility score calculation for each node based on network structure is the first step in the process. The scores make it possible to rank the nodes. Based on the ranking, a set of nodes for which the labels are acquired is selected (e.g. by taking the top or bottom N from the ranking). The new measure-neighbour methods proposed in the paper suggest not obtaining labels of nodes from the ranking but rather acquiring labels of their neighbours. The paper examines 29 distinct formulations of utility score and selection methods, reporting their impact on the results of two collective classification algorithms: Iterative Classification Algorithm and Loopy Belief Propagation. We argue that the accuracy of the presented methods depends on the structural properties of the examined network. We claim that measure-neighbour methods will work better than the regular methods for networks with a higher clustering coefficient and worse than regular methods for networks with a low clustering coefficient. According to our hypothesis, the clustering coefficient can be used to recommend an appropriate active learning and inference method.
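The score, rank, and select pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: node degree stands in for the utility score, and the function names, the toy graph, and the rule for walking neighbours are all assumptions made here.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets


def degree_utility(graph: Graph) -> Dict[str, float]:
    """Hypothetical utility score: node degree (an assumption for illustration)."""
    return {node: float(len(nbrs)) for node, nbrs in graph.items()}


def rank_nodes(scores: Dict[str, float]) -> List[str]:
    """Rank nodes by descending score, breaking ties by node name."""
    return sorted(scores, key=lambda v: (-scores[v], v))


def select_top_n(scores: Dict[str, float], n: int) -> List[str]:
    """Regular selection: acquire labels for the top-n nodes of the ranking."""
    return rank_nodes(scores)[:n]


def select_measure_neighbour(graph: Graph, scores: Dict[str, float], n: int) -> List[str]:
    """Measure-neighbour style selection (assumed variant): walk the ranking
    and acquire labels of the ranked nodes' neighbours, not the nodes themselves."""
    chosen: List[str] = []
    seen: Set[str] = set()
    for node in rank_nodes(scores):
        for nbr in sorted(graph[node]):
            if nbr not in seen:
                seen.add(nbr)
                chosen.append(nbr)
                if len(chosen) == n:
                    return chosen
    return chosen


# Toy graph, invented for this example.
example_graph: Graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a", "d"},
    "d": {"a", "c"},
}
example_scores = degree_utility(example_graph)
```

With this toy graph, `select_top_n(example_scores, 2)` picks the two highest-degree nodes, while `select_measure_neighbour(example_graph, example_scores, 2)` picks neighbours of the top-ranked node instead.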

artificial intelligence, machine learning, node, (16 more...)

arXiv.org Machine Learning

1510.0127

Country:

Europe (0.92)
North America > United States (0.67)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Improved Estimation of Class Prior Probabilities through Unlabeled Data

Matloff, Norman

arXiv.org Machine Learning, Oct-5-2015

Work in the classification literature has shown that in computing a classification function, one need not know the class membership of all observations in the training set; the unlabeled observations still provide information on the marginal distribution of the feature set, and can thus contribute to increased classification accuracy for future observations. The present paper will show that this scheme can also be used for the estimation of class prior probabilities, which would be very useful in applications in which it is difficult or expensive to determine class membership. Both parametric and nonparametric estimators are developed. Asymptotic distributions of the estimators are derived, and it is proven that the use of the unlabeled observations does reduce asymptotic variance. This methodology is also extended to the estimation of subclass probabilities.

artificial intelligence, machine learning, unlabeled data, (17 more...)

arXiv.org Machine Learning

1510.01422

Country: North America > United States > California (0.68)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)
(2 more...)

Convex Modeling of Interactions with Strong Heredity

Haris, Asad, Witten, Daniela, Simon, Noah

arXiv.org Machine Learning, Oct-3-2015

We consider the task of fitting a regression model involving interactions among a potentially large set of covariates, in which we wish to enforce strong heredity. We propose FAMILY, a very general framework for this task. Our proposal is a generalization of several existing methods, such as VANISH [Radchenko and James, 2010], hierNet [Bien et al., 2013], the all-pairs lasso, and the lasso using only main effects. It can be formulated as the solution to a convex optimization problem, which we solve using an efficient alternating directions method of multipliers (ADMM) algorithm. This algorithm has guaranteed convergence to the global optimum, can be easily specialized to any convex penalty function of interest, and allows for a straightforward extension to the setting of generalized linear models. We derive an unbiased estimator of the degrees of freedom of FAMILY, and explore its performance in a simulation study and on an HIV sequence data set.

artificial intelligence, interaction, machine learning, (19 more...)

arXiv.org Machine Learning

doi: 10.1080/10618600.2015.1067217

1410.3517

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Symbol Emergence in Robotics: A Survey

Taniguchi, Tadahiro, Nagai, Takayuki, Nakamura, Tomoaki, Iwahashi, Naoto, Ogata, Tetsuya, Asoh, Hideki

arXiv.org Artificial Intelligence, Sep-29-2015

Humans can learn the use of language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans can form a symbol system and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine-learning methods that can learn the use of language through embodied multimodal interaction with their environment and other systems. Understanding human social interactions and developing a robot that can smoothly communicate with human users in the long term requires an understanding of the dynamics of symbol systems and is crucially important. The embodied cognition and social interaction of participants gradually change a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER is a constructive approach towards an emergent symbol system. The emergent symbol system is socially self-organized through both semiotic communications and physical interactions with autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe some state-of-the-art research topics concerning SER, e.g., multimodal categorization, word discovery, and a double articulation analysis, that enable a robot to obtain words and their embodied meanings from raw sensory-motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions of research in SER.

artificial intelligence, machine learning, symbol system, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1080/01691864.2016.1164622

1509.08973

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > Middle East > Jordan (0.04)
(9 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Learning without Recall: A Case for Log-Linear Learning

Rahimian, Mohammad Amin, Jadbabaie, Ali

arXiv.org Machine Learning, Sep-29-2015

We analyze a model of learning and belief formation in networks in which agents follow Bayes rule yet they do not recall their history of past observations and cannot reason about how other agents' beliefs are formed. They do so by making rational inferences about their observations which include a sequence of independent and identically distributed private signals as well as the beliefs of their neighboring agents at each time. Fully rational agents would successively apply Bayes rule to the entire history of observations. This leads to forebodingly complex inferences due to lack of knowledge about the global network structure that causes those observations. To address these complexities, we consider a Learning without Recall model, which in addition to providing a tractable framework for analyzing the behavior of rational agents in social networks, can also provide a behavioral foundation for the variety of non-Bayesian update rules in the literature. We present the implications of various choices for time-varying priors of such agents and how this choice affects learning and its rate.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1509.0899

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Learning dynamic Boltzmann machines with spike-timing dependent plasticity

Osogami, Takayuki, Otsuka, Makoto

arXiv.org Machine Learning, Sep-29-2015

We propose a particularly structured Boltzmann machine, which we refer to as a dynamic Boltzmann machine (DyBM), as a stochastic model of a multi-dimensional time-series. The DyBM can have infinitely many layers of units but allows exact and efficient inference and learning when its parameters have a proposed structure. This proposed structure is motivated by postulates and observations, from biological neural networks, that the synaptic weight is strengthened or weakened, depending on the timing of spikes (i.e., spike-timing dependent plasticity or STDP). We show that the learning rule of updating the parameters of the DyBM in the direction of maximizing the likelihood of given time-series can be interpreted as STDP with long-term potentiation and long-term depression. The learning rule has a guarantee of convergence and can be performed in a distributed manner (i.e., local in space) with limited memory (i.e., local in time).

artificial intelligence, boltzmann machine, machine learning, (17 more...)

arXiv.org Machine Learning

1509.08634

Country: Asia > Japan (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

CiteSeerX: AI in a Digital Library Search Engine

AI Magazine, Sep-28-2015

Since then, the project has been directed by C. Lee Giles. While it is challenging to rebuild a system like CiteSeerX from scratch, many of these AI technologies are transferable to other digital libraries and search engines. This is different from arXiv, Harvard ADS, and PubMed, where papers are submitted by authors or pushed by publishers. Unlike Google Scholar and Microsoft Academic Search, where a significant portion of documents have only metadata (such as titles, authors, and abstracts) available, users have full-text ... CiteSeerX keeps its own repository, which serves cached versions of papers even if their previous links are not alive any more. In addition to paper downloads, CiteSeerX provides automatically extracted metadata and citation context, which ... Document metadata download service is not available from Google Scholar and only recently available from Microsoft Academic Search. Finally, CiteSeerX ...

... machine cluster to a private cloud using virtualization techniques (Wu et al. 2014). CiteSeerX extensively leverages open source software, which significantly reduces development effort. Red Hat Enterprise Linux (RHEL) 5 and 6 are the operating systems for all servers. Tomcat 7 is used for web service deployment on web and indexing servers. MySQL is used as the database management system to store metadata. Apache Solr is used for the index, and the Spring framework is used in the web application. In this section, we highlight four AI solutions that are leveraged by CiteSeerX and that tackle different challenges in metadata extraction and ingestion modules (tagged by C, E, D, and A in figure 1).

citeseerx, machine learning, natural language, (17 more...)

AI Magazine

Country: North America > United States > New Jersey (0.28)

Genre: Research Report (0.93)

Industry:

Information Technology (1.00)
Education > Educational Setting (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Information Management > Metadata Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(3 more...)

Tractable Fully Bayesian Inference via Convex Optimization and Optimal Transport Theory

Kim, Sanggyun, Mesa, Diego, Ma, Rui, Coleman, Todd P.

arXiv.org Machine Learning, Sep-28-2015

We consider the problem of transforming samples from one continuous source distribution into samples from another target distribution. We demonstrate with optimal transport theory that when the source distribution can be easily sampled from and the target distribution is log-concave, this can be tractably solved with convex optimization. We show that a special case of this, when the source is the prior and the target is the posterior, is Bayesian inference. Here, we can tractably calculate the normalization constant and draw posterior i.i.d. samples. Remarkably, our Bayesian tractability criterion is simply log concavity of the prior and likelihood: the same criterion for tractable calculation of the maximum a posteriori point estimate. With simulated data, we demonstrate how we can attain the Bayes risk in simulations. With physiologic data, we demonstrate improvements over point estimation in intensive care unit outcome prediction and electroencephalography-based sleep staging.

artificial intelligence, machine learning, posterior, (17 more...)

arXiv.org Machine Learning

1509.08582

Country: North America > United States > California (0.28)

Genre: Research Report (0.51)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Parallel Stochastic Gradient Markov Chain Monte Carlo for Matrix Factorisation Models

Şimşekli, Umut, Koptagel, Hazal, Güldaş, Hakan, Cemgil, A. Taylan, Öztoprak, Figen, Birbil, Ş. İlker

arXiv.org Machine Learning, Sep-28-2015

For large matrix factorisation problems, we develop a distributed Markov Chain Monte Carlo (MCMC) method based on stochastic gradient Langevin dynamics (SGLD) that we call Parallel SGLD (PSGLD). PSGLD has very favourable scaling properties with increasing data size and is comparable in terms of computational requirements to optimisation methods based on stochastic gradient descent. PSGLD achieves high performance by exploiting the conditional independence structure of the MF models to sub-sample data in a systematic manner so as to allow parallelisation and distributed computation. We provide a convergence proof of the algorithm and verify its superior performance on various architectures such as Graphics Processing Units, shared memory multi-core systems and multi-computer clusters.
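For readers unfamiliar with the SGLD building block mentioned in the abstract, a minimal single-machine sketch follows. It is plain SGLD on a toy problem (sampling the mean of a unit-variance Gaussian), not the parallel PSGLD method and not a matrix factorisation model. The function name, the toy model, the Gaussian prior, and all default parameter values are assumptions chosen for illustration.

```python
import random


def sgld_gaussian_mean(data, step_size=1e-3, batch_size=10,
                       n_steps=2000, prior_var=100.0, seed=0):
    """Plain SGLD for the mean `theta` of a unit-variance Gaussian.

    Assumed toy model: x_i ~ N(theta, 1), prior theta ~ N(0, prior_var).
    Each step uses a mini-batch gradient rescaled to the full data size
    and adds Gaussian noise whose variance equals the step size.
    """
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        grad_prior = -theta / prior_var                       # d/dtheta log p(theta)
        grad_lik = (n / batch_size) * sum(x - theta for x in batch)  # rescaled mini-batch gradient
        noise = rng.gauss(0.0, step_size ** 0.5)              # injected Langevin noise
        theta += 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples
```

After discarding an initial burn-in, the average of the returned samples should sit close to the sample mean of `data`. A fixed step size is used here for simplicity; decreasing step-size schedules are common in practice.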

artificial intelligence, machine learning, psgld, (11 more...)

arXiv.org Machine Learning

1506.01418

Country: Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)

A Review of Relational Machine Learning for Knowledge Graphs

Nickel, Maximilian, Murphy, Kevin, Tresp, Volker, Gabrilovich, Evgeniy

arXiv.org Machine Learning, Sep-28-2015

In this paper, we provide a review of how such statistical models can be "trained" on large knowledge graphs, and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph). In particular, we discuss two fundamentally different kinds of statistical relational models, both of which can scale to massive datasets. The first is based on latent feature models such as tensor factorization and multiway neural networks. The second is based on mining observable patterns in the graph. We also show how to combine these latent and observable models to get improved modeling power at decreased computational cost. Finally, we discuss how such statistical models of graphs can be combined with text-based information extraction methods for automatically constructing knowledge graphs from the Web. To this end, we also discuss Google's Knowledge Vault project as an example of such combination.

data mining, knowledge management, machine learning, (25 more...)

arXiv.org Machine Learning

doi: 10.1109/JPROC.2015.2483592

1503.00759

Country:

North America > United States > California (0.67)
Europe > Germany (0.67)

Genre:

Research Report (1.00)
Overview (0.87)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Education (1.00)
Health & Medicine (0.92)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
(11 more...)
