AITopics | Learning Graphical Models

Collaborating Authors

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

On the underestimation of model uncertainty by Bayesian K-nearest neighbors

Su, Wanhua, Chipman, Hugh, Zhu, Mu

arXiv.org Machine LearningApr-8-2008

When using the K-nearest neighbors method, one often ignores uncertainty in the choice of K. To account for such uncertainty, Holmes and Adams (2002) proposed a Bayesian framework for K-nearest neighbors (KNN). Their Bayesian KNN (BKNN) approach uses a pseudo-likelihood function, and standard Markov chain Monte Carlo (MCMC) techniques to draw posterior samples. Holmes and Adams (2002) focused on the performance of BKNN in terms of misclassification error but did not assess its ability to quantify uncertainty. We present some evidence to show that BKNN still significantly underestimates model uncertainty.

artificial intelligence, machine learning, test point, (17 more...)

arXiv.org Machine Learning

0804.1325

Country: North America > Canada > Ontario (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Creating Relational Data from Unstructured and Ungrammatical Data Sources

Michelson, M., Knoblock, C. A.

Journal of Artificial Intelligence ResearchMar-28-2008

In order for agents to act on behalf of users, they will have to retrieve and integrate vast amounts of textual data on the World Wide Web. However, much of the useful data on the Web is neither grammatical nor formally structured, making querying difficult. Examples of these types of data sources are online classifieds like Craigslist and auction item listings like eBay. We call this unstructured, ungrammatical data "posts." The unstructured nature of posts makes query and integration difficult because the attributes are embedded within the text. Also, these attributes do not conform to standardized values, which prevents queries based on a common attribute value. The schema is unknown and the values may vary dramatically making accurate search difficult. Creating relational data for easy querying requires that we define a schema for the embedded attributes and extract values from the posts while standardizing these values. Traditional information extraction (IE) is inadequate to perform this task because it relies on clues from the data, such as structure or natural language, neither of which are found in posts. Furthermore, traditional information extraction does not incorporate data cleaning, which is necessary to accurately query and integrate the source. The two-step approach described in this paper creates relational data sets from unstructured and ungrammatical text by addressing both issues. To do this, we require a set of known entities called a "reference set." The first step aligns each post to each member of each reference set. This allows our algorithm to define a schema over the post and include standard values for the attributes defined by this schema. The second step performs information extraction for the attributes, including attributes not easily represented by reference sets, such as a price. In this manner we create a relational structure over previously unstructured data, supporting deep and accurate queries over the data as well as standard values for integration. Our experimental results show that our technique matches the posts to the reference set accurately and efficiently and outperforms state-of-the-art extraction systems on the extraction task from posts.

algorithm, extraction, phoebus, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2409

AI Access Foundation

10541

Journal of Artificial Intelligence Research

Country:

North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Education > Educational Setting > Online (0.67)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(5 more...)

Add feedback

First Order Decision Diagrams for Relational MDPs

Wang, C., Joshi, S., Khardon, R.

Journal of Artificial Intelligence ResearchMar-25-2008

Markov decision processes capture sequential decision making under uncertainty, where an agent must choose actions so as to optimize long term reward. The paper studies efficient reasoning mechanisms for Relational Markov Decision Processes (RMDP) where world states have an internal relational structure that can be naturally described in terms of objects and relations among them. Two contributions are presented. First, the paper develops First Order Decision Diagrams (FODD), a new compact representation for functions over relational structures, together with a set of operators to combine FODDs, and novel reduction techniques to keep the representation small. Second, the paper shows how FODDs can be used to develop solutions for RMDPs, where reasoning is performed at the abstract level and the resulting optimal policy is independent of domain size (number of objects) or instantiation. In particular, a variant of the value iteration algorithm is developed by using special operations over FODDs, and the algorithm is shown to converge to the optimal policy.

diagram, fodd, reduction, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2489

AI Access Foundation

10538

Journal of Artificial Intelligence Research

Country: North America > United States > Massachusetts > Middlesex County > Medford (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Gesture Salience as a Hidden Variable for Coreference Resolution and Keyframe Extraction

Eisenstein, J., Barzilay, R., Davis, R.

Journal of Artificial Intelligence ResearchFeb-29-2008

Gesture is a non-verbal modality that can contribute crucial information to the understanding of natural language. But not all gestures are informative, and non-communicative hand motions may confuse natural language processing (NLP) and impede learning. People have little difficulty ignoring irrelevant hand movements and focusing on meaningful gestures, suggesting that an automatic system could also be trained to perform this task. However, the informativeness of a gesture is context-dependent and labeling enough data to cover all cases would be expensive. We present conditional modality fusion, a conditional hidden-variable model that learns to predict which gestures are salient for coreference resolution, the task of determining whether two noun phrases refer to the same semantic entity. Moreover, our approach uses only coreference annotations, and not annotations of gesture salience itself. We show that gesture features improve performance on coreference resolution, and that by attending only to gestures that are salient, our method achieves further significant gains. In addition, we show that the model of gesture salience learned in the context of coreference accords with human intuition, by demonstrating that gestures judged to be salient by our model can be used successfully to create multimedia keyframe summaries of video. These summaries are similar to those created by human raters, and significantly outperform summaries produced by baselines from the literature.

coreference resolution, noun phrase, proceedings, (12 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2450

AI Access Foundation

10536

Journal of Artificial Intelligence Research

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)
Asia > Middle East > Jordan (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(5 more...)

Add feedback

Learning Balanced Mixtures of Discrete Distributions with Small Sample

Zhou, Shuheng

arXiv.org Machine LearningFeb-10-2008

We study the problem of partitioning a small sample of $n$ individuals from a mixture of $k$ product distributions over a Boolean cube $\{0, 1\}^K$ according to their distributions. Each distribution is described by a vector of allele frequencies in $\R^K$. Given two distributions, we use $\gamma$ to denote the average $\ell_2^2$ distance in frequencies across $K$ dimensions, which measures the statistical divergence between them. We study the case assuming that bits are independently distributed across $K$ dimensions. This work demonstrates that, for a balanced input instance for $k = 2$, a certain graph-based optimization function returns the correct partition with high probability, where a weighted graph $G$ is formed over $n$ individuals, whose pairwise hamming distances between their corresponding bit vectors define the edge weights, so long as $K = \Omega(\ln n/\gamma)$ and $Kn = \tilde\Omega(\ln n/\gamma^2)$. The function computes a maximum-weight balanced cut of $G$, where the weight of a cut is the sum of the weights across all edges in the cut. This result demonstrates a nice property in the high-dimensional feature space: one can trade off the number of features that are required with the size of the sample to accomplish certain tasks like clustering.

artificial intelligence, diff, machine learning, (18 more...)

arXiv.org Machine Learning

0802.1244

Country: North America > United States (0.28)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Add feedback

CUI Networks: A Graphical Representation for Conditional Utility Independence

Engel, Y., Wellman, M. P.

Journal of Artificial Intelligence ResearchJan-24-2008

We introduce CUI networks, a compact graphical representation of utility functions over multiple attributes. CUI networks model multiattribute utility functions using the well-studied and widely applicable utility independence concept. We show how conditional utility independence leads to an effective functional decomposition that can be exhibited graphically, and how local, compact data at the graph nodes can be used to calculate joint utility. We discuss aspects of elicitation, network construction, and optimization, and contrast our new representation with previous graphical preference modeling.

cui network, decomposition, utility function, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2360

AI Access Foundation

10528

Journal of Artificial Intelligence Research

Country:

North America > United States > New York (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(2 more...)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

iBOA: The Incremental Bayesian Optimization Algorithm

Pelikan, Martin, Sastry, Kumara, Goldberg, David E.

arXiv.org Artificial IntelligenceJan-20-2008

This paper proposes the incremental Bayesian optimization algorithm (iBOA), which modifies standard BOA by removing the population of solutions and using incremental updates of the Bayesian network. iBOA is shown to be able to learn and exploit unrestricted Bayesian networks using incremental techniques for updating both the structure as well as the parameters of the probabilistic model. This represents an important step toward the design of competent incremental estimation of distribution algorithms that can solve difficult nearly decomposable problems scalably and reliably.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

0801.3113

Country:

North America > United States > Missouri (0.28)
North America > United States > Illinois > Champaign County (0.14)

Genre: Research Report (0.82)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Planning with Durative Actions in Stochastic Domains

Mausam,, Weld, D. S.

Journal of Artificial Intelligence ResearchJan-19-2008

Probabilistic planning problems are typically modeled as a Markov Decision Process (MDP). MDPs, while an otherwise expressive model, allow only for sequential, non-durative actions. This poses severe restrictions in modeling and solving a real world planning problem. We extend the MDP model to incorporate -- 1) simultaneous action execution, 2) durative actions, and 3) stochastic durations. We develop several algorithms to combat the computational explosion introduced by these features. The key theoretical ideas used in building these algorithms are -- modeling a complex problem as an MDP in extended state/action space, pruning of irrelevant actions, sampling of relevant actions, using informed heuristics to guide the search, hybridizing different planners to achieve benefits of both, approximating the problem and replanning. Our empirical evaluation illuminates the different merits in using various algorithms, viz., optimality, empirical closeness to optimality, theoretical error bounds, and speed.

action combination, algorithm, concurrency, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2269

AI Access Foundation

10527

Journal of Artificial Intelligence Research

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Parameterizations and fitting of bi-directed graph models to categorical data

Lupparelli, Monia, Marchetti, Giovanni M., Bergsma, Wicher P.

arXiv.org Machine LearningJan-9-2008

We discuss two parameterizations of models for marginal independencies for discrete distributions which are representable by bi-directed graph models, under the global Markov property. Such models are useful data analytic tools especially if used in combination with other graphical models. The first parameterization, in the saturated case, is also known as the multivariate logistic transformation, the second is a variant that allows, in some (but not all) cases, variation independent parameters. An algorithm for maximum likelihood fitting is proposed, based on an extension of the Aitchison and Silvey method.

artificial intelligence, machine learning, parameterization, (14 more...)

arXiv.org Machine Learning

0801.1440

Country:

North America > United States (0.28)
Europe > United Kingdom (0.28)
Europe > Austria (0.28)

Genre:

Research Report (0.82)
Questionnaire & Opinion Survey (0.68)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Sparse Multinomial Logistic Regression via Bayesian L1 Regularisation

Cawley, Gavin C., Talbot, Nicola L., Girolami, Mark

Neural Information Processing SystemsDec-31-2007

Multinomial logistic regression provides the standard penalised maximumlikelihood solution to multi-class pattern recognition problems. More recently, the development of sparse multinomial logistic regression models has found application in text processing and microarray classification, where explicit identification of the most informative features is of value. In this paper, we propose a sparse multinomial logistic regression method, in which the sparsity arises from the use of a Laplace prior, but where the usual regularisation parameter is integrated out analytically. Evaluation over a range of benchmark datasets reveals this approach results in similar generalisation performance to that obtained using cross-validation, but at greatly reduced computational expense.

artificial intelligence, machine learning, regularisation parameter, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Norfolk > Norwich (0.04)
North America > United States > Florida > Monroe County > Key West (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback