AITopics

doi: 10.1613/jair.533

1105.3635

Country: North America > United States > California (1.00)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

arXiv.org Artificial IntelligenceMay-15-2011

Feature Selection for MAUC-Oriented Classification Systems

Wang, Rui, Tang, Ke

Feature selection is an important pre-processing step for many pattern classification tasks. Traditionally, feature selection methods are designed to obtain a feature subset that can lead to high classification accuracy. However, classification accuracy has recently been shown to be an inappropriate performance metric of classification systems in many cases. Instead, the Area Under the receiver operating characteristic Curve (AUC) and its multi-class extension, MAUC, have been proved to be better alternatives. Hence, the target of classification system design is gradually shifting from seeking a system with the maximum classification accuracy to obtaining a system with the maximum AUC/MAUC. Previous investigations have shown that traditional feature selection methods need to be modified to cope with this new objective. These methods most often are restricted to binary classification problems only. In this study, a filter feature selection method, namely MAUC Decomposition based Feature Selection (MDFS), is proposed for multi-class classification problems. To the best of our knowledge, MDFS is the first method specifically designed to select features for building classification systems with maximum MAUC. Extensive empirical results demonstrate the advantage of MDFS over several compared feature selection methods.

artificial intelligence, feature selection method, machine learning, (13 more...)

doi: 10.1016/j.neucom.2012.01.013

1105.2943

Country: North America > United States (0.94)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Liu, Ying, Chandrasekaran, Venkat, Anandkumar, Animashree, Willsky, Alan S.

Feedback Message Passing for Inference in Gaussian Graphical Models

arXiv.org Artificial IntelligenceMay-10-2011

While loopy belief propagation (LBP) performs reasonably well for inference in some Gaussian graphical models with cycles, its performance is unsatisfactory for many others. In particular for some models LBP does not converge, and in general when it does converge, the computed variances are incorrect (except for cycle-free graphs for which belief propagation (BP) is non-iterative and exact). In this paper we propose {\em feedback message passing} (FMP), a message-passing algorithm that makes use of a special set of vertices (called a {\em feedback vertex set} or {\em FVS}) whose removal results in a cycle-free graph. In FMP, standard BP is employed several times on the cycle-free subgraph excluding the FVS while a special message-passing scheme is used for the nodes in the FVS. The computational complexity of exact inference is $O(k^2n)$, where $k$ is the number of feedback nodes, and $n$ is the total number of nodes. When the size of the FVS is very large, FMP is intractable. Hence we propose {\em approximate FMP}, where a pseudo-FVS is used instead of an FVS, and where inference in the non-cycle-free graph obtained by removing the pseudo-FVS is carried out approximately using LBP. We show that, when approximate FMP converges, it yields exact means and variances on the pseudo-FVS and exact means throughout the remainder of the graph. We also provide theoretical results on the convergence and accuracy of approximate FMP. In particular, we prove error bounds on variance computation. Based on these theoretical results, we design efficient algorithms to select a pseudo-FVS of bounded size. The choice of the pseudo-FVS allows us to explicitly trade off between efficiency and accuracy. Experimental results show that using a pseudo-FVS of size no larger than $\log(n)$, this procedure converges much more often, more quickly, and provides more accurate results than LBP on the entire graph.

artificial intelligence, machine learning, node, (16 more...)

doi: 10.1109/TSP.2012.2195656

1105.1853

Country:

Europe > United Kingdom > England (0.14)
North America > United States > Texas (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Orange County > Irvine (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Architecture > Distributed Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine LearningMay-10-2011

On approximation of smoothing probabilities for hidden Markov models

Lember, J.

We consider the smoothing probabilities of hidden Markov model (HMM). We show that under fairly general conditions for HMM, the exponential forgetting still holds, and the smoothing probabilities can be well approximated with the ones of double sided HMM. This makes it possible to use ergodic theorems. As an applications we consider the pointwise maximum a posteriori segmentation, and show that the corresponding risks converge.

artificial intelligence, machine learning, markov model, (16 more...)

0910.4636

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Volant, Stevenn, Magniette, Marie-Laure Martin, Robin, Stéphane

Variational Bayes approach for model aggregation in unsupervised classification with Markovian dependency

arXiv.org Machine LearningMay-4-2011

We consider a binary unsupervised classification problem where each observation is associated with an unobserved label that we want to retrieve. More precisely, we assume that there are two groups of observation: normal and abnormal. The `normal' observations are coming from a known distribution whereas the distribution of the `abnormal' observations is unknown. Several models have been developed to fit this unknown distribution. In this paper, we propose an alternative based on a mixture of Gaussian distributions. The inference is done within a variational Bayesian framework and our aim is to infer the posterior probability of belonging to the class of interest. To this end, it makes no sense to estimate the mixture component number since each mixture model provides more or less relevant information to the posterior probability estimation. By computing a weighted average (named aggregated estimator) over the model collection, Bayesian Model Averaging (BMA) is one way of combining models in order to account for information provided by each model. The aim is then the estimation of the weights and the posterior probability for one specific model. In this work, we derive optimal approximations of these quantities from the variational theory and propose other approximations of the weights. To perform our method, we consider that the data are dependent (Markovian dependency) and hence we consider a Hidden Markov Model. A simulation study is carried out to evaluate the accuracy of the estimates in terms of classification. We also present an application to the analysis of public health surveillance systems.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1105.076

Country: North America (0.46)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Public Health (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Choi, David S., Wolfe, Patrick J., Airoldi, Edoardo M.

Stochastic blockmodels with growing number of classes

arXiv.org Machine LearningApr-30-2011

We present asymptotic and finite-sample results on the use of stochastic blockmodels for the analysis of network data. We show that the fraction of misclassified network nodes converges in probability to zero under maximum likelihood fitting when the number of classes is allowed to grow as the root of the network size and the average network degree grows at least poly-logarithmically in this size. We also establish finite-sample confidence bounds on maximum-likelihood blockmodel parameter estimates from data comprising independent Bernoulli random variates; these results hold uniformly over class assignment. We provide simulations verifying the conditions sufficient for our results, and conclude by fitting a logit parameterization of a stochastic blockmodel with covariates to a network data example comprising a collection of Facebook profiles, resulting in block estimates that reveal residual structure.

artificial intelligence, machine learning, social media, (19 more...)

doi: 10.1093/biomet/asr053

1011.4644

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Services (0.35)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Mannor, Shie, Tsitsiklis, John

Mean-Variance Optimization in Markov Decision Processes

arXiv.org Artificial IntelligenceApr-29-2011

We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NPhard for some cases, and strongly NPhard for others. We finally offer pseudopolynomial exact and approximation algorithms. The classical theory of Markov decision processes (MDPs) deals with the maximization of the cumulative (possibly discounted) expected reward, to be denoted by W. However, a risk-averse decision maker may be interested in additional distributional properties of W. In this paper, we focus on the case where the decision maker is interested in both the mean and the variance of the cumulative reward, and we explore the associated computational issues. Risk aversion in MDPs is of course an old subject. Problems of this type can be handled by state augmentation (e.g., Bertsekas, 1995), namely, by introducing an auxiliary state variable that keeps track of the cumulative past reward. In a few special cases, e.g., with an exponential utility function, state augmentation is unnecessary, and optimal policies can be found by solving a modified Bellman equation (Chung & Sobel, 1987).

algorithm, artificial intelligence, machine learning, (16 more...)

1104.5601

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Machine LearningApr-28-2011

Notes on a New Philosophy of Empirical Science

Burfoot, Daniel

This book presents a methodology and philosophy of empirical science based on large scale lossless data compression. In this view a theory is scientific if it can be used to build a data compression program, and it is valuable if it can compress a standard benchmark database to a small size, taking into account the length of the compressor itself. This methodology therefore includes an Occam principle as well as a solution to the problem of demarcation. Because of the fundamental difficulty of lossless compression, this type of research must be empirical in nature: compression can only be achieved by discovering and characterizing empirical regularities in the data. Because of this, the philosophy provides a way to reformulate fields such as computer vision and computational linguistics as empirical sciences: the former by attempting to compress databases of natural images, the latter by attempting to compress large text databases. The book argues that the rigor and objectivity of the compression principle should set the stage for systematic progress in these fields. The argument is especially strong in the context of computer vision, which is plagued by chronic problems of evaluation. The book also considers the field of machine learning. Here the traditional approach requires that the models proposed to solve learning problems be extremely simple, in order to avoid overfitting. However, the world may contain intrinsically complex phenomena, which would require complex models to understand. The compression philosophy can justify complex models because of the large quantity of data being modeled (if the target database is 100 Gb, it is easy to justify a 10 Mb model). The complex models and abstractions learned on the basis of the raw data (images, language, etc) can then be reused to solve any specific learning problem, such as face recognition or machine translation.

artificial intelligence, machine learning, natural language, (25 more...)

1104.5466

Country:

Europe (0.67)
Asia (0.67)
North America > United States > New York (0.45)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Media (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
(11 more...)

Rao, Nikhil S, Nowak, Robert D., Wright, Stephen J., Kingsbury, Nick G.

Convex Approaches to Model Wavelet Sparsity Patterns

arXiv.org Machine LearningApr-22-2011

Statistical dependencies among wavelet coefficients are commonly represented by graphical models such as hidden Markov trees(HMTs). However, in linear inverse problems such as deconvolution, tomography, and compressed sensing, the presence of a sensing or observation matrix produces a linear mixing of the simple Markovian dependency structure. This leads to reconstruction problems that are non-convex optimizations. Past work has dealt with this issue by resorting to greedy or suboptimal iterative reconstruction methods. In this paper, we propose new modeling approaches based on group-sparsity penalties that leads to convex optimizations that can be solved exactly and efficiently. We show that the methods we develop perform significantly better in deconvolution and compressed sensing applications, while being as computationally efficient as standard coefficient-wise approaches such as lasso.

artificial intelligence, data quality, machine learning, (16 more...)

1104.4385

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.50)
Information Technology > Data Science > Data Quality > Data Transformation (0.39)

Journal of Artificial Intelligence ResearchApr-19-2011

Exploiting Structure in Weighted Model Counting Approaches to Probabilistic Inference

Li, W., Poupart, P., van Beek, P.

Previous studies have demonstrated that encoding a Bayesian network into a SAT formula and then performing weighted model counting using a backtracking search algorithm can be an effective method for exact inference. In this paper, we present techniques for improving this approach for Bayesian networks with noisy-OR and noisy-MAX relations-- two relations that are widely used in practice as they can dramatically reduce the number of probabilities one needs to specify. In particular, we present two SAT encodings for noisy-OR and two encodings for noisy-MAX that exploit the structure or semantics of the relations to improve both time and space efficiency, and we prove the correctness of the encodings. We experimentally evaluated our techniques on large-scale real and randomly generated Bayesian networks. On these benchmarks, our techniques gave speedups of up to two orders of magnitude over the best previous approaches for networks with noisy-OR/MAX relations and scaled up to larger networks. As well, our techniques extend the weighted model counting approach for exact inference to networks that were previously intractable for the approach.

bayesian network, indicator variable, relation, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3232

AI Access Foundation

10701

Journal of Artificial Intelligence Research

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Florida > Orange County > Orlando (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)