AITopics

1205.4476

Country:

North America > United States (0.15)
Europe (0.14)

Genre:

Research Report > New Finding (0.90)
Research Report > Experimental Study (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Halpern, Joseph Y., Leung, Samantha

Weighted Sets of Probabilities and Minimax Weighted Expected Regret: New Approaches for Representing Uncertainty and Making Decisions

arXiv.org Artificial IntelligenceFeb-21-2013

We consider a setting where an agent's uncertainty is represented by a set of probability measures, rather than a single measure. Measure-by-measure updating of such a set of measures upon acquiring new information is well-known to suffer from problems; agents are not always able to learn appropriately. To deal with these problems, we propose using weighted sets of probabilities: a representation where each measure is associated with a weight, which denotes its significance. We describe a natural approach to updating in such a situation and a natural approach to determining the weights. We then show how this representation can be used in decision-making, by modifying a standard approach to decision making -- minimizing expected regret -- to obtain minimax weighted expected regret (MWER). We provide an axiomatization that characterizes preferences induced by MWER both in the static and dynamic case.

artificial intelligence, probability, probability measure, (16 more...)

arXiv.org Artificial Intelligence

1302.5681

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.61)

Bazerque, Juan Andres, Giannakis, Georgios B.

Nonparametric Basis Pursuit via Sparse Kernel-based Learning

arXiv.org Machine LearningFeb-21-2013

Signal processing tasks as fundamental as sampling, reconstruction, minimum mean-square error interpolation and prediction can be viewed under the prism of reproducing kernel Hilbert spaces. Endowing this vantage point with contemporary advances in sparsity-aware modeling and processing, promotes the nonparametric basis pursuit advocated in this paper as the overarching framework for the confluence of kernel-based learning (KBL) approaches leveraging sparse linear regression, nuclear-norm regularization, and dictionary learning. The novel sparse KBL toolbox goes beyond translating sparse parametric approaches to their nonparametric counterparts, to incorporate new possibilities such as multi-kernel selection and matrix smoothing. The impact of sparse KBL to signal processing applications is illustrated through test cases from cognitive radio sensing, microarray data imputation, and network traffic prediction.

artificial intelligence, kernel, machine learning, (14 more...)

1302.5449

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.87)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Journal of Artificial Intelligence ResearchFeb-20-2013

Toward Supervised Anomaly Detection

Goernitz, N., Kloft, M., Rieck, K., Brefeld, U.

Anomaly detection is being regarded as an unsupervised learning task as anomalies stem from adversarial or unlikely events with unknown distributions. However, the predictive performance of purely unsupervised anomaly detection often fails to match the required detection rates in many tasks and there exists a need for labeled data to guide the model generation. Our first contribution shows that classical semi-supervised approaches, originating from a supervised classifier, are inappropriate and hardly detect new and unknown anomalies. We argue that semi-supervised anomaly detection needs to ground on the unsupervised learning paradigm and devise a novel algorithm that meets this requirement. Although being intrinsically non-convex, we further show that the optimization problem has a convex equivalent under relatively mild assumptions. Additionally, we propose an active learning strategy to automatically filter candidates for labeling. In an empirical study on network intrusion detection data, we observe that the proposed learning methodology requires much less labeled data than the state-of-the-art, while achieving higher detection accuracies.

anomaly detection, detection, optimization problem, (11 more...)

doi: 10.1613/jair.3623

AI Access Foundation

10802

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Germany > Lower Saxony > Gottingen (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Journal of Artificial Intelligence ResearchFeb-20-2013

Integrative Semantic Dependency Parsing via Efficient Large-scale Feature Selection

Zhao, H., Zhang, X., Kit, C.

Semantic parsing, i.e., the automatic derivation of meaning representation such as an instantiated predicate-argument structure for a sentence, plays a critical role in deep processing of natural language. Unlike all other top systems of semantic dependency parsing that have to rely on a pipeline framework to chain up a series of submodels each specialized for a specific subtask, the one presented in this article integrates everything into one model, in hopes of achieving desirable integrity and practicality for real applications while maintaining a competitive performance. This integrative approach tackles semantic parsing as a word pair classification problem using a maximum entropy classifier. We leverage adaptive pruning of argument candidates and large-scale feature selection engineering to allow the largest feature space ever in use so far in this field, it achieves a state-of-the-art performance on the evaluation data set for CoNLL-2008 shared task, on top of all but one top pipeline system, confirming its feasibility and effectiveness.

dprel, predicate, template, (16 more...)

doi: 10.1613/jair.3717

AI Access Foundation

10801

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
(11 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Journal of Artificial Intelligence ResearchFeb-20-2013

Generating Extractive Summaries of Scientific Paradigms

Qazvinian, V., Radev, D. R., Mohammad, S. M., Dorr, B., Zajic, D., Whidby, M., Moon, T.

Researchers and scientists increasingly find themselves in the position of having to quickly understand large amounts of technical material. Our goal is to effectively serve this need by using bibliometric text mining and summarization techniques to generate summaries of scientific literature. We show how we can use citations to produce automatically generated, readily consumable, technical extractive summaries. We first propose C-LexRank, a model for summarizing single scientific articles based on citations, which employs community detection and extracts salient information-rich sentences. Next, we further extend our experiments to summarize a set of papers, which cover the same scientific topic. We generate extractive summaries of a set of Question Answering (QA) and Dependency Parsing (DP) papers, their abstracts, and their citation sentences and show that citations have unique information amenable to creating a summary.

citation sentence, factoid, proceedings, (13 more...)

doi: 10.1613/jair.3732

AI Access Foundation

10800

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Maryland > Prince George's County > College Park (0.14)
(21 more...)

Genre:

Research Report > New Finding (1.00)
Overview (0.68)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Qian, Jing, Saligrama, Venkatesh

Spectral Clustering with Unbalanced Data

Spectral clustering (SC) and graph-based semi-supervised learning (SSL) algorithms are sensitive to how graphs are constructed from data. In particular if the data has proximal and unbalanced clusters these algorithms can lead to poor performance on well-known graphs such as $k$-NN, full-RBF, $\epsilon$-graphs. This is because the objectives such as Ratio-Cut (RCut) or normalized cut (NCut) attempt to tradeoff cut values with cluster sizes, which are not tailored to unbalanced data. We propose a novel graph partitioning framework, which parameterizes a family of graphs by adaptively modulating node degrees in a $k$-NN graph. We then propose a model selection scheme to choose sizable clusters which are separated by smallest cut values. Our framework is able to adapt to varying levels of unbalancedness of data and can be naturally used for small cluster detection. We theoretically justify our ideas through limit cut analysis. Unsupervised and semi-supervised experiments on synthetic and real data sets demonstrate the superiority of our method.

artificial intelligence, graph, machine learning, (17 more...)

1302.5134

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Rippel, Oren, Adams, Ryan Prescott

High-Dimensional Probability Estimation with Deep Density Models

One of the fundamental problems in machine learning is the estimation of a probability distribution from data. Many techniques have been proposed to study the structure of data, most often building around the assumption that observations lie on a lower-dimensional manifold of high probability. It has been more difficult, however, to exploit this insight to build explicit, tractable density models for high-dimensional data. In this paper, we introduce the deep density model (DDM), a new approach to density estimation. We exploit insights from deep learning to construct a bijective map to a representation space, under which the transformation of the distribution of the data is approximately factorized and has identical and known marginal densities. The simplicity of the latent distribution under the model allows us to feasibly explore it, and the invertibility of the map to characterize contraction of measure across it. This enables us to compute normalized densities for out-of-sample data. This combination of tractability and flexibility allows us to tackle a variety of probabilistic tasks on high-dimensional datasets, including: rapid computation of normalized densities at test-time without evaluating a partition function; generation of samples without MCMC; and characterization of the joint entropy of the data.

artificial intelligence, deep learning, machine learning, (14 more...)

1302.5125

Country: North America (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

John, George H., Langley, Pat

Estimating Continuous Distributions in Bayesian Classifiers

When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality assumption and instead use statistical methods for nonparametric density estimation. For a naive Bayesian classifier, we present experimental results on a variety of natural and artificial domains, comparing two methods of density estimation: assuming normality and modeling each conditional distribution with a single Gaussian; and using nonparametric kernel density estimation. We observe large reductions in error on several natural and artificial data sets, which suggests that kernel estimation is a useful tool for learning Bayesian models.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1302.4964

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.83)

Industry: Health & Medicine (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Breaking the Small Cluster Barrier of Graph Clustering

Ailon, Nir, Chen, Yudong, Huan, Xu

This paper considers a classic problem in machine learning and theoretical computer science, namely graph clustering, i.e., given an undirected unweighted graph, partition the nodes into disjoint clusters, so that the density of edges within one cluster is higher than those across clusters. Graph clustering arises naturally in many application across science and engineering. Some prominent examples include community detection in social network Mishra et al. [2007], submarket identification in E-commerce and sponsored search Yahoo!-Inc [2009], and co-authorship analysis in analyzing document database Ester et al. [1995], among others. From a purely binary classification theoretical point of view, the edges of the graph are (noisy) labels of similarity or affinity between pairs of objects, and the concept class consists of clusterings of the objects (encoded graphically by identifying clusters with cliques). Many theoretical results in graph clustering [e.g., Boppana, 1987, Chen et al., 2012, McSherry, 2001] consider the planted partition model, in which the edges are generated randomly; see Section 1.1 for more details. While numerous different methods have been proposed, their performance guarantees all share the following manner - under certain condition of the density of edges (within clusters and across clusters), the proposed method succeeds to recover the correct clusters exactly if all clusters are larger than a threshold size, typically Ω( n).

artificial intelligence, machine learning, probability, (19 more...)

1302.4549

Country: Asia (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)