AITopics

1205.4476

Country:

North America > United States (0.15)
Europe (0.14)

Genre:

Research Report > New Finding (0.90)
Research Report > Experimental Study (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Bazerque, Juan Andres, Giannakis, Georgios B.

Nonparametric Basis Pursuit via Sparse Kernel-based Learning

arXiv.org Machine Learning, Feb-21-2013

Signal processing tasks as fundamental as sampling, reconstruction, minimum mean-square error interpolation and prediction can be viewed under the prism of reproducing kernel Hilbert spaces. Endowing this vantage point with contemporary advances in sparsity-aware modeling and processing promotes the nonparametric basis pursuit advocated in this paper as the overarching framework for the confluence of kernel-based learning (KBL) approaches leveraging sparse linear regression, nuclear-norm regularization, and dictionary learning. The novel sparse KBL toolbox goes beyond translating sparse parametric approaches to their nonparametric counterparts, incorporating new possibilities such as multi-kernel selection and matrix smoothing. The impact of sparse KBL on signal processing applications is illustrated through test cases from cognitive radio sensing, microarray data imputation, and network traffic prediction.

artificial intelligence, kernel, machine learning

1302.5449

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.87)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Journal of Artificial Intelligence Research, Feb-20-2013

Toward Supervised Anomaly Detection

Goernitz, N., Kloft, M., Rieck, K., Brefeld, U.

Anomaly detection is regarded as an unsupervised learning task, as anomalies stem from adversarial or unlikely events with unknown distributions. However, the predictive performance of purely unsupervised anomaly detection often fails to match the required detection rates in many tasks, and there exists a need for labeled data to guide the model generation. Our first contribution shows that classical semi-supervised approaches, originating from a supervised classifier, are inappropriate and hardly detect new and unknown anomalies. We argue that semi-supervised anomaly detection needs to be grounded in the unsupervised learning paradigm and devise a novel algorithm that meets this requirement. Although the optimization problem is intrinsically non-convex, we further show that it has a convex equivalent under relatively mild assumptions. Additionally, we propose an active learning strategy to automatically filter candidates for labeling. In an empirical study on network intrusion detection data, we observe that the proposed learning methodology requires much less labeled data than the state-of-the-art, while achieving higher detection accuracies.

anomaly detection, detection, optimization problem

doi: 10.1613/jair.3623

AI Access Foundation

10802

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Germany > Lower Saxony > Gottingen (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Journal of Artificial Intelligence Research, Feb-20-2013

Generating Extractive Summaries of Scientific Paradigms

Qazvinian, V., Radev, D. R., Mohammad, S. M., Dorr, B., Zajic, D., Whidby, M., Moon, T.

Researchers and scientists increasingly find themselves in the position of having to quickly understand large amounts of technical material. Our goal is to effectively serve this need by using bibliometric text mining and summarization techniques to generate summaries of scientific literature. We show how we can use citations to produce automatically generated, readily consumable, technical extractive summaries. We first propose C-LexRank, a model for summarizing single scientific articles based on citations, which employs community detection and extracts salient information-rich sentences. Next, we extend our experiments to summarize a set of papers that cover the same scientific topic. We generate extractive summaries of a set of Question Answering (QA) and Dependency Parsing (DP) papers, their abstracts, and their citation sentences, and show that citations have unique information amenable to creating a summary.

citation sentence, factoid, proceedings

doi: 10.1613/jair.3732

AI Access Foundation

10800

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre:

Research Report > New Finding (1.00)
Overview (0.68)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Qian, Jing, Saligrama, Venkatesh

Spectral Clustering with Unbalanced Data

arXiv.org Machine Learning, Feb-20-2013

Spectral clustering (SC) and graph-based semi-supervised learning (SSL) algorithms are sensitive to how graphs are constructed from data. In particular, if the data has proximal and unbalanced clusters, these algorithms can perform poorly on well-known graphs such as $k$-NN, full-RBF, and $\epsilon$-graphs. This is because objectives such as Ratio-Cut (RCut) or normalized cut (NCut) attempt to trade off cut values against cluster sizes and are not tailored to unbalanced data. We propose a novel graph partitioning framework, which parameterizes a family of graphs by adaptively modulating node degrees in a $k$-NN graph. We then propose a model selection scheme to choose sizable clusters that are separated by the smallest cut values. Our framework is able to adapt to varying levels of unbalancedness of data and can be naturally used for small cluster detection. We theoretically justify our ideas through limit cut analysis. Unsupervised and semi-supervised experiments on synthetic and real data sets demonstrate the superiority of our method.

artificial intelligence, graph, machine learning

1302.5134

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Rippel, Oren, Adams, Ryan Prescott

High-Dimensional Probability Estimation with Deep Density Models

arXiv.org Machine Learning, Feb-20-2013

One of the fundamental problems in machine learning is the estimation of a probability distribution from data. Many techniques have been proposed to study the structure of data, most often building around the assumption that observations lie on a lower-dimensional manifold of high probability. It has been more difficult, however, to exploit this insight to build explicit, tractable density models for high-dimensional data. In this paper, we introduce the deep density model (DDM), a new approach to density estimation. We exploit insights from deep learning to construct a bijective map to a representation space, under which the transformation of the distribution of the data is approximately factorized and has identical and known marginal densities. The simplicity of the latent distribution under the model allows us to feasibly explore it, and the invertibility of the map to characterize contraction of measure across it. This enables us to compute normalized densities for out-of-sample data. This combination of tractability and flexibility allows us to tackle a variety of probabilistic tasks on high-dimensional datasets, including: rapid computation of normalized densities at test-time without evaluating a partition function; generation of samples without MCMC; and characterization of the joint entropy of the data.
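The density computation this abstract describes can be read through the standard change-of-variables formula. As a sketch in our own notation (the symbols $f$, $p_Z$, and $D$ are assumptions for illustration, not taken from the paper): for a bijective, differentiable map $f$ from data space to a representation space whose density factorizes, at least approximately, into known marginals,

```latex
p_X(x) = p_Z\bigl(f(x)\bigr)\,\left|\det \frac{\partial f(x)}{\partial x}\right|,
\qquad
p_Z(z) = \prod_{d=1}^{D} p_{Z_d}(z_d)
```

The Jacobian determinant measures how $f$ locally contracts or expands volume, which is why a normalized density can be evaluated at a new point $x$ without a partition function.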

artificial intelligence, deep learning, machine learning

1302.5125

Country: North America (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

arXiv.org Machine Learning, Feb-20-2013

Breaking the Small Cluster Barrier of Graph Clustering

Ailon, Nir, Chen, Yudong, Xu, Huan

This paper considers a classic problem in machine learning and theoretical computer science, namely graph clustering, i.e., given an undirected unweighted graph, partition the nodes into disjoint clusters, so that the density of edges within clusters is higher than that across clusters. Graph clustering arises naturally in many applications across science and engineering. Some prominent examples include community detection in social networks Mishra et al. [2007], submarket identification in E-commerce and sponsored search Yahoo!-Inc [2009], and co-authorship analysis in document databases Ester et al. [1995], among others. From a purely binary classification theoretical point of view, the edges of the graph are (noisy) labels of similarity or affinity between pairs of objects, and the concept class consists of clusterings of the objects (encoded graphically by identifying clusters with cliques). Many theoretical results in graph clustering [e.g., Boppana, 1987, Chen et al., 2012, McSherry, 2001] consider the planted partition model, in which the edges are generated randomly; see Section 1.1 for more details. While numerous different methods have been proposed, their performance guarantees all share the following form: under certain conditions on the density of edges (within clusters and across clusters), the proposed method succeeds in recovering the correct clusters exactly if all clusters are larger than a threshold size, typically $\Omega(\sqrt{n})$.

artificial intelligence, machine learning, probability

1302.4549

Country: Asia (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Chacón, José E., Duong, Tarn

Data-driven density derivative estimation, with applications to nonparametric clustering and bump hunting

arXiv.org Machine Learning, Feb-19-2013

Important information concerning a multivariate data set, such as clusters and modal regions, is contained in the derivatives of the probability density function. Despite this importance, nonparametric estimation of higher order derivatives of the density function has received only relatively scant attention. Kernel estimators of density functions are widely used as they exhibit excellent theoretical and practical properties, though their generalization to density derivatives has progressed more slowly due to the mathematical intractabilities encountered in the crucial problem of bandwidth (or smoothing parameter) selection. This paper presents the first fully automatic, data-based bandwidth selectors for multivariate kernel density derivative estimators. This is achieved by synthesizing recent advances in matrix analytic theory which allow mathematically and computationally tractable representations of higher order derivatives of multivariate vector valued functions. The theoretical asymptotic properties as well as the finite sample behaviour of the proposed selectors are studied. In addition, we explore in detail the applications of the new data-driven methods for two other statistical problems: clustering and bump hunting. The introduced techniques are combined with the mean shift algorithm to develop novel automatic, nonparametric clustering procedures which are shown to outperform mixture-model cluster analysis and other recent nonparametric approaches in practice. Furthermore, the advantage of the use of smoothing parameters designed for density derivative estimation for feature significance analysis for bump hunting is illustrated with a real data example.

artificial intelligence, estimation, machine learning

1204.616

Country:

Europe (0.93)
North America > United States (0.28)

Genre: Research Report (0.63)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Canas, Guillermo D., Poggio, Tomaso, Rosasco, Lorenzo

Learning Manifolds with K-Means and K-Flats

arXiv.org Machine Learning, Feb-19-2013

Our study is broadly motivated by questions in high-dimensional learning. As is well known, learning in high dimensions is feasible only if the data distribution satisfies suitable prior assumptions. One such assumption is that the data distribution lies on, or is close to, a low-dimensional set embedded in a high-dimensional space, for instance a low-dimensional manifold. This latter assumption has proved to be useful in practice, as well as amenable to theoretical analysis, and it has led to a significant amount of recent work. Starting from [29, 40, 7], this set of ideas, broadly referred to as manifold learning, has been applied to a variety of problems from supervised [42] and semi-supervised learning [8], to clustering [45] and dimensionality reduction [7], to name a few. Interestingly, the problem of learning the manifold itself has received less attention: given samples from a d-manifold M embedded in some ambient space X, the problem is to learn a set that approximates M in a suitable sense. This problem has been considered in computational geometry, but in a setting in which typically the manifold is a hyper-surface in a low-dimensional space (e.g.

approximation, artificial intelligence, machine learning

1209.1121

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Xu, Kevin S., Kliger, Mark, Hero, Alfred O. III

Adaptive Evolutionary Clustering

arXiv.org Machine Learning, Feb-19-2013

In many practical applications of clustering, the objects to be clustered evolve over time, and a clustering result is desired at each time step. In such applications, evolutionary clustering typically outperforms traditional static clustering by producing clustering results that reflect long-term trends while being robust to short-term variations. Several evolutionary clustering algorithms have recently been proposed, often by adding a temporal smoothness penalty to the cost function of a static clustering method. In this paper, we introduce a different approach to evolutionary clustering by accurately tracking the time-varying proximities between objects followed by static clustering. We present an evolutionary clustering framework that adaptively estimates the optimal smoothing parameter using shrinkage estimation, a statistical approach that improves a naive estimate using additional information. The proposed framework can be used to extend a variety of static clustering algorithms, including hierarchical, k-means, and spectral clustering, into evolutionary clustering algorithms. Experiments on synthetic and real data sets indicate that the proposed framework outperforms static clustering and existing evolutionary clustering algorithms in many scenarios.
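One way to picture tracking time-varying proximities before static clustering is a convex combination of the previous smoothed proximity matrix and the current one. The sketch below is hypothetical: the function name `smooth_proximities` and the fixed weight `alpha` are assumptions for illustration, whereas the abstract describes estimating the smoothing parameter adaptively via shrinkage, which is not reproduced here.

```python
import numpy as np


def smooth_proximities(proximities, alpha=0.5):
    """Exponentially smooth a time-ordered sequence of proximity matrices.

    `alpha` (assumed fixed here) weights the previous smoothed estimate;
    `1 - alpha` weights the proximity matrix observed at the current step.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    smoothed = []
    previous = None
    for current in proximities:
        current = np.asarray(current, dtype=float)
        if previous is None:
            # No history is available at the first time step.
            estimate = current
        else:
            estimate = alpha * previous + (1.0 - alpha) * current
        smoothed.append(estimate)
        previous = estimate
    return smoothed
```

Each smoothed matrix could then be passed to any static method that accepts proximities, such as hierarchical or spectral clustering.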

artificial intelligence, experiment, machine learning

doi: 10.1007/s10618-012-0302-x

1104.199

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.81)

Industry:

Banking & Finance > Trading (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)