AITopics | Rinaldo, Alessandro

Plotting

Rinaldo, Alessandro

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Conformal Prediction Approach to Explore Functional Data

Lei, Jing, Rinaldo, Alessandro, Wasserman, Larry

arXiv.org Machine LearningFeb-26-2013

This paper applies conformal prediction techniques to compute simultaneous prediction bands and clustering trees for functional data. These tools can be used to detect outliers and clusters. Both our prediction bands and clustering trees provide prediction sets for the underlying stochastic process with a guaranteed finite sample behavior, under no distributional assumptions. The prediction sets are also informative in that they correspond to the high density region of the underlying process. While ordinary conformal prediction has high computational cost for functional data, we use the inductive conformal predictor, together with several novel choices of conformity scores, to simplify the computation. Our methods are illustrated on some real data examples.

artificial intelligence, functional data, machine learning, (16 more...)

arXiv.org Machine Learning

1302.6452

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Distribution-Free Distribution Regression

Poczos, Barnabas, Rinaldo, Alessandro, Singh, Aarti, Wasserman, Larry

arXiv.org Machine LearningFeb-1-2013

`Distribution regression' refers to the situation where a response Y depends on a covariate P where P is a probability distribution. The model is Y=f(P) + mu where f is an unknown regression function and mu is a random error. Typically, we do not observe P directly, but rather, we observe a sample from P. In this paper we develop theory and methods for distribution-free versions of distribution regression. This means that we do not make distributional assumptions about the error term mu and covariate P. We prove that when the effective dimension is small enough (as measured by the doubling dimension), then the excess prediction risk converges to zero with a polynomial rate.

artificial intelligence, estimator, machine learning, (15 more...)

arXiv.org Machine Learning

1302.0082

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Changepoint Detection over Graphs with the Spectral Scan Statistic

Sharpnack, James, Rinaldo, Alessandro, Singh, Aarti

arXiv.org Machine LearningJun-4-2012

We consider the change-point detection problem of deciding, based on noisy measurements, whether an unknown signal over a given graph is constant or is instead piecewise constant over two connected induced subgraphs of relatively low cut size. We analyze the corresponding generalized likelihood ratio (GLR) statistics and relate it to the problem of finding a sparsest cut in a graph. We develop a tractable relaxation of the GLR statistic based on the combinatorial Laplacian of the graph, which we call the spectral scan statistic, and analyze its properties. We show how its performance as a testing procedure depends directly on the spectrum of the graph, and use this result to explicitly derive its asymptotic properties on few significant graph topologies. Finally, we demonstrate both theoretically and by simulations that the spectral scan statistic can outperform naive testing procedures based on edge thresholding and $\chi^2$ testing.

artificial intelligence, graph, health & medicine, (15 more...)

arXiv.org Machine Learning

1206.0773

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology (0.68)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Networks (0.68)

Add feedback

Differential Privacy for Functions and Functional Data

Hall, Rob, Rinaldo, Alessandro, Wasserman, Larry

arXiv.org Machine LearningMar-12-2012

Differential privacy is a framework for privately releasing summaries of a database. Previous work has focused mainly on methods for which the output is a finite dimensional vector, or an element of some discrete set. We develop methods for releasing functions while preserving differential privacy. Specifically, we show that adding an appropriate Gaussian process to the function of interest yields differential privacy. When the functions lie in the same RKHS as the Gaussian process, then the correct noise level is established by measuring the "sensitivity" of the function in the RKHS norm. As examples we consider kernel density estimation, kernel support vector machines, and functions in reproducing kernel Hilbert spaces.

artificial intelligence, differential privacy, survey article, (16 more...)

arXiv.org Machine Learning

1203.257

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Add feedback

Minimax Localization of Structural Information in Large Noisy Matrices

Kolar, Mladen, Balakrishnan, Sivaraman, Rinaldo, Alessandro, Singh, Aarti

Neural Information Processing SystemsDec-31-2011

We consider the problem of identifying a sparse set of relevant columns and rows in a large data matrix with highly corrupted entries. This problem of identifying groups from a collection of bipartite variables such as proteins and drugs, biological species and gene sequences, malware and signatures, etc is commonly referred to as biclustering or co-clustering. Despite its great practical relevance, and although several ad-hoc methods are available for biclustering, theoretical analysis of the problem is largely non-existent. The problem we consider is also closely related to structured multiple hypothesis testing, an area of statistics that has recently witnessed a flurry of activity. We make the following contributions: i) We prove lower bounds on the minimum signal strength needed for successful recovery of a bicluster as a function of the noise variance, size of the matrix and bicluster of interest. ii) We show that a combinatorial procedure based on the scan statistic achieves this optimal limit. iii) We characterize the SNR required by several computationally tractable procedures for biclustering including element-wise thresholding, column/row average thresholding and a convex relaxation approach to sparse singular vector decomposition.

artificial intelligence, bicluster, health & medicine, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.85)

Add feedback

Minimax Rates for Homology Inference

Balakrishnan, Sivaraman, Rinaldo, Alessandro, Sheehy, Don, Singh, Aarti, Wasserman, Larry

arXiv.org Machine LearningDec-23-2011

Often, high dimensional data lie close to a low-dimensional submanifold and it is of interest to understand the geometry of these submanifolds. The homology groups of a manifold are important topological invariants that provide an algebraic summary of the manifold. These groups contain rich topological information, for instance, about the connected components, holes, tunnels and sometimes the dimension of the manifold. In this paper, we consider the statistical problem of estimating the homology of a manifold from noisy samples under several different noise models. We derive upper and lower bounds on the minimax risk for this problem. Our upper bounds are based on estimators which are constructed from a union of balls of appropriate radius around carefully selected points. In each case we establish complementary lower bounds using Le Cam's lemma.

artificial intelligence, health & medicine, homology, (19 more...)

arXiv.org Machine Learning

1112.5627

Country: Europe (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.62)

Add feedback

Stability of Density-Based Clustering

Rinaldo, Alessandro, Singh, Aarti, Nugent, Rebecca, Wasserman, Larry

arXiv.org Machine LearningNov-11-2010

High density clusters can be characterized by the connected components of a level set $L(\lambda) = \{x:\ p(x)>\lambda\}$ of the underlying probability density function $p$ generating the data, at some appropriate level $\lambda\geq 0$. The complete hierarchical clustering can be characterized by a cluster tree ${\cal T}= \bigcup_{\lambda} L(\lambda)$. In this paper, we study the behavior of a density level set estimate $\widehat L(\lambda)$ and cluster tree estimate $\widehat{\cal{T}}$ based on a kernel density estimator with kernel bandwidth $h$. We define two notions of instability to measure the variability of $\widehat L(\lambda)$ and $\widehat{\cal{T}}$ as a function of $h$, and investigate the theoretical properties of these instability measures.

artificial intelligence, instability, machine learning, (14 more...)

arXiv.org Machine Learning

1011.2771

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > New York (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

On the Geometry of Discrete Exponential Families with Application to Exponential Random Graph Models

Fienberg, Stephen E., Rinaldo, Alessandro, Zhou, Yi

arXiv.org Machine LearningDec-30-2008

There has been an explosion of interest in statistical models for analyzing network data, and considerable interest in the class of exponential random graph (ERG) models, especially in connection with difficulties in computing maximum likelihood estimates. The issues associated with these difficulties relate to the broader structure of discrete exponential families. This paper re-examines the issues in two parts. First we consider the closure of $k$-dimensional exponential families of distribution with discrete base measure and polyhedral convex support $\mathrm{P}$. We show that the normal fan of $\mathrm{P}$ is a geometric object that plays a fundamental role in deriving the statistical and geometric properties of the corresponding extended exponential families. We discuss its relevance to maximum likelihood estimation, both from a theoretical and computational standpoint. Second, we apply our results to the analysis of ERG models. In particular, by means of a detailed example, we provide some characterization of the properties of ERG models, and, in particular, of certain behaviors of ERG models known as degeneracy.

artificial intelligence, erg model, télécommunications, (18 more...)

arXiv.org Machine Learning

0901.0026

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Networks (0.48)
Telecommunications > Networks (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback