
Collaborating Authors

 Rinaldo, Alessandro


Statistical Inference for Cluster Trees

Neural Information Processing Systems

A cluster tree provides an intuitive summary of a density function that reveals essential structure about its high-density clusters. In practice, the true cluster tree must be estimated from a finite sample drawn from an unknown density. This paper addresses the basic question of quantifying our uncertainty by assessing the statistical significance of different features of an empirical cluster tree. We first study a variety of metrics that can be used to compare different trees, analyzing their properties and assessing their suitability for our inference task. We then propose methods to construct and summarize confidence sets for the unknown true cluster tree. We introduce a partial ordering on cluster trees which we use to prune some of the statistically insignificant features of the empirical tree, yielding interpretable and parsimonious cluster trees. Finally, we provide a variety of simulations to illustrate our proposed methods and further demonstrate their utility in the analysis of a Graft-versus-Host Disease (GvHD) data set.
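
As a toy illustration of the confidence-set construction described above (a minimal sketch of my own, not the authors' code: the mixture data, grid, bootstrap size, and 95% level are all arbitrary choices), one can bootstrap the sup-norm deviation of a kernel density estimate in one dimension and treat the resulting quantile as the radius of a confidence ball around the density behind the empirical tree:

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])
grid = np.linspace(-5, 5, 500)
f_hat = gaussian_kde(X)(grid)  # kernel density estimate of the unknown density

B, sup_devs = 200, []
for _ in range(B):
    Xb = rng.choice(X, size=X.size, replace=True)  # bootstrap resample
    sup_devs.append(np.max(np.abs(gaussian_kde(Xb)(grid) - f_hat)))

# Radius of an approximate 95% sup-norm confidence ball around f_hat; tree
# features at a smaller scale than this are statistically insignificant.
radius = np.quantile(sup_devs, 0.95)
print(f"confidence-ball radius: {radius:.4f}")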


Statistical Analysis of Persistence Intensity Functions

arXiv.org Machine Learning

Persistence diagrams are two-dimensional plots that summarize the topological features of functions and are an important part of topological data analysis. A problem that has received much attention is how to deal with sets of persistence diagrams: how do we summarize them, average them, or cluster them? One approach -- the persistence intensity function -- was introduced informally by Edelsbrunner, Ivanov, and Karasev (2012). Here we provide a modification and formalization of this approach. Using the persistence intensity function, we can visualize multiple diagrams, perform clustering and conduct two-sample tests.
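
A minimal sketch of one natural form such an intensity function can take (my own illustration; the Gaussian kernel, bandwidth h, and toy diagrams are assumptions, not the paper's exact definition): pool the (birth, death) points of several diagrams and kernel-smooth them on a grid:

import numpy as np

def intensity(diagrams, grid_b, grid_d, h=0.1):
    """Average Gaussian-kernel intensity over a list of persistence
    diagrams, each an (m, 2) array of (birth, death) pairs."""
    bb, dd = np.meshgrid(grid_b, grid_d, indexing="ij")
    out = np.zeros_like(bb)
    for dgm in diagrams:
        for b, d in dgm:
            out += np.exp(-((bb - b) ** 2 + (dd - d) ** 2) / (2 * h ** 2))
    return out / (2 * np.pi * h ** 2 * len(diagrams))  # normalize kernels, average over diagrams

# Toy usage: two small diagrams, intensity evaluated on the unit square.
dgms = [np.array([[0.1, 0.8], [0.2, 0.4]]), np.array([[0.15, 0.75]])]
g = np.linspace(0, 1, 50)
I = intensity(dgms, g, g)
print(I.shape, round(float(I.max()), 2))

Because the output is an ordinary function on a grid, averaging, clustering, and two-sample comparisons of diagram sets reduce to standard operations on these arrays.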


Consistency of spectral clustering in stochastic block models

arXiv.org Machine Learning

We analyze the performance of spectral clustering for community extraction in stochastic block models. We show that, under mild conditions, spectral clustering applied to the adjacency matrix of the network can consistently recover hidden communities even when the order of the maximum expected degree is as small as $\log n$, with $n$ the number of nodes. This result applies to some popular polynomial time spectral clustering algorithms and is further extended to degree corrected stochastic block models using a spherical $k$-median spectral clustering method. A key component of our analysis is a combinatorial bound on the spectrum of binary random matrices, which is sharper than the conventional matrix Bernstein inequality and may be of independent interest.
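
The basic pipeline analyzed here is easy to simulate. Below is a hedged sketch (not the paper's exact algorithm or constants; the block probabilities, network size, and use of scikit-learn's KMeans are my choices) of adjacency spectral clustering on a two-block model:

import numpy as np
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n, K = 400, 2
z = np.repeat([0, 1], n // 2)                       # hidden community labels
P = np.where(z[:, None] == z[None, :], 0.10, 0.02)  # within/between-block edge probabilities
A = np.triu(rng.random((n, n)) < P, 1).astype(float)
A = A + A.T                                         # symmetric adjacency, no self-loops

_, vecs = eigsh(A, k=K, which="LA")                 # top-K eigenvectors of the adjacency matrix
labels = KMeans(n_clusters=K, n_init=10).fit_predict(vecs)
# For K = 2, account for label switching when scoring recovery.
print(f"fraction recovered: {max(np.mean(labels == z), np.mean(labels != z)):.3f}")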


Statistical Models for Degree Distributions of Networks

arXiv.org Machine Learning

We define and study statistical models in exponential family form whose sufficient statistics are the degree distributions and the bi-degree distributions of undirected labelled simple graphs. Graphs that are constrained by the joint degree distributions are called $dK$-graphs in the computer science literature, and this paper attempts to provide the first statistically grounded analysis of this type of model. In addition to formalizing these models, we provide some preliminary results for the parameter estimation and the asymptotic behaviour of the model for degree distribution, and discuss the parameter estimation for the model for bi-degree distribution.
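
For concreteness, a small sketch of the sufficient statistic in the degree-distribution model (my own illustration; the function name is hypothetical): the vector counting how many nodes have each possible degree:

import numpy as np

def degree_distribution(A):
    """Sufficient statistic for the degree-distribution model:
    entry k = number of nodes of degree k in the simple graph A."""
    degrees = A.sum(axis=1).astype(int)
    return np.bincount(degrees, minlength=A.shape[0])

# Toy 4-node graph: the path 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
print(degree_distribution(A))  # -> [0 2 2 0]: two nodes of degree 1, two of degree 2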


Cluster Trees on Manifolds

Neural Information Processing Systems

We investigate the problem of estimating the cluster tree for a density $f$ supported on or near a smooth $d$-dimensional manifold $M$ isometrically embedded in $\mathbb{R}^D$. We study a $k$-nearest neighbor based algorithm recently proposed by Chaudhuri and Dasgupta. Under mild assumptions on $f$ and $M$, we obtain rates of convergence that depend on $d$ only but not on the ambient dimension $D$. We also provide a sample complexity lower bound for a natural class of clustering algorithms that use $D$-dimensional neighborhoods.
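
A condensed sketch of the k-nearest-neighbor scheme in the style of Chaudhuri and Dasgupta's robust single linkage (my own rendering with assumed parameter names, not the exact algorithm analyzed in the paper): each point becomes active at scale r once its k-NN radius falls below r, active points within distance alpha * r are linked, and the nested components across scales trace out the cluster tree:

import numpy as np
from scipy.spatial.distance import pdist, squareform

def knn_cluster_levels(X, k=5, alpha=np.sqrt(2)):
    D = squareform(pdist(X))
    r_k = np.sort(D, axis=1)[:, k]             # distance to the k-th nearest neighbor
    parent = list(range(len(X)))
    def find(i):                               # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    levels = []
    for r in np.sort(r_k):                     # sweep the scale parameter upward
        idx = np.where(r_k <= r)[0]            # points active at scale r
        for a in idx:
            for b in idx:
                if a < b and D[a, b] <= alpha * r:
                    parent[find(a)] = find(b)  # link nearby active points
        levels.append((r, len({find(i) for i in idx})))
    return levels                              # (scale, number of clusters) pairs

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(m, 0.3, (50, 2)) for m in (-2, 2)])
for r, c in knn_cluster_levels(X)[::25]:
    print(f"r = {r:.3f}: {c} clusters")

The point of the paper's analysis is that for data on a d-dimensional manifold, the scales at which this sweep recovers the true clusters depend on d, not on the ambient dimension D.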


Estimating Undirected Graphs Under Weak Assumptions

arXiv.org Machine Learning

We consider the problem of providing nonparametric confidence guarantees for undirected graphs under weak assumptions. In particular, we do not assume sparsity, incoherence or Normality. We allow the dimension $D$ to increase with the sample size $n$. First, we prove lower bounds showing that if we want accurate inferences under weak assumptions, then there are limitations on the dimension as a function of sample size. When the dimension increases slowly with sample size, we show that methods based on Normal approximations and on the bootstrap lead to valid inferences, and we provide Berry-Esseen bounds on the accuracy of the Normal approximation. When the dimension is large relative to sample size, accurate inferences for graphs under weak assumptions are not possible. Instead we propose to estimate something less demanding than the entire partial correlation graph. In particular, we consider: cluster graphs, restricted partial correlation graphs and correlation graphs.
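
In the slowly-growing-dimension regime, the bootstrap procedure can be caricatured as follows (a simplification of my own: percentile intervals on raw partial correlations, without the centering and studentization behind the paper's Berry-Esseen analysis): keep the edges whose bootstrap confidence interval excludes zero:

import numpy as np

def partial_corr(X):
    Omega = np.linalg.inv(np.cov(X, rowvar=False))  # precision matrix
    d = np.sqrt(np.diag(Omega))
    return -Omega / np.outer(d, d)                  # partial correlation matrix

rng = np.random.default_rng(3)
n, D = 500, 5
X = rng.multivariate_normal(np.zeros(D), np.eye(D), size=n)
X[:, 1] += 0.5 * X[:, 0]                            # plant one true conditional dependence

B = 200
boots = np.array([partial_corr(X[rng.integers(0, n, n)]) for _ in range(B)])
lo, hi = np.quantile(boots, [0.025, 0.975], axis=0)
edges = (lo > 0) | (hi < 0)                         # keep edges whose CI excludes zero
np.fill_diagonal(edges, False)
print(np.argwhere(np.triu(edges)))                  # estimated edge set; the planted edge (0, 1) should appear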


DeBaCl: A Python Package for Interactive DEnsity-BAsed CLustering

arXiv.org Machine Learning

The level set tree approach of Hartigan (1975) provides a probabilistically based and highly interpretable encoding of the clustering behavior of a dataset. By representing the hierarchy of data modes as a dendrogram of the level sets of a density estimator, this approach offers many advantages for exploratory analysis and clustering, especially for complex and high-dimensional data. Several R packages exist for level set tree estimation, but their practical usefulness is limited by computational inefficiency, absence of interactive graphical capabilities and, from a theoretical perspective, reliance on asymptotic approximations. To make it easier for practitioners to capture the advantages of level set trees, we have written the Python package DeBaCl for DEnsity-BAsed CLustering. In this article we illustrate how DeBaCl's level set tree estimates can be used for difficult clustering tasks and interactive graphical data analysis. The package is intended to promote the practical use of level set trees through improvements in computational efficiency and a high degree of user customization. In addition, the flexible algorithms implemented in DeBaCl enjoy finite sample accuracy, as demonstrated in recent literature on density clustering. Finally, we show that the level set tree framework can be easily extended to deal with functional data.
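
Rather than guess at DeBaCl's API, here is a hand-rolled illustration of the level set tree idea it implements (all tuning choices, the crude k-NN density estimate, and the connectivity radius eps are mine): threshold a density estimate at increasing levels and count connected components of the surviving points:

import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def level_set_components(X, k=10, n_levels=5, eps=0.5):
    r_k = cKDTree(X).query(X, k=k + 1)[0][:, -1]
    density = k / (len(X) * r_k ** X.shape[1])  # crude k-NN density estimate (up to constants)
    for lam in np.quantile(density, np.linspace(0.1, 0.9, n_levels)):
        keep = np.where(density >= lam)[0]      # upper level set {density >= lam}
        pairs = cKDTree(X[keep]).query_pairs(eps, output_type="ndarray")
        adj = csr_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])),
                         shape=(len(keep), len(keep)))
        n_comp, _ = connected_components(adj, directed=False)
        print(f"level {lam:.3f}: {len(keep)} points, {n_comp} clusters")

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(-2, 0.4, (100, 2)), rng.normal(2, 0.4, (100, 2))])
level_set_components(X)

Stacking these per-level component counts, with the inclusion relations between components at successive levels, is exactly the dendrogram structure the package exposes.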


Tight Lower Bounds for Homology Inference

arXiv.org Machine Learning

The homology groups of a manifold are important topological invariants that provide an algebraic summary of the manifold. These groups contain rich topological information, for instance, about the connected components, holes, tunnels and sometimes the dimension of the manifold. In earlier work, we have considered the statistical problem of estimating the homology of a manifold from noiseless samples and from noisy samples under several different noise models. We derived upper and lower bounds on the minimax risk for this problem. In this note we revisit the noiseless case. In previous work we used Le Cam's lemma to establish a lower bound that differed from the upper bound of Niyogi, Smale and Weinberger by a polynomial factor in the condition number. In this note we use a different construction based on a direct analysis of the likelihood ratio test to show that the upper bound of Niyogi, Smale and Weinberger is in fact tight, thus establishing rate-optimal asymptotic minimax bounds for the problem. The techniques we use here extend in a straightforward way to the noisy settings considered in our earlier work.


Cluster Trees on Manifolds

arXiv.org Machine Learning

In this paper we investigate the problem of estimating the cluster tree for a density $f$ supported on or near a smooth $d$-dimensional manifold $M$ isometrically embedded in $\mathbb{R}^D$. We analyze a modified version of a $k$-nearest neighbor based algorithm recently proposed by Chaudhuri and Dasgupta. The main results of this paper show that under mild assumptions on $f$ and $M$, we obtain rates of convergence that depend on $d$ only but not on the ambient dimension $D$. We also show that similar (albeit non-algorithmic) results can be obtained for kernel density estimators. We sketch a construction of a sample complexity lower bound instance for a natural class of manifold-oblivious clustering algorithms. We further briefly consider the known-manifold case and show that in this case a spatially adaptive algorithm achieves better rates.


Recovering Block-structured Activations Using Compressive Measurements

arXiv.org Machine Learning

We consider the problems of detection and localization of a contiguous block of weak activation in a large matrix, from a small number of noisy, possibly adaptive, compressive (linear) measurements. This is closely related to the problem of compressed sensing, where the task is to estimate a sparse vector using a small number of linear measurements. Contrary to results in compressed sensing, where it has been shown that neither adaptivity nor contiguous structure help much, we show that for reliable localization the magnitude of the weakest signals is strongly influenced by both structure and the ability to choose measurements adaptively, whereas for detection neither adaptivity nor structure reduces the requirement on the magnitude of the signal. We characterize the precise tradeoffs between the various problem parameters, the signal strength and the number of measurements required to reliably detect and localize the block of activation. The sufficient conditions are complemented with information-theoretic lower bounds.
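
A toy simulation of the detection side of this problem (entirely my own sketch: Gaussian sensing matrices, an energy test statistic, and all parameter values are assumptions, not the paper's procedure) shows how a contiguous active block inflates the measurement energy:

import numpy as np

rng = np.random.default_rng(5)
n, m, mu = 64, 100, 0.8                    # matrix side, number of measurements, signal strength

def measure(M, sensing):
    # y_i = <A_i, M> + standard Gaussian noise (non-adaptive sensing)
    return np.array([(A * M).sum() for A in sensing]) + rng.normal(0, 1, len(sensing))

sensing = rng.normal(0, 1, (m, n, n))      # random Gaussian sensing matrices
M0 = np.zeros((n, n))                      # null: no activation
M1 = np.zeros((n, n))
M1[10:20, 30:40] = mu                      # alternative: one contiguous active block

y0, y1 = measure(M0, sensing), measure(M1, sensing)
# Under the null, sum(y**2) concentrates around m; the active block inflates it.
print(f"null energy: {np.sum(y0**2):.0f}, alternative energy: {np.sum(y1**2):.0f}")

Localization, by contrast, would require deciding which block is active, and this is where the paper shows that adaptive measurement design and the contiguity of the block substantially lower the required signal strength.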