AITopics | Acharya, Jayadev

Collaborating Authors

Acharya, Jayadev

Differentially Private Testing of Identity and Closeness of Discrete Distributions

Acharya, Jayadev, Sun, Ziteng, Zhang, Huanyu

Neural Information Processing Systems, Dec-31-2018

We study the fundamental problems of identity testing (goodness of fit) and closeness testing (two-sample test) of distributions over $k$ elements, under differential privacy. While the problems have a long history in statistics, finite-sample bounds for these problems have only been established recently. In this work, we derive upper and lower bounds on the sample complexity of both problems under $(\varepsilon, \delta)$-differential privacy. We provide optimal sample complexity algorithms for the identity testing problem for all parameter ranges, and the first results for closeness testing. Our closeness testing bounds are optimal in the sparse regime where the number of samples is at most $k$. Our upper bounds are obtained by privatizing non-private estimators for these problems. The non-private estimators are chosen to have small sensitivity. We propose a general framework to establish lower bounds on the sample complexity of statistical tasks under differential privacy. We show a bound on differentially private algorithms in terms of a coupling between the two hypothesis classes we aim to test. By constructing carefully chosen priors over the hypothesis classes, and using Le Cam's two-point theorem, we provide a general mechanism for proving lower bounds. We believe that the framework can be used to obtain strong lower bounds for other statistical tasks under privacy.

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > New York (0.14)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Data Science (0.69)
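
The abstract above states that the upper bounds come from privatizing non-private estimators with small sensitivity. The following is a minimal Python sketch of that general idea, not the paper's algorithm: it assumes the standard Laplace mechanism (which gives pure $\varepsilon$-differential privacy, i.e. $\delta = 0$) and uses a stand-in statistic. The names `l1_statistic`, `laplace_noise`, `private_identity_test`, and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def l1_statistic(samples, q):
    """Half the L1 distance between observed counts and expected counts n*q[i].

    Replacing one sample moves one unit of count between two symbols, so this
    value changes by at most 1 (sensitivity 1). This is a stand-in statistic,
    assumed here for illustration only.
    """
    n = len(samples)
    counts = Counter(samples)
    return 0.5 * sum(abs(counts.get(i, 0) - n * qi) for i, qi in enumerate(q))


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_identity_test(samples, q, epsilon, threshold, rng=None):
    """Return True (accept 'p = q') if the noisy statistic is below threshold.

    Adding Laplace noise of scale sensitivity/epsilon = 1/epsilon to a
    sensitivity-1 statistic is the standard Laplace mechanism. The threshold
    is a free parameter of this sketch, not a value taken from the paper.
    """
    rng = rng or random.Random()
    noisy = l1_statistic(samples, q) + laplace_noise(1.0 / epsilon, rng)
    return noisy <= threshold


if __name__ == "__main__":
    rng = random.Random(0)
    uniform = [0.5, 0.5]
    balanced = [0, 1] * 500   # matches the uniform reference distribution
    skewed = [0] * 1000       # far from the uniform reference distribution
    print(private_identity_test(balanced, uniform, 1.0, 100.0, rng))
    print(private_identity_test(skewed, uniform, 1.0, 100.0, rng))
```

The statistic is computed exactly and noise is added only once, at the end, so the privacy cost does not depend on the alphabet size $k$ in this sketch.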

Learning and Testing Causal Models with Interventions

Acharya, Jayadev, Bhattacharyya, Arnab, Daskalakis, Constantinos, Kandasamy, Saravanan

Neural Information Processing Systems, Dec-31-2018

We consider testing and learning problems on causal Bayesian networks as defined by Pearl (Pearl, 2009). Given a causal Bayesian network $\mathcal{M}$ on a graph with $n$ discrete variables and bounded in-degree and bounded "confounded components", we show that $O(\log n)$ interventions on an unknown causal Bayesian network $\mathcal{X}$ on the same graph, and $\tilde{O}(n/\epsilon^2)$ samples per intervention, suffice to efficiently distinguish whether $\mathcal{X}=\mathcal{M}$ or whether there exists some intervention under which $\mathcal{X}$ and $\mathcal{M}$ are farther than $\epsilon$ in total variation distance. We also obtain sample/time/intervention efficient algorithms for: (i) testing the identity of two unknown causal Bayesian networks on the same graph; and (ii) learning a causal Bayesian network on a given graph. Although our algorithms are non-adaptive, we show that adaptivity does not help in general: $\Omega(\log n)$ interventions are necessary for testing the identity of two unknown causal Bayesian networks on the same graph, even adaptively. Our algorithms are enabled by a new subadditivity inequality for the squared Hellinger distance between two causal Bayesian networks.

bayesian inference, health & medicine, intervention, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
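
The abstract above credits a new subadditivity inequality for the squared Hellinger distance between causal Bayesian networks. That inequality is not reproduced here. As a hedged illustration of what "subadditivity" means, the Python sketch below checks the classical special case of product distributions (a network with no edges), where the squared Hellinger distance of the joint is at most the sum over coordinates. The function names and the example marginals are assumptions made for this sketch.

```python
import math
from itertools import product


def hellinger_sq(p, q):
    """Squared Hellinger distance, using the convention 1 - sum_i sqrt(p_i q_i)."""
    return 1.0 - sum(math.sqrt(a * b) for a, b in zip(p, q))


def product_distribution(marginals):
    """Joint pmf of independent coordinates, listed in itertools.product order."""
    joint = []
    for outcome in product(*[range(len(m)) for m in marginals]):
        joint.append(math.prod(m[x] for m, x in zip(marginals, outcome)))
    return joint


if __name__ == "__main__":
    # Hypothetical marginals for three independent binary variables.
    p_marginals = [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]
    q_marginals = [[0.6, 0.4], [0.3, 0.7], [0.8, 0.2]]

    joint_h2 = hellinger_sq(product_distribution(p_marginals),
                            product_distribution(q_marginals))
    sum_h2 = sum(hellinger_sq(p, q) for p, q in zip(p_marginals, q_marginals))

    # For products, 1 - H^2(joint) equals the product of (1 - H^2) over the
    # coordinates, which implies H^2(joint) <= sum of per-coordinate H^2.
    print(joint_h2 <= sum_h2)
```

Bounds of this shape let a tester reason about small local pieces of a distribution instead of the exponentially large joint; how the paper extends this to networks with edges and confounding is beyond this sketch.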

Learning and Testing Causal Models with Interventions

Acharya, Jayadev, Bhattacharyya, Arnab, Daskalakis, Constantinos, Kandasamy, Saravanan

arXiv.org Artificial Intelligence, May-24-2018

We consider testing and learning problems on causal Bayesian networks as defined by Pearl (Pearl, 2009). Given a causal Bayesian network $\mathcal{M}$ on a graph with $n$ discrete variables and bounded in-degree and bounded "confounded components", we show that $O(\log n)$ interventions on an unknown causal Bayesian network $\mathcal{X}$ on the same graph, and $\tilde{O}(n/\epsilon^2)$ samples per intervention, suffice to efficiently distinguish whether $\mathcal{X}=\mathcal{M}$ or whether there exists some intervention under which $\mathcal{X}$ and $\mathcal{M}$ are farther than $\epsilon$ in total variation distance. We also obtain sample/time/intervention efficient algorithms for: (i) testing the identity of two unknown causal Bayesian networks on the same graph; and (ii) learning a causal Bayesian network on a given graph. Although our algorithms are non-adaptive, we show that adaptivity does not help in general: $\Omega(\log n)$ interventions are necessary for testing the identity of two unknown causal Bayesian networks on the same graph, even adaptively. Our algorithms are enabled by a new subadditivity inequality for the squared Hellinger distance between two causal Bayesian networks.

bayesian inference, health & medicine, intervention, (19 more...)

arXiv.org Artificial Intelligence

1805.09697

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Optimal Testing for Properties of Distributions

Acharya, Jayadev, Daskalakis, Constantinos, Kamath, Gautam

Neural Information Processing Systems, Dec-31-2015

Given samples from an unknown distribution, p, is it possible to distinguish whether p belongs to some class of distributions C versus p being far from every distribution in C? This fundamental question has received tremendous attention in Statistics, albeit focusing on asymptotic analysis, as well as in Computer Science, where the emphasis has been on small sample size and computational complexity. Nevertheless, even for basic classes of distributions such as monotone, log-concave, unimodal, and monotone hazard rate, the optimal sample complexity is unknown. We provide a general approach via which we obtain sample-optimal and computationally efficient testers for all these distribution families. At the core of our approach is an algorithm which solves the following problem: Given samples from an unknown distribution p, and a known distribution q, are p and q close in $\chi^2$-distance, or far in total variation distance? The optimality of all testers is established by providing matching lower bounds. Finally, a necessary building block for our tester and important byproduct of our work are the first known computationally efficient proper learners for discrete log-concave and monotone hazard rate distributions. We exhibit the efficacy of our testers via experimental analysis.

algorithm, artificial intelligence, survey article, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
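
The core problem in the abstract above contrasts closeness in $\chi^2$-distance with farness in total variation distance. The Python sketch below only computes the two distances from known probability vectors and checks the standard relation $d_{TV}(p,q) \le \tfrac{1}{2}\sqrt{\chi^2(p,q)}$, which is why the two cases cannot overlap once the thresholds are set suitably. It is not the paper's tester, which works from samples of p; the function names and example vectors are assumptions of this sketch.

```python
import math


def chi2_distance(p, q):
    """Chi-squared distance sum_i (p_i - q_i)^2 / q_i of p from reference q."""
    total = 0.0
    for a, b in zip(p, q):
        if b == 0:
            if a > 0:
                return math.inf  # p puts mass where q has none
            continue
        total += (a - b) ** 2 / b
    return total


def tv_distance(p, q):
    """Total variation distance, i.e. half the L1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    # Hypothetical distributions over three elements.
    p = [0.5, 0.3, 0.2]
    q = [0.4, 0.4, 0.2]
    chi2 = chi2_distance(p, q)
    tv = tv_distance(p, q)
    # Cauchy-Schwarz gives tv <= 0.5 * sqrt(chi2).
    print(chi2, tv, tv <= 0.5 * math.sqrt(chi2))
```

Because a small $\chi^2$-distance forces a small total variation distance, a distribution cannot be both $\chi^2$-close and total-variation-far from q for appropriately separated thresholds.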

Near-Optimal-Sample Estimators for Spherical Gaussian Mixtures

Suresh, Ananda Theertha, Orlitsky, Alon, Acharya, Jayadev, Jafarpour, Ashkan

Neural Information Processing Systems, Dec-31-2014

Many important distributions are high dimensional, and often they can be modeled as Gaussian mixtures. We derive the first sample-efficient polynomial-time estimator for high-dimensional spherical Gaussian mixtures. Based on intuitive spectral reasoning, it approximates mixtures of $k$ spherical Gaussians in $d$ dimensions to within $\ell_1$ distance $\epsilon$ using $\mathcal{O}({dk^9(\log^2 d)}/{\epsilon^4})$ samples and $\mathcal{O}_{k,\epsilon}(d^3\log^5 d)$ computation time. Conversely, we show that any estimator requires $\Omega\bigl({dk}/{\epsilon^2}\bigr)$ samples, hence the algorithm's sample complexity is nearly optimal in the dimension. The implied time-complexity factor $\mathcal{O}_{k,\epsilon}$ is exponential in $k$, but much smaller than previously known. We also construct a simple estimator for one-dimensional Gaussian mixtures that uses $\tilde{\mathcal{O}}(k/\epsilon^2)$ samples and $\tilde{\mathcal{O}}((k/\epsilon)^{3k+1})$ computation time.

algorithm, artificial intelligence, health & medicine, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
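
To make the "nearly optimal in the dimension" claim in the abstract above concrete, one can divide the stated upper bound on the number of samples by the stated lower bound. The short derivation below ignores the constants hidden in the $\mathcal{O}$ and $\Omega$ notation, so it is an order-of-magnitude sketch rather than an exact statement:

```latex
\[
\frac{dk^{9}\log^{2} d \,/\, \epsilon^{4}}{dk \,/\, \epsilon^{2}}
  \;=\; \frac{k^{8}\log^{2} d}{\epsilon^{2}}
\]
```

The factor of $d$ cancels, so for fixed $k$ and $\epsilon$ the gap between the two bounds grows only as $\log^2 d$. The gap in $k$ and $\epsilon$ remains polynomial.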

Near-optimal-sample estimators for spherical Gaussian mixtures

Acharya, Jayadev, Jafarpour, Ashkan, Orlitsky, Alon, Suresh, Ananda Theertha

arXiv.org Machine Learning, Feb-19-2014

Statistical and machine-learning algorithms are frequently applied to high-dimensional data. In many of these applications data is scarce, and often much more costly than computation time. We provide the first sample-efficient polynomial-time estimator for high-dimensional spherical Gaussian mixtures. For mixtures of any $k$ $d$-dimensional spherical Gaussians, we derive an intuitive spectral-estimator that uses $\mathcal{O}_k\bigl(\frac{d\log^2d}{\epsilon^4}\bigr)$ samples and runs in time $\mathcal{O}_{k,\epsilon}(d^3\log^5 d)$, both significantly lower than previously known. The constant factor $\mathcal{O}_k$ is polynomial for sample complexity and is exponential for the time complexity, again much smaller than what was previously known. We also show that $\Omega_k\bigl(\frac{d}{\epsilon^2}\bigr)$ samples are needed for any algorithm. Hence the sample complexity is near-optimal in the number of dimensions. We also derive a simple estimator for one-dimensional mixtures that uses $\mathcal{O}\bigl(\frac{k \log \frac{k}{\epsilon} }{\epsilon^2} \bigr)$ samples and runs in time $\widetilde{\mathcal{O}}\left(\bigl(\frac{k}{\epsilon}\bigr)^{3k+1}\right)$. Our other technical contributions include a faster algorithm for choosing a density estimate from a set of distributions, that minimizes the $\ell_1$ distance to an unknown underlying distribution.

algorithm, artificial intelligence, health & medicine, (19 more...)

arXiv.org Machine Learning

1402.4746

Country: North America > United States > California (0.14)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Tight Bounds on Profile Redundancy and Distinguishability

Acharya, Jayadev, Das, Hirakendu, Orlitsky, Alon

Neural Information Processing Systems, Dec-31-2012

The minimax KL-divergence of any distribution from all distributions in a collection P has several practical implications. In compression, it is called redundancy and represents the least additional number of bits over the entropy needed to encode the output of any distribution in P. In online estimation and learning, it is the lowest expected log-loss regret when guessing a sequence of random values generated by a distribution in P. In hypothesis testing, it upper bounds the largest number of distinguishable distributions in P. Motivated by problems ranging from population estimation to text classification and speech recognition, several machine-learning and information-theory researchers have recently considered label-invariant observations and properties induced by i.i.d.

artificial intelligence, machine learning, redundancy, (16 more...)

Neural Information Processing Systems

Country: North America > United States > New Jersey (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
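
The abstract above defines redundancy as a minimax KL-divergence: the best single coding distribution q is the one minimizing the worst-case divergence from any distribution in the collection P. The Python sketch below illustrates that definition on a toy collection of two binary distributions, using a plain grid search over q. The function names, the grid search, and the example collection are assumptions of this sketch and do not come from the paper, which concerns profiles of i.i.d. sequences.

```python
import math


def kl_bits(p, q):
    """KL divergence D(p || q) in bits, with the convention 0 * log(0/x) = 0."""
    total = 0.0
    for a, b in zip(p, q):
        if a == 0:
            continue
        if b == 0:
            return math.inf
        total += a * math.log2(a / b)
    return total


def worst_case_redundancy(collection, q):
    """Extra bits per symbol, in the worst case over the collection, when coding with q."""
    return max(kl_bits(p, q) for p in collection)


def minimax_on_grid(collection, steps=1000):
    """Grid search over binary coding distributions q = (t, 1 - t).

    Returns (t, redundancy) for the best grid point. Binary alphabets only.
    """
    best_t, best_value = None, math.inf
    for i in range(1, steps):
        t = i / steps
        value = worst_case_redundancy(collection, (t, 1.0 - t))
        if value < best_value:
            best_t, best_value = t, value
    return best_t, best_value


if __name__ == "__main__":
    collection = [(0.25, 0.75), (0.75, 0.25)]  # hypothetical toy collection
    t, redundancy = minimax_on_grid(collection)
    print(t, redundancy)
```

For this symmetric toy collection the search settles on the uniform q, and the resulting value is the number of extra bits per symbol paid for not knowing which of the two distributions generated the data.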
