Dimakis, Alexandros G.
Experimental Design for Cost-Aware Learning of Causal Graphs
Lindgren, Erik, Kocaoglu, Murat, Dimakis, Alexandros G., Vishwanath, Sriram
We consider the minimum-cost intervention design problem: given the essential graph of a causal graph and a cost to intervene on each variable, identify the set of interventions with minimum total cost that can learn any causal graph with the given essential graph. We first show that this problem is NP-hard. We then prove that a greedy algorithm achieves a constant-factor approximation to this problem. We then consider the setting where each intervention is constrained to be sparse. For sparse graphs with sparse interventions, we develop an algorithm that returns an intervention design that is nearly optimal in terms of size, and we discuss how to use it when there are costs on the vertices.
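As an illustration of the greedy step, here is a minimal Python sketch of a cost-effectiveness rule: repeatedly pick the intervention whose marginal gain per unit cost is largest. The function `coverage` is a hypothetical stand-in for the paper's actual objective (distinguishing every causal graph consistent with the essential graph), and the loop structure is our own simplification.

```python
# Hypothetical sketch of a cost-effectiveness greedy rule. `coverage` is a
# stand-in monotone set function for the paper's real identifiability objective.
def greedy_min_cost(candidates, cost, coverage, target):
    """candidates: list of interventions; cost: dict mapping each
    intervention to a positive cost; coverage: monotone set function;
    target: coverage value that certifies full identifiability."""
    chosen = set()
    while coverage(chosen) < target:
        best, best_ratio = None, 0.0
        for c in candidates:
            if c in chosen:
                continue
            gain = coverage(chosen | {c}) - coverage(chosen)
            if cost[c] > 0 and gain / cost[c] > best_ratio:
                best, best_ratio = c, gain / cost[c]
        if best is None:  # no remaining intervention improves coverage
            break
        chosen.add(best)
    return chosen
```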
Entropic Latent Variable Discovery
Kocaoglu, Murat, Shakkottai, Sanjay, Dimakis, Alexandros G., Caramanis, Constantine, Vishwanath, Sriram
We consider the problem of discovering the simplest latent variable that can make two observed discrete variables conditionally independent. This problem has appeared in the literature as probabilistic latent semantic analysis (pLSA) and has connections to non-negative matrix factorization. When the simplicity of the variable is measured by its cardinality, we show that a solution to this latent variable discovery problem can be used to distinguish direct causal relations from spurious correlations among almost all joint distributions on simple causal graphs with two observed variables. Conjecturing that a similar identifiability result holds with Shannon entropy, we study a loss function that trades off the entropy of the latent variable against the conditional mutual information of the observed variables. We then propose a latent variable discovery algorithm -- LatentSearch -- and show that its stationary points are the stationary points of our loss function. We experimentally show that LatentSearch can indeed be used to distinguish direct causal relations from spurious correlations.
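To make the trade-off concrete, the following numpy routine (our own illustration, not the paper's code) evaluates a loss of the form $I(X;Y|Z) + \beta H(Z)$ from a discrete joint tensor $q(x,y,z)$; LatentSearch can be viewed as searching for stationary points of this kind of objective.

```python
import numpy as np

def latent_loss(q, beta):
    """q: joint pmf array of shape (|X|, |Y|, |Z|), entries summing to 1.
    Returns I(X;Y|Z) + beta * H(Z), an entropy/conditional-MI trade-off."""
    pz = q.sum(axis=(0, 1))    # p(z)
    pxz = q.sum(axis=1)        # p(x,z), shape (|X|, |Z|)
    pyz = q.sum(axis=0)        # p(y,z), shape (|Y|, |Z|)
    eps = 1e-12
    # Conditional mutual information I(X;Y|Z):
    # sum over x,y,z of q * log[ q * p(z) / (p(x,z) p(y,z)) ]
    ratio = q * pz[None, None, :] / (pxz[:, None, :] * pyz[None, :, :] + eps)
    cmi = np.sum(q * np.log(ratio + eps))
    hz = -np.sum(pz * np.log(pz + eps))   # entropy of the latent variable
    return cmi + beta * hz
```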
The Sparse Recovery Autoencoder
Wu, Shanshan, Dimakis, Alexandros G., Sanghavi, Sujay, Yu, Felix X., Holtmann-Rice, Daniel, Storcheus, Dmitry, Rostamizadeh, Afshin, Kumar, Sanjiv
Linear encoding of sparse vectors is widely popular, but is most commonly data-independent -- missing any possible extra (but a priori unknown) structure beyond sparsity. In this paper we present a new method to learn linear encoders that adapt to data, while still performing well with the widely used $\ell_1$ decoder. The convex $\ell_1$ decoder prevents gradient propagation as needed in standard autoencoder training. Our method is based on the insight that unfolding the convex decoder into $T$ projected gradient steps can address this issue. Our method can be seen as a data-driven way to learn a compressed sensing matrix. Our experiments show that there is indeed additional structure beyond sparsity in several real datasets. Our autoencoder is able to discover it and exploit it to create excellent reconstructions with fewer measurements than previous state-of-the-art methods.
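A minimal numpy sketch of the unrolling idea follows, assuming projected gradient descent onto an $\ell_1$ ball of fixed radius; the step size `eta`, step count `T`, and radius are illustrative placeholders, not the paper's settings.

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto the l1 ball (Duchi et al., 2008)."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def unrolled_decoder(A, y, T=10, eta=0.1, radius=1.0):
    """T projected-gradient steps for min ||Ax - y||^2 s.t. ||x||_1 <= radius.
    Because every step is (almost everywhere) differentiable, gradients can
    flow back to the encoding matrix A during autoencoder training, which is
    the unrolling insight described in the abstract."""
    x = np.zeros(A.shape[1])
    for _ in range(T):
        x = x - eta * A.T @ (A @ x - y)
        x = project_l1_ball(x, radius)
    return x
```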
Compressed Sensing with Deep Image Prior and Learned Regularization
Van Veen, David, Jalal, Ajil, Price, Eric, Vishwanath, Sriram, Dimakis, Alexandros G.
We propose a novel method for compressed sensing recovery using untrained deep generative models. Our method is based on the recently proposed Deep Image Prior (DIP), wherein the convolutional weights of the network are optimized to match the observed measurements. We show that this approach can be applied to solve any differentiable inverse problem. We also introduce a novel learned regularization technique which incorporates a small amount of prior information, further reducing the number of measurements required for a given reconstruction error. Our algorithm requires approximately 4-6x fewer measurements than classical Lasso methods. Unlike previous approaches based on generative models, our method does not require the model to be pre-trained. As such, we can apply our method to various medical imaging datasets for which data acquisition is expensive and no known generative models exist.
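The core optimization loop can be sketched in a few lines of PyTorch. Everything here (the untrained generator `net`, its fixed random input `z`, the measurement matrix `A`, and the hyperparameters) is an assumption for illustration, and the paper's learned regularization term is omitted.

```python
import torch

def dip_recover(net, z, A, y, steps=1000, lr=1e-3):
    """Deep-Image-Prior-style recovery sketch: optimize the *weights* of an
    untrained generator `net` (fixed input z) so that the measurements of its
    output match y. No pre-training of the generator is required."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = net(z)                                 # candidate image
        loss = ((A @ x.flatten() - y) ** 2).sum()  # measurement misfit
        loss.backward()
        opt.step()
    return net(z).detach()
```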
Streaming Weak Submodularity: Interpreting Neural Networks on the Fly
Elenberg, Ethan R., Dimakis, Alexandros G., Feldman, Moran, Karbasi, Amin
In many machine learning applications, it is important to explain the predictions of a black-box classifier. For example, why does a deep neural network assign an image to a particular class? We cast interpretability of black-box classifiers as a combinatorial maximization problem and propose an efficient streaming algorithm to solve it subject to cardinality constraints. By extending ideas from Badanidiyuru et al. [2014], we provide a constant factor approximation guarantee for our algorithm in the case of random stream order and a weakly submodular objective function. This is the first such theoretical guarantee for this general class of functions, and we also show that no such algorithm exists for a worst case stream order. Our algorithm obtains similar explanations of Inception V3 predictions $10$ times faster than the state-of-the-art LIME framework of Ribeiro et al. [2016].
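A single-threshold sketch conveys the flavor of the streaming rule from Badanidiyuru et al.: keep an arriving element only if its marginal gain clears a threshold. The real algorithm maintains many thresholds in parallel; `f`, `k`, and `tau` below are generic placeholders of ours.

```python
def stream_select(stream, f, k, tau):
    """Single-threshold sketch: keep an arriving element iff its marginal
    gain under objective f is at least tau / k and fewer than k elements
    are kept. The paper's analysis extends such rules to weakly submodular
    f under random stream order."""
    S = []
    for e in stream:
        if len(S) >= k:
            break
        if f(S + [e]) - f(S) >= tau / k:
            S.append(e)
    return S
```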
Model-Powered Conditional Independence Test
Sen, Rajat, Suresh, Ananda Theertha, Shanmugam, Karthikeyan, Dimakis, Alexandros G., Shakkottai, Sanjay
We consider the problem of non-parametric conditional independence testing (CI testing) for continuous random variables. Given i.i.d. samples from the joint distribution $f(x,y,z)$ of continuous random vectors $X, Y$ and $Z$, we determine whether $X \perp Y \mid Z$. We approach this by converting the conditional independence test into a classification problem. This allows us to harness very powerful classifiers, such as gradient-boosted trees and deep neural networks, which can handle complex probability distributions and allow us to perform significantly better than the prior state of the art for high-dimensional CI testing. The main technical challenge in the classification problem is the need for samples from the conditional product distribution $f^{CI}(x,y,z) = f(x|z)f(y|z)f(z)$ -- which coincides with the joint distribution if and only if $X \perp Y \mid Z$ -- when given access only to i.i.d. samples from the true joint distribution $f(x,y,z)$. To tackle this problem we propose a novel nearest-neighbor bootstrap procedure and theoretically show that our generated samples are indeed close to $f^{CI}$ in terms of total variation distance. We then develop generalization bounds for classification in our setting, which translate into error bounds for CI testing. We provide a novel analysis of Rademacher-type classification bounds in the presence of non-i.i.d., near-independent samples. We empirically validate the performance of our algorithm on simulated and real datasets and show performance gains over previous methods.
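The nearest-neighbor bootstrap admits a very short sketch (our simplification, using scikit-learn's `NearestNeighbors`): swap each sample's $y$ with the $y$ of its nearest neighbor in $z$-space, which approximately decouples $X$ and $Y$ given $Z$.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nn_bootstrap(x, y, z):
    """Sketch of a nearest-neighbor bootstrap: for each sample i, replace
    y_i with the y of i's nearest neighbor in z-space (idx[:, 0] is the
    point itself when z values are distinct, so idx[:, 1] is the neighbor).
    The resulting triples approximately follow f(x|z) f(y|z) f(z)."""
    nn = NearestNeighbors(n_neighbors=2).fit(z)
    _, idx = nn.kneighbors(z)
    return x, y[idx[:, 1]], z
```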
The Robust Manifold Defense: Adversarial Training using Generative Models
Ilyas, Andrew, Jalal, Ajil, Asteri, Eirini, Daskalakis, Constantinos, Dimakis, Alexandros G.
Deep neural networks are demonstrating excellent performance on several classical vision problems. However, these networks are vulnerable to adversarial examples: minutely modified images that induce arbitrary, attacker-chosen outputs from the network. We propose a mechanism to protect against these adversarial inputs based on a generative model of the data. We introduce a pre-processing step that projects an input onto the range of a generative model using gradient descent before feeding it into a classifier. We show that this step provides the classifier with robustness against first-order, substitute-model, and combined adversarial attacks. Using a min-max formulation, we show that there may exist adversarial examples even in the range of the generator: natural-looking images extremely close to the decision boundary for which the classifier has unjustifiably high confidence. We show that adversarial training on the generative manifold can be used to make a classifier that is robust to these attacks. Finally, we show how our method can be applied even without a pre-trained generative model, using a recent method called the deep image prior. We evaluate our method on MNIST, CelebA and ImageNet and show robustness against the current state-of-the-art attacks.
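The projection step can be sketched as follows in PyTorch, assuming a generator `G` that maps a flat latent vector to an image; the optimizer, step count, and learning rate are illustrative choices, not the paper's.

```python
import torch

def project_to_range(G, x, z_dim, steps=200, lr=0.05):
    """Pre-processing defense sketch: gradient-descend on the latent z so
    that G(z) matches the (possibly adversarial) input x, then hand the
    projection G(z*) to the classifier instead of x itself."""
    z = torch.zeros(z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((G(z) - x) ** 2).sum()   # distance to the generator's range
        loss.backward()
        opt.step()
    return G(z).detach()
```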
Restricted Strong Convexity Implies Weak Submodularity
Elenberg, Ethan R., Khanna, Rajiv, Dimakis, Alexandros G., Negahban, Sahand
We connect high-dimensional subset selection and submodular maximization. Our results extend the work of Das and Kempe (2011) from the setting of linear regression to arbitrary objective functions. For greedy feature selection, this connection allows us to obtain strong multiplicative performance bounds for several methods without statistical modeling assumptions. We also derive recovery guarantees of this form under standard assumptions. Our work shows that greedy algorithms perform within a constant factor of the best possible subset-selection solution for a broad class of general objective functions. Our methods allow direct control over the number of obtained features, as opposed to regularization parameters that only implicitly control sparsity. Our proof technique uses the concept of weak submodularity initially defined by Das and Kempe. We draw a connection between convex analysis and submodular set function theory which may be of independent interest for other statistical learning applications that have combinatorial structure.
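For concreteness, here is a generic greedy forward-selection loop of the kind these guarantees apply to; `obj` is a placeholder of ours for any monotone set objective (for example, the $R^2$ of a regression restricted to the chosen feature columns).

```python
def greedy_forward_selection(obj, n_features, k):
    """Greedy subset selection: at each step add the feature whose inclusion
    maximizes the objective, until k features are chosen. For a gamma-weakly
    submodular objective, greedy achieves a (1 - e^{-gamma}) multiplicative
    approximation to the best size-k subset."""
    S = []
    for _ in range(k):
        scores = [(obj(S + [j]), j) for j in range(n_features) if j not in S]
        _, j = max(scores)   # feature with the largest marginal improvement
        S.append(j)
    return S
```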