Tarokh, Vahid
Deep Extreme Value Copulas for Estimation and Sampling
Hasan, Ali, Elkhalil, Khalil, Pereira, Joao M., Farsiu, Sina, Blanchet, Jose H., Tarokh, Vahid
Modeling the occurrence of extreme events is an important task in many disciplines, such as medicine, environmental science, engineering, and finance. For example, understanding the probability of a patient having an adverse reaction to medication or the distribution of economic shocks is critical to mitigating the associated effects of these events. However, such events occur rarely and are often difficult to characterize with traditional statistical tools. This has been the primary focus of extreme value theory (EVT), which describes how to extrapolate the occurrence of rare events beyond the range of available data [1]. In the one-dimensional case, EVT provides remarkably simple models for the limiting distribution of the maximum of a growing number of independent and identically distributed (i.i.d.) random variables.
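For reference, a standard statement of this one-dimensional limit law (the Fisher-Tippett-Gnedenko theorem; this formulation is textbook material, not quoted from the paper itself) is:

```latex
% If normalizing sequences a_n > 0 and b_n exist, the rescaled maximum of
% n i.i.d. random variables converges to the generalized extreme value law:
\[
  \lim_{n \to \infty}
  \Pr\!\left( \frac{\max(X_1, \dots, X_n) - b_n}{a_n} \le x \right)
  = G_\xi(x), \qquad
  G_\xi(x) = \exp\!\left( -(1 + \xi x)^{-1/\xi} \right)
\]
% for 1 + \xi x > 0, with the \xi -> 0 case read as the Gumbel law
% G_0(x) = \exp(-e^{-x}).
```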
Task-Aware Neural Architecture Search
Le, Cat P., Soltani, Mohammadreza, Ravier, Robert, Tarokh, Vahid
The design of handcrafted neural networks requires substantial time and resources. Recent techniques in Neural Architecture Search (NAS) have proven competitive with, or better than, traditional handcrafted design, although they require domain knowledge and have generally used limited search spaces. In this paper, we propose a novel framework for neural architecture search that utilizes a dictionary of base-task models and the similarity between the target task and the atoms of the dictionary, thereby generating an adaptive search space based on the dictionary's base models. By introducing a gradient-based search algorithm, we can evaluate and discover the best architecture in the search space without fully training the networks. The experimental results show the efficacy of our proposed task-aware approach.
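A minimal sketch of the dictionary-lookup idea described above, assuming task embeddings are available; the similarity measure (cosine similarity) and the helper names here are illustrative stand-ins, not the paper's actual construction:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def adaptive_search_space(target_embedding, dictionary, top_k=3):
    """dictionary: list of (task_embedding, base_model_spec) pairs."""
    ranked = sorted(
        dictionary,
        key=lambda atom: cosine_similarity(target_embedding, atom[0]),
        reverse=True,
    )
    # Seed the search space with base models of the most similar tasks;
    # a gradient-based search would then refine within this space.
    return [spec for _, spec in ranked[:top_k]]
```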
Fisher Auto-Encoders
Elkhalil, Khalil, Hasan, Ali, Ding, Jie, Farsiu, Sina, Tarokh, Vahid
It has been conjectured that the Fisher divergence is more robust to model uncertainty than the conventional Kullback-Leibler (KL) divergence. This motivates the design of a new class of robust generative auto-encoders (AEs) referred to as Fisher auto-encoders. Our approach is to design Fisher AEs by minimizing the Fisher divergence between the intractable joint distribution of observed data and latent variables and the postulated/modeled joint distribution. In contrast to KL-based variational AEs (VAEs), the Fisher AE can exactly quantify the distance between the true and the model-based posterior distributions. Qualitative and quantitative results are provided on both the MNIST and CelebA datasets, demonstrating the competitive robustness of Fisher AEs compared to other AEs such as VAEs and Wasserstein AEs.
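For context, the standard definition of the Fisher divergence (a well-known formula, not quoted from the paper) compares the score functions of two densities, which is why it can be evaluated without the intractable normalizing constant:

```latex
% Fisher divergence between densities p and q:
\[
  D_{\mathrm{F}}(p \,\|\, q)
  = \mathbb{E}_{x \sim p}\!\left[
      \left\| \nabla_x \log p(x) - \nabla_x \log q(x) \right\|^2
    \right]
\]
% q enters only through its score \nabla_x \log q(x), so any
% multiplicative normalizing constant of q drops out.
```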
Identifying Latent Stochastic Differential Equations with Variational Auto-Encoders
Hasan, Ali, Pereira, João M., Farsiu, Sina, Tarokh, Vahid
Variational auto-encoders (VAEs) are a widely used tool for learning lower-dimensional latent representations of high-dimensional data. However, the learned latent representations often lack interpretability, and it is challenging to extract relevant information from the representation of the dataset in the latent space. In particular, when the high-dimensional data are governed by unknown, lower-dimensional dynamics arising, for instance, from unknown physical or biological interactions, the latent space representation often fails to provide insight into these dynamics. We propose a VAE-based framework for recovering latent dynamics governed by stochastic differential equations (SDEs). Our motivation for using SDEs is that they are already widely used to model physical and biological phenomena and to study financial markets, and their properties have been studied extensively in probability and statistics. We believe this method can be useful for describing trajectories of high-dimensional data with underlying physical or biological dynamics, with applications such as video data, longitudinal medical data, or gene regulatory dynamics.
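A minimal sketch of the generative direction implied here, assuming a latent SDE dz = f(z)dt + g(z)dW simulated by Euler-Maruyama and decoded to observation space; the linear networks below are placeholder stand-ins, not the paper's architecture:

```python
import torch

def euler_maruyama(z0, drift, diffusion, dt=0.01, steps=100):
    """Simulate dz = drift(z) dt + diffusion(z) dW from initial state z0."""
    z, path = z0, [z0]
    for _ in range(steps):
        dw = torch.randn_like(z) * dt ** 0.5    # Brownian increment
        z = z + drift(z) * dt + diffusion(z) * dw
        path.append(z)
    return torch.stack(path)                     # (steps + 1, latent_dim)

latent_dim = 2
drift = torch.nn.Linear(latent_dim, latent_dim)      # placeholder drift net
diffusion = torch.nn.Linear(latent_dim, latent_dim)  # placeholder diffusion net
decoder = torch.nn.Linear(latent_dim, 64)            # latent -> observations

path = euler_maruyama(torch.zeros(latent_dim), drift, diffusion)
observations = decoder(path)                     # high-dimensional trajectory
```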
GeoStat Representations of Time Series for Fast Classification
Ravier, Robert J., Soltani, Mohammadreza, Simões, Miguel, Garagic, Denis, Tarokh, Vahid
Recent advances in time series classification have largely focused on methods that either employ deep learning or use other machine learning models for feature extraction. Though successful, their power often comes at the cost of significant computational complexity. In this paper, we introduce GeoStat representations for time series. GeoStat representations are based on a generalization of recent methods for trajectory classification, and summarize the information of a time series in terms of comprehensive statistics of (possibly windowed) distributions of easy-to-compute differential geometric quantities, requiring no dynamic time warping. The features used are intuitive and require minimal parameter tuning. We perform an exhaustive evaluation of GeoStat on a number of real datasets, showing that simple KNN and SVM classifiers trained on these representations exhibit surprising performance relative to modern single-model methods requiring significant computational power, achieving state-of-the-art results in many cases. In particular, we show that this methodology performs well on a challenging dataset involving the classification of fishing vessels, where our methods remain competitive with the state of the art despite having access to only approximately two percent of the data used to train and evaluate that state-of-the-art method.
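A toy sketch of the feature-then-classify pipeline described above; the specific quantities (finite differences as velocity and acceleration proxies) and summary statistics are illustrative assumptions, not the paper's exact feature set:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def geostat_features(x: np.ndarray) -> np.ndarray:
    """Summarize a 1-D series by statistics of cheap differential quantities."""
    vel = np.diff(x)          # first difference ~ velocity
    acc = np.diff(x, n=2)     # second difference ~ curvature proxy
    feats = []
    for q in (x, vel, acc):
        feats += [q.mean(), q.std(), np.percentile(q, 25),
                  np.median(q), np.percentile(q, 75)]
    return np.array(feats)    # fixed-length vector, no time warping needed

# Usage on a toy labeled dataset of 20 series:
X = np.stack([geostat_features(np.sin(np.linspace(0, k, 100)))
              for k in range(1, 21)])
y = np.array([k % 2 for k in range(1, 21)])
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
```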
HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients
Diao, Enmao, Ding, Jie, Tarokh, Vahid
Federated Learning (FL) is a method of training machine learning models on private data distributed over a large number of possibly heterogeneous clients such as mobile phones and IoT devices. In this work, we propose a new federated learning framework named HeteroFL to address heterogeneous clients equipped with very different computation and communication capabilities. Our solution enables the training of heterogeneous local models with varying computation complexities while still producing a single global inference model. For the first time, our method challenges the underlying assumption of existing work that local models have to share the same architecture as the global model. We demonstrate several strategies to enhance FL training and conduct extensive empirical evaluations, including five computation complexity levels of three model architectures on three datasets. We show that adaptively distributing subnetworks according to clients' capabilities is both computation and communication efficient. Mobile and Internet of Things (IoT) devices are becoming the primary computing resource for billions of users worldwide (Lim et al., 2020).
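A minimal sketch of the subnetwork-distribution idea, assuming local models are width-scaled slices of the global model; this is a simplification for illustration, not the authors' code (in practice input and output dimensions would typically stay at full size):

```python
import numpy as np

def shrink_hidden_weights(global_weights, ratio):
    """global_weights: list of 2-D hidden-layer arrays; ratio in (0, 1]."""
    local = []
    for W in global_weights:
        rows = max(1, int(round(W.shape[0] * ratio)))
        cols = max(1, int(round(W.shape[1] * ratio)))
        local.append(W[:rows, :cols].copy())  # leading sub-block of the global matrix
    return local

# A half-capacity client trains on nested slices of the global parameters,
# so updates from clients of different widths can be aggregated in place.
global_hidden = [np.random.randn(64, 64), np.random.randn(64, 64)]
half_width_client = shrink_hidden_weights(global_hidden, ratio=0.5)
```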
Projected Latent Markov Chain Monte Carlo: Conditional Sampling of Normalizing Flows
Cannella, Chris, Soltani, Mohammadreza, Tarokh, Vahid
We introduce Projected Latent Markov Chain Monte Carlo (PL-MCMC), a technique for sampling from the high-dimensional conditional distributions learned by a normalizing flow. We prove that a Metropolis-Hastings implementation of PL-MCMC asymptotically samples from the exact conditional distributions associated with a normalizing flow. As a conditional sampling method, PL-MCMC enables Monte Carlo Expectation Maximization (MC-EM) training of normalizing flows from incomplete data. Through experimental tests applying normalizing flows to missing data tasks for a variety of data sets, we demonstrate the efficacy of PL-MCMC for conditional sampling from normalizing flows. Conditional sampling from modeled joint probability distributions offers a statistical framework for approaching tasks involving missing and incomplete data. Deep generative models have demonstrated an exceptional capability for approximating the distributions governing complex data. Quite often, otherwise well-trained generative models possess a capability for conditional inference that is regrettably locked away from our access. Normalizing flow architectures like RealNVP (Dinh et al., 2016) and GLOW (Kingma & Dhariwal, 2018) have demonstrated accurate and expressive generative performance and show great promise for application to missing data tasks. Additionally, by enabling the calculation of exact likelihoods, normalizing flows offer convenient mathematical properties for approaching exact conditional sampling. We are therefore motivated to develop techniques for sampling from the exact conditional distributions encoded by normalizing flows. In this paper, we propose Projected Latent Markov Chain Monte Carlo (PL-MCMC), a conditional sampling technique that takes advantage of the convenient mathematical structure of normalizing flows by defining a Markov chain within a flow's latent space and accepting proposed transitions based on the likelihood of the resulting imputation.
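A heavily simplified sketch of the mechanism just described, assuming a trained flow exposing forward, inverse, and log_prob; proposals walk in latent space, are projected so observed coordinates stay fixed, and are accepted by Metropolis-Hastings on the completed sample's likelihood. This omits details of the authors' actual kernel and is not their implementation:

```python
import numpy as np

def pl_mcmc(flow, x_obs, obs_mask, steps=1000, step_size=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = np.where(obs_mask, x_obs, 0.0)               # crude initial imputation
    for _ in range(steps):
        z_prop = flow.forward(x) + step_size * rng.standard_normal(x.shape)
        x_prop = np.where(obs_mask, x_obs, flow.inverse(z_prop))  # project
        if np.log(rng.uniform()) < flow.log_prob(x_prop) - flow.log_prob(x):
            x = x_prop                               # Metropolis-Hastings accept
    return x                                         # completed sample

class IdentityFlow:                                  # stand-in for a trained flow
    def forward(self, x): return x
    def inverse(self, z): return z
    def log_prob(self, x): return -0.5 * float(np.sum(x ** 2))

imputed = pl_mcmc(IdentityFlow(), np.array([1.0, 0.0]), np.array([True, False]))
```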
Model Linkage Selection for Cooperative Learning
Zhou, Jiaying, Ding, Jie, Tan, Kean Ming, Tarokh, Vahid
Rapid developments in data collection devices and computation platforms are producing a growing number of learners and data modalities in many scientific domains. We consider the setting in which each learner holds a parametric statistical model paired with a specific data source, with the goal of integrating information across a set of learners to enhance the prediction accuracy of a specific learner. One natural way to integrate information is to build a joint model across a set of learners that share common parameters of interest. However, the parameter sharing patterns across a set of learners are not known a priori. Misspecifying the parameter sharing patterns or the parametric statistical model for each learner yields a biased estimator and degrades the prediction accuracy of the joint model. In this paper, we propose a novel framework for integrating information across a set of learners that is robust against model misspecification and misspecified parameter sharing patterns. The main idea is to start from a model with one learner and sequentially incorporate additional learners that can enhance the prediction accuracy of the existing joint model, guided by user-specified parameter sharing patterns across the set of learners. Theoretically, we show that the proposed method can data-adaptively select the correct parameter sharing patterns from the user-specified candidates, and thus enhance the prediction accuracy of a learner. Extensive numerical studies are performed to evaluate the performance of the proposed method.
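A hypothetical sketch of the sequential loop described above; the callables fit_joint (fit a joint model under a candidate sharing pattern) and val_loss (score its predictions) are invented stand-ins, and the greedy stopping rule is an assumption, not the paper's exact procedure:

```python
def select_linkages(target, candidates, fit_joint, val_loss):
    """Greedily add learners to the joint model while validation loss improves."""
    selected = [target]                       # start from the target learner alone
    best = val_loss(fit_joint(selected))
    remaining = list(candidates)
    improved = True
    while improved and remaining:
        improved = False
        # Score each candidate learner by the joint model it would yield.
        cand = min(remaining, key=lambda c: val_loss(fit_joint(selected + [c])))
        loss = val_loss(fit_joint(selected + [cand]))
        if loss < best:
            best, improved = loss, True
            selected.append(cand)
            remaining.remove(cand)
    return selected                           # learners linked to the target
```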
On Optimal Generalizability in Parametric Learning
Beirami, Ahmad, Razaviyayn, Meisam, Shahrampour, Shahin, Tarokh, Vahid
We consider the parametric learning problem, where the objective of the learner is determined by a parametric loss function. Employing empirical risk minimization, possibly with regularization, the inferred parameter vector will be biased toward the training samples. In practice, this bias is measured by the cross validation procedure, where the dataset is partitioned into a training set used for training and a validation set that is held out of training to measure the out-of-sample performance. A classical cross validation strategy is leave-one-out cross validation (LOOCV), where one sample is left out for validation, training is done on the remaining samples presented to the learner, and this process is repeated over all of the samples. LOOCV is rarely used in practice due to its high computational complexity.
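To make the complexity point concrete, the standard LOOCV estimator (textbook form, not quoted from the paper) requires refitting the model once per held-out sample:

```latex
% LOOCV for loss \ell, regularizer r, and n samples z_1, ..., z_n:
\[
  \mathrm{LOOCV} = \frac{1}{n} \sum_{i=1}^{n}
    \ell\!\left(z_i; \hat{\theta}^{(-i)}\right),
  \qquad
  \hat{\theta}^{(-i)} = \arg\min_{\theta} \sum_{j \neq i} \ell(z_j; \theta) + r(\theta),
\]
% so n separate optimizations are needed -- the O(n) blow-up in training
% cost that makes LOOCV rare in practice.
```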
Deep Clustering of Compressed Variational Embeddings
Wu, Suya, Diao, Enmao, Ding, Jie, Tarokh, Vahid
Motivated by the ever-increasing demands for limited communication bandwidth and low-power consumption, we propose a new methodology, named joint Variational Autoencoders with Bernoulli mixture models (VAB), for performing clustering in the compressed data domain. The idea is to reduce the data dimension with Variational Autoencoders (VAEs) and to group data representations with Bernoulli mixture models (BMMs). Once jointly trained for compression and clustering, the model can be decomposed into two parts: a data vendor that encodes the raw data into compressed data, and a data consumer that classifies the received (compressed) data. To enable training with the gradient descent algorithm, we propose to use the Gumbel-Softmax distribution to resolve the infeasibility of the back-propagation algorithm when sampling categorical variables. Clustering is a fundamental task with applications in medical imaging, social network analysis, bioinformatics, computer graphics, etc. Applying classical clustering methods directly to high-dimensional data may be computationally inefficient and suffer from instability.
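The Gumbel-Softmax trick referenced here, in its standard form (a generic sketch, not the authors' code): a differentiable surrogate for drawing a categorical sample with class probabilities pi, which approaches a one-hot vector as the temperature tau goes to zero:

```python
import numpy as np

def gumbel_softmax(pi, tau=0.5, seed=None):
    """Soft categorical sample: softmax((log pi + Gumbel noise) / tau)."""
    rng = np.random.default_rng(seed)
    g = -np.log(-np.log(rng.uniform(size=pi.shape)))  # Gumbel(0, 1) noise
    logits = (np.log(pi) + g) / tau
    e = np.exp(logits - logits.max())                 # numerically stable softmax
    return e / e.sum()

soft_assignment = gumbel_softmax(np.array([0.7, 0.2, 0.1]))  # soft one-hot vector
```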