AITopics

#artificialintelligenceNov-28-2019, 07:36:34 GMT

Using Machine Learning to know if your chest pain is the sign of Heart Disease or not

chest pain, classification, heart disease, (12 more...)

#artificialintelligence

Country:

North America > United States > California (0.25)
Europe > Switzerland > Zürich > Zürich (0.16)
Europe > Switzerland > Basel-City > Basel (0.05)
Europe > Hungary > Budapest > Budapest (0.05)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.32)

arXiv.org Artificial IntelligenceNov-28-2019

Option-critic in cooperative multi-agent systems

Chakravorty, Jhelum, Ward, Nadeem, Roy, Julien, Chevalier-Boisvert, Maxime, Basu, Sumana, Lupu, Andrei, Precup, Doina

In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems using the options framework (Sutton et al, 1999) and provide a model-free algorithm for this problem. First, we address the planning problem for the decentralized POMDP represented by the multi-agent system, by introducing a common information approach. We use common beliefs and broadcasting to solve an equivalent centralized POMDP problem. Then, we propose the Distributed Option Critic (DOC) algorithm, motivated by the work of Bacon et al (2017) in the single-agent setting. Our approach uses centralized option evaluation and decentralized intra-option improvement. We analyze theoretically the asymptotic convergence of DOC and validate its performance in grid-world environments, where we implement DOC using a deep neural network. Our experiments show that DOC performs competitively with state-of-the-art algorithms and that it is scalable when the number of agents increases.

agent, dec-pomdp, information, (14 more...)

arXiv.org Artificial Intelligence

1911.12825

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.91)
(2 more...)

Paterson, Colin, Calinescu, Radu

Detection and Mitigation of Rare Subclasses in Neural Network Classifiers

Regions of high-dimensional input spaces that are un-derrepresented in training datasets reduce machine-learnt classifier performance, and may lead to corner cases and unwanted bias for classifiers used in decision making systems. When these regions belong to otherwise well-represented classes, their presence and negative impact are very hard to identify. We propose an approach for the detection and mitigation of such rare subclasses in neural network classifiers. The new approach is underpinned by an easy-to-compute commonality metric that supports the detection of rare subclasses, and comprises methods for reducing their impact during both model training and model exploitation.

commonality score, rare subclass, training data, (14 more...)

1911.1278

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.60)

Continuous Dropout

Shen, Xu, Tian, Xinmei, Liu, Tongliang, Xu, Fang, Tao, Dacheng

Dropout has been proven to be an effective algorithm for training robust deep networks because of its ability to prevent overfitting by avoiding the co-adaptation of feature detectors. Current explanations of dropout include bagging, naive Bayes, regularization, and sex in evolution. According to the activation patterns of neurons in the human brain, when faced with different situations, the firing rates of neurons are random and continuous, not binary as current dropout does. Inspired by this phenomenon, we extend the traditional binary dropout to continuous dropout. On the one hand, continuous dropout is considerably closer to the activation characteristics of neurons in the human brain than traditional binary dropout. On the other hand, we demonstrate that continuous dropout has the property of avoiding the co-adaptation of feature detectors, which suggests that we can extract more independent feature detectors for model averaging in the test stage. We introduce the proposed continuous dropout to a feedforward neural network and comprehensively compare it with binary dropout, adaptive dropout, and DropConnect on MNIST, CIFAR-10, SVHN, NORB, and ILSVRC-12. Thorough experiments demonstrate that our method performs better in preventing the co-adaptation of feature detectors and improves test performance. The code is available at: https://github.com/jasonustc/caffe-multigpu/tree/dropout.

continuous dropout, dropout, gaussian dropout, (12 more...)

1911.12675

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China (0.04)
Oceania > Australia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Vinayak, Ramya Korlakai, Kong, Weihao, Kakade, Sham M.

Optimal Estimation of Change in a Population of Parameters

Paired estimation of change in parameters of interest over a population plays a central role in several application domains including those in the social sciences, epidemiology, medicine and biology. In these domains, the size of the population under study is often very large, however, the number of observations available per individual in the population is very small (\emph{sparse observations}) which makes the problem challenging. Consider the setting with $N$ independent individuals, each with unknown parameters $(p_i, q_i)$ drawn from some unknown distribution on $[0, 1]^2$. We observe $X_i \sim \text{Bin}(t, p_i)$ before an event and $Y_i \sim \text{Bin}(t, q_i)$ after the event. Provided these paired observations, $\{(X_i, Y_i) \}_{i=1}^N$, our goal is to accurately estimate the \emph{distribution of the change in parameters}, $\delta_i := q_i - p_i$, over the population and properties of interest like the \emph{$\ell_1$-magnitude of the change} with sparse observations ($t\ll N$). We provide \emph{information theoretic lower bounds} on the error in estimating the distribution of change and the $\ell_1$-magnitude of change. Furthermore, we show that the following two step procedure achieves the optimal error bounds: first, estimate the full joint distribution of the paired parameters using the maximum likelihood estimator (MLE) and then estimate the distribution of change and the $\ell_1$-magnitude of change using the joint MLE. Notably, and perhaps surprisingly, these error bounds are of the same order as the minimax optimal error bounds for learning the \emph{full} joint distribution itself (in Wasserstein-1 distance); in other words, estimating the magnitude of the change of parameters over the population is, in a minimax sense, as difficult as estimating the full joint distribution itself.

estimator, joint distribution, polynomial, (16 more...)

1911.12568

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Pfister, Niklas, Bauer, Stefan, Peters, Jonas

Learning stable and predictive structures in kinetic systems: Benefits of a causal approach

Learning kinetic systems from data is one of the core challenges in many fields. Identifying stable models is essential for the generalization capabilities of data-driven inference. We introduce a computationally efficient framework, called CausalKinetiX, that identifies structure from discrete time, noisy observations, generated from heterogeneous experiments. The algorithm assumes the existence of an underlying, invariant kinetic model, a key criterion for reproducible research. Results on both simulated and real-world examples suggest that learning the structure of kinetic systems benefits from a causal perspective. The identified variables and models allow for a concise description of the dynamics across multiple experimental settings and can be used for prediction in unseen experiments. We observe significant improvements compared to well established approaches focusing solely on predictive performance, especially for out-of-sample generalization.

causalkinetix, experiment, procedure, (15 more...)

doi: 10.1073/pnas.1905688116

1810.11776

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Oregon > Benton County > Corvallis (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(7 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.45)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Modeling & Simulation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

#artificialintelligenceNov-27-2019, 09:26:53 GMT

Data Science and Machine Learning – MITU Skillologies

After every weekend session you'll need to solve assignments until next week.

artificial intelligence, data mining, machine learning, (13 more...)

#artificialintelligence

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.32)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.32)

Sandler, Adam, Klabjan, Diego, Luo, Yuan

Conditional Hierarchical Bayesian Tucker Decomposition

arXiv.org Machine LearningNov-27-2019

Our research focuses on studying and developing methods for reducing the dimensionality of large datasets, common in biomedical applications. A major problem when learning information about patients based on genetic sequencing data is that there are often more feature variables (genetic data) than observations (patients). This makes direct supervised learning difficult. One way of reducing the feature space is to use latent Dirichlet allocation in order to group genetic variants in an unsupervised manner. Latent Dirichlet allocation is a common model in natural language processing, which describes a document as a mixture of topics, each with a probability of generating certain words. This can be generalized as a Bayesian tensor decomposition to account for multiple feature variables. While we made some progress improving and modifying these methods, our significant contributions are with hierarchical topic modeling. We developed distinct methods of incorporating hierarchical topic modeling, based on nested Chinese restaurant processes and Pachinko Allocation Machine, into Bayesian tensor decompositions. We apply these models to predict whether or not patients have autism spectrum disorder based on genetic sequencing data. We examine a dataset from National Database for Autism Research consisting of paired siblings -- one with autism, and the other without -- and counts of their genetic variants. Additionally, we linked the genes with their Reactome biological pathways. We combine this information into a tensor of patients, counts of their genetic variants, and the membership of these genes in pathways. Once we decompose this tensor, we use logistic regression on the reduced features in order to predict if patients have autism. We also perform a similar analysis of a dataset of patients with one of four common types of cancer (breast, lung, prostate, and colorectal).

decomposition, probability, topic model, (17 more...)

1911.12426

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Eshragh, Ali, Roosta, Fred, Nazari, Asef, Mahoney, Michael W.

LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data

arXiv.org Machine LearningNov-27-2019

We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the accuracy of approximations lies within $(1+\mathcal{O}(\varepsilon))$ of the true leverage scores with high probability. These theoretical results are subsequently exploited to develop an efficient algorithm, called LSAR, for fitting an appropriate AR model to big time series data. Our proposed algorithm is guaranteed, with high probability, to find the maximum likelihood estimates of the parameters of the underlying true AR model and has a worst case running time that significantly improves those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale synthetic as well as real data highly support the theoretical results and reveal the efficacy of this new approach. To the best of our knowledge, this paper is the first attempt to establish a nexus between RandNLA and big time series data analysis.

algorithm, leverage score, matrix, (17 more...)

1911.12321

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Oceania > Australia > Queensland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)