AITopics

2007.09114

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)

Wang, Ruixin, Jaiwal, Prateek, Honnappa, Harsha

Estimating Stochastic Poisson Intensities Using Deep Latent Models

arXiv.org Machine Learning, Jul-22-2020

We present a new method for estimating the stochastic intensity of a doubly stochastic Poisson process. Statistical and theoretical analyses of traffic traces show that these processes are appropriate models of high intensity traffic arriving at an array of service systems. The statistical estimation of the underlying latent stochastic intensity process driving the traffic model involves a rather complicated nonlinear filtering problem. We develop a novel simulation method, using deep neural networks to approximate the path measures induced by the stochastic intensity process, for solving this nonlinear filtering problem. Our simulation studies demonstrate that the method is quite accurate on both in-sample estimation and on an out-of-sample performance prediction task for an infinite server queue.
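
To make the term "doubly stochastic Poisson process" concrete, here is a minimal simulation sketch. It is not the estimation method of the paper. The latent AR(1) log-intensity, the piecewise-constant discretisation, and all names and parameter values (`simulate_cox_process`, `phi`, `sigma`, `mean_log_rate`) are illustrative assumptions.

```python
import math
import random


def simulate_cox_process(n_bins, dt=1.0, mean_log_rate=2.0, phi=0.9, sigma=0.3, seed=0):
    """Simulate arrivals of a doubly stochastic Poisson (Cox) process.

    Assumption for illustration only: the log-intensity is an AR(1) process
    that is held constant on each time bin of width dt.
    Returns (intensities, arrival_times).
    """
    rng = random.Random(seed)
    x = 0.0  # deviation of the log-intensity from its mean
    intensities, arrivals = [], []
    for k in range(n_bins):
        # First layer of randomness: the intensity itself is random.
        x = phi * x + sigma * rng.gauss(0.0, 1.0)
        lam = math.exp(mean_log_rate + x)
        intensities.append(lam)
        # Second layer: given the intensity, arrivals inside the bin form a
        # homogeneous Poisson process, i.e. exponential inter-arrival gaps.
        t = k * dt + rng.expovariate(lam)
        while t < (k + 1) * dt:
            arrivals.append(t)
            t += rng.expovariate(lam)
    return intensities, arrivals


if __name__ == "__main__":
    lam, arr = simulate_cox_process(n_bins=20)
    print(f"{len(arr)} arrivals over {len(lam)} bins")
```

The filtering problem described in the abstract runs in the opposite direction: only the arrival times are observed, and the latent intensity path has to be inferred from them.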

artificial intelligence, bayesian inference, machine learning, (18 more...)

2007.06037

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.82)

Industry: Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Agarwal, Alekh, Kakade, Sham, Krishnamurthy, Akshay, Sun, Wen

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

arXiv.org Machine Learning, Jul-22-2020

The ability to learn effective transformations of complex data sources, sometimes called representation learning, is an essential primitive in modern machine learning, leading to remarkable achievements in language modeling, vision, and serving as a partial explanation for the success of deep learning more broadly (Bengio et al., 2013). In Reinforcement Learning (RL), several works have shown empirically that learning succinct representations of perceptual inputs can accelerate the search for decision-making policies (Pathak et al., 2017; Tang et al., 2017; Oord et al., 2018; Srinivas et al., 2020). However, representation learning for RL is far more subtle than it is for supervised learning (Du et al., 2019a; Van Roy and Dong, 2019; Lattimore and Szepesvari, 2019), and the theoretical foundations of representation learning for RL are nascent. The first question that arises in this context is: what is a good representation? Intuitively, a good representation should help us achieve greater sample efficiency on downstream tasks.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2006.10814

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

#artificialintelligence, Jul-21-2020, 20:06:18 GMT

A Bayesian model for a simulated meta-analysis

There are multiple ways to estimate a Stan model in R, but I choose to build the Stan code directly rather than using the brms or rstanarm packages. In the Stan code, we need to define the data structure, specify the parameters, specify any transformed parameters (which are just a function of the parameters), and then build the model – which includes laying out the prior distributions as well as the likelihood. In this case, the model is slightly different from what was presented in the context of a mixed effects model. The key difference is that there are prior distributions on $\Delta$ and $\tau$, introducing an additional level of uncertainty into the estimate. I would expect that the estimate of the overall treatment effect $\Delta$ will have a wider 95% CI (credible interval in this context) than the 95% CI (confidence interval) for $\delta_0$ in the mixed effects model.

bayesian inference, bayesian model, machine learning, (3 more...)

#artificialintelligence

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.58)

Moreira, Catarina, Chou, Yu-Liang, Velmurugan, Mythreyi, Ouyang, Chun, Sindhgatta, Renuka, Bruza, Peter

An Interpretable Probabilistic Approach for Demystifying Black-box Predictive Models

arXiv.org Artificial Intelligence, Jul-21-2020

The use of sophisticated machine learning models for critical decision making faces the challenge that these models are often applied as "black boxes". This has led to an increased interest in interpretable machine learning, where post hoc interpretation presents a useful mechanism for generating interpretations of complex learning models. In this paper, we propose a novel approach underpinned by an extended framework of Bayesian networks for generating post hoc interpretations of a black-box predictive model. The framework supports extracting a Bayesian network as an approximation of the black-box model for a specific prediction. Compared to the existing post hoc interpretation methods, the contribution of our approach is three-fold. Firstly, the extracted Bayesian network, as a probabilistic graphical model, can provide interpretations about not only which input features contributed to a prediction but also why they did. Secondly, for complex decision problems with many features, a Markov blanket can be generated from the extracted Bayesian network to provide interpretations with a focused view on those input features that directly contributed to a prediction. Thirdly, the extracted Bayesian network enables the identification of four different rules which can inform the decision-maker about the confidence level in a prediction, thus helping the decision-maker assess the reliability of predictions learned by a black-box model. We implemented the proposed approach, applied it in the context of two well-known public datasets and analysed the results, which are made available in an open-source repository.
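
The Markov blanket mentioned in the abstract has a simple graph-theoretic definition: a node's parents, its children, and the other parents of its children. The sketch below computes it for a small hand-written network. The network and the function name `markov_blanket` are hypothetical and are not taken from the paper's repository.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of `node` in a Bayesian network.

    `parents` maps every node to the list of its parents in the DAG.
    The blanket consists of the node's parents, its children, and the
    children's other parents (co-parents).
    """
    blanket = set(parents.get(node, []))
    for child, child_parents in parents.items():
        if node in child_parents:
            blanket.add(child)
            blanket.update(child_parents)  # co-parents of the shared child
    blanket.discard(node)
    return blanket


# Hypothetical network, for illustration only.
network = {
    "age": [],
    "income": ["age"],
    "debt": [],
    "default": ["income", "debt"],
    "alert": ["default"],
}

if __name__ == "__main__":
    print(sorted(markov_blanket(network, "income")))
```

Conditioned on its Markov blanket, a node is independent of every other node in the network, which is why the blanket of the prediction node gives a focused view of the relevant features.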

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

2007.10668

Country:

North America > United States > Wisconsin (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > New York (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

What is important about the No Free Lunch theorems?

Wolpert, David H.

The No Free Lunch theorems prove that under a uniform distribution over induction problems (search problems or learning problems), all induction algorithms perform equally. As I discuss in this chapter, the importance of the theorems arises by using them to analyze scenarios involving non-uniform distributions, and to compare different algorithms, without any assumption about the distribution over problems at all. In particular, the theorems prove that anti-cross-validation (choosing among a set of candidate algorithms based on which has the worst out-of-sample behavior) performs as well as cross-validation, unless one makes an assumption -- which has never been formalized -- about how the distribution over induction problems, on the one hand, is related to the set of algorithms one is choosing among using (anti-)cross validation, on the other. In addition, they establish strong caveats concerning the significance of the many results in the literature which establish the strength of a particular algorithm without assuming a particular distribution. They also motivate a "dictionary" between supervised learning and blackbox optimization, which allows one to "translate" techniques from supervised learning into the domain of blackbox optimization, thereby strengthening blackbox optimization algorithms. In addition to these topics, I also briefly discuss their implications for philosophy of science.
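
The uniform-distribution claim can be checked by brute force on a tiny search problem. The sketch below enumerates every function from a 4-point domain to 3 values and compares two non-revisiting search strategies by the best value found after `m` evaluations. The strategies, the performance measure, and all names are illustrative assumptions, not constructions from the chapter.

```python
from itertools import product


def run_search(choose_next, f, m):
    """Evaluate m distinct points picked by `choose_next`; return the best value seen."""
    history = []  # (point, value) pairs visited so far
    for _ in range(m):
        x = choose_next(history, len(f))
        history.append((x, f[x]))
    return max(value for _, value in history)


def left_to_right(history, n):
    """Fixed strategy: always take the lowest unvisited point."""
    visited = {x for x, _ in history}
    return min(x for x in range(n) if x not in visited)


def adaptive(history, n):
    """Adaptive strategy: jump to the far end after seeing a value >= 1."""
    visited = {x for x, _ in history}
    unvisited = [x for x in range(n) if x not in visited]
    if history and history[-1][1] >= 1:
        return max(unvisited)
    return min(unvisited)


def average_performance(choose_next, n_points=4, n_values=3, m=2):
    """Average best-value-found over ALL functions, i.e. a uniform prior."""
    functions = list(product(range(n_values), repeat=n_points))
    total = sum(run_search(choose_next, f, m) for f in functions)
    return total / len(functions)


if __name__ == "__main__":
    print(average_performance(left_to_right), average_performance(adaptive))
```

Both averages coincide for every budget `m`, because under a uniform prior the value at an unvisited point is independent of everything observed so far. Restricting `functions` to a non-uniform subset is what lets the two strategies differ.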

algorithm, search algorithm, theorem, (13 more...)

2007.10928

Country:

North America > United States > New Mexico (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Poland (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.80)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Abraham, Louis, Becigneul, Gary, Coleman, Benjamin, Scholkopf, Bernhard, Shrivastava, Anshumali, Smola, Alexander

Bloom Origami Assays: Practical Group Testing

We study the problem usually referred to as group testing in the context of COVID-19. Given n samples collected from patients, how should we select and test mixtures of samples to maximize information and minimize the number of tests? Group testing is a well-studied problem with several appealing solutions, but recent biological studies impose practical constraints for COVID-19 that are incompatible with traditional methods. Furthermore, existing methods use unnecessarily restrictive solutions, which were devised for settings with more memory and compute constraints than the problem at hand. This results in poor utility. In the new setting, we obtain strong solutions for small values of n using evolutionary strategies. We then develop a new method combining Bloom filters with belief propagation to scale to larger values of n (more than 100) with good empirical results. We also present a more accurate decoding algorithm that is tailored for specific COVID-19 settings. This work demonstrates the practical gap between dedicated algorithms and well-known generic solutions. Our efforts result in a new and practical multiplex method yielding strong empirical performance without mixing more than a chosen number of patients into the same probe. Finally, we briefly discuss adaptive methods, casting them into the framework of adaptive sub-modularity.
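
As a rough illustration of the Bloom-filter view of group testing, the sketch below assigns each sample to a few pools, marks a pool positive if it contains any infected sample, and keeps as candidates only the samples whose pools are all positive. This is a generic noise-free textbook decoder, not the paper's belief-propagation method, and every name and parameter here is an assumption.

```python
import random


def assign_pools(n_samples, n_pools, pools_per_sample, seed=0):
    """Assign each sample to `pools_per_sample` distinct pools, like Bloom-filter hashes."""
    rng = random.Random(seed)
    return [rng.sample(range(n_pools), pools_per_sample) for _ in range(n_samples)]


def run_tests(assignment, infected, n_pools):
    """A pool tests positive if at least one infected sample was mixed into it."""
    results = [False] * n_pools
    for sample, pools in enumerate(assignment):
        if sample in infected:
            for pool in pools:
                results[pool] = True
    return results


def decode(assignment, results):
    """Keep samples whose pools are all positive.

    Any sample that appears in a negative pool is certainly healthy; the
    remaining candidates may still contain false positives.
    """
    return {s for s, pools in enumerate(assignment) if all(results[p] for p in pools)}


if __name__ == "__main__":
    assignment = assign_pools(n_samples=30, n_pools=12, pools_per_sample=3)
    results = run_tests(assignment, infected={4, 17}, n_pools=12)
    print(sorted(decode(assignment, results)))
```

With noise-free tests this decoder never misses an infected sample, but it can flag healthy ones, which is one reason a probabilistic decoder such as belief propagation is attractive.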

artificial intelligence, bayesian inference, machine learning, (20 more...)

2008.02641

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Leroy, Arthur, Latouche, Pierre, Guedj, Benjamin, Gey, Servane

MAGMA: Inference and Prediction with Multi-Task Gaussian Processes

We investigate the problem of multiple time series forecasting, with the objective to improve multiple-step-ahead predictions. We propose a multi-task Gaussian process framework to simultaneously model batches of individuals with a common mean function and a specific covariance structure. This common mean is defined as a Gaussian process for which the hyper-posterior distribution is tractable. Therefore an EM algorithm can be derived for simultaneous hyper-parameter optimisation and hyper-posterior computation. Unlike previous approaches in the literature, we account for uncertainty and handle uncommon grids of observations while maintaining explicit formulations, by modelling the mean process in a non-parametric probabilistic framework. We also provide predictive formulas integrating this common mean process. This approach greatly improves the predictive performance far from observations, where information shared across individuals provides a relevant prior mean. Our overall algorithm is called Magma (standing for Multi tAsk Gaussian processes with common MeAn), and is publicly available as an R package. The quality of the mean process estimation, predictive performances, and comparisons to alternatives are assessed in various simulated scenarios and on real datasets.

artificial intelligence, data mining, machine learning, (19 more...)

2007.10731

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Messenger, Daniel A., Bortz, David M.

Weak SINDy For Partial Differential Equations

We extend the WSINDy (Weak SINDy) method of sparse recovery introduced previously by the authors (arXiv:2005.04339) to the setting of partial differential equations (PDEs). As in the case of ODE discovery, the weak form replaces pointwise approximation of derivatives with local integrations against test functions and achieves effective machine-precision recovery of weights from noise-free data (i.e. below the tolerance of the simulation scheme) as well as natural robustness to noise without the use of noise filtering. The resulting WSINDy_PDE algorithm uses separable test functions implemented efficiently via convolutions for discovery of PDE models with computational complexity $O(NM)$ from $M = N^{D+1}$ data points, i.e. $N$ points in each of $D+1$ dimensions. We demonstrate on several notoriously challenging PDEs the speed and accuracy with which WSINDy_PDE recovers the correct models from datasets with surprisingly large levels of noise (often much greater than 10%).
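
The central idea of the weak form, replacing a pointwise derivative by an integral against a test function, can be shown in one dimension. For a test function that vanishes at both ends of the interval, integration by parts gives ∫ φ u' dt = -∫ φ' u dt, so the derivative of the data never has to be computed. The sketch below checks this numerically. The test function, the grid, and all names are illustrative choices, not those of WSINDy_PDE.

```python
import math


def trapezoid(values, h):
    """Composite trapezoid rule on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def phi(t):
    """Test function on [-1, 1] that vanishes at both endpoints."""
    return (1 - t * t) ** 2


def dphi(t):
    """Derivative of the test function."""
    return -4 * t * (1 - t * t)


def weak_derivative_integral(u_samples, ts):
    """Approximate the integral of phi * u' using only samples of u.

    Integration by parts moves the derivative onto the test function:
    integral(phi * u') = -integral(phi' * u), since phi(-1) = phi(1) = 0.
    """
    h = ts[1] - ts[0]
    return -trapezoid([dphi(t) * u for t, u in zip(ts, u_samples)], h)


if __name__ == "__main__":
    ts = [-1 + 2 * i / 2000 for i in range(2001)]
    u = [math.sin(t) for t in ts]  # data; the true derivative is cos(t)
    direct = trapezoid([phi(t) * math.cos(t) for t in ts], ts[1] - ts[0])
    print(weak_derivative_integral(u, ts), direct)
```

Because the data enter only through a weighted integral, zero-mean noise tends to average out rather than being amplified by differencing, which is consistent with the robustness the abstract reports.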

artificial intelligence, machine learning, wsindy pde, (18 more...)

2007.02848

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

#artificialintelligence, Jul-20-2020, 13:06:42 GMT

Bayesian Inference: The Maximum Entropy Principle

In this article, I will explain what the maximum entropy principle is, how to apply it and why it's useful in the context of Bayesian inference. The code to reproduce the results and figures can be found in this notebook. The maximum entropy principle is a method for creating the probability distribution that is most consistent with a given set of assumptions and nothing more. The rest of the article will explain what this means. First, we need a way to measure the uncertainty in a probability distribution.
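
The standard measure of that uncertainty is Shannon entropy. The short sketch below, which is not taken from the article's notebook, computes it for three distributions over four outcomes. With no constraints beyond normalisation, the uniform distribution has the largest entropy. The function name and the example distributions are assumptions.

```python
import math


def entropy(p):
    """Shannon entropy in nats; outcomes with zero probability contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0)


uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]
certain = [1.0, 0.0, 0.0, 0.0]

if __name__ == "__main__":
    for name, dist in [("uniform", uniform), ("skewed", skewed), ("certain", certain)]:
        print(name, entropy(dist))
```

Adding constraints, such as a fixed mean, shrinks the set of admissible distributions, and the maximum entropy principle then selects the most uncertain member of that set.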

artificial intelligence, bayesian inference, machine learning, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.63)