AITopics

1705.10524

Country:

Europe (0.92)
North America > United States (0.92)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Owhadi, Houman, Scovel, Clint

Universal Scalable Robust Solvers from Computational Information Games and fast eigenspace adapted Multiresolution Analysis

We show how the discovery of robust scalable numerical solvers for arbitrary bounded linear operators can be automated as a Game Theory problem by reformulating the process of computing with partial information and limited resources as that of playing underlying hierarchies of adversarial information games. When the solution space is a Banach space $B$ endowed with a quadratic norm $\|\cdot\|$, the optimal measure (mixed strategy) for such games (e.g. the adversarial recovery of $u\in B$, given partial measurements $[\phi_i, u]$ with $\phi_i\in B^*$, using relative error in $\|\cdot\|$-norm as a loss) is a centered Gaussian field $\xi$ solely determined by the norm $\|\cdot\|$, whose conditioning (on measurements) produces optimal bets. When measurements are hierarchical, the process of conditioning this Gaussian field produces a hierarchy of elementary bets (gamblets). These gamblets generalize the notion of Wavelets and Wannier functions in the sense that they are adapted to the norm $\|\cdot\|$ and induce a multi-resolution decomposition of $B$ that is adapted to the eigensubspaces of the operator defining the norm $\|\cdot\|$. When the operator is localized, we show that the resulting gamblets are localized both in space and frequency and introduce the Fast Gamblet Transform (FGT) with rigorous accuracy and (near-linear) complexity estimates. As the FFT can be used to solve and diagonalize arbitrary PDEs with constant coefficients, the FGT can be used to decompose a wide range of continuous linear operators (including arbitrary continuous linear bijections from $H^s_0$ to $H^{-s}$ or to $L^2$) into a sequence of independent linear systems with uniformly bounded condition numbers and leads to $\mathcal{O}(N \operatorname{polylog} N)$ solvers and eigenspace adapted Multiresolution Analysis (resulting in near linear complexity approximation of all eigensubspaces).

game theory, operator, upstream oil & gas, (21 more...)

1703.10761

Country:

North America > United States > California (0.27)
North America > United States > Massachusetts (0.27)
North America > United States > Wisconsin (0.13)
(5 more...)

Genre:

Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Energy > Oil & Gas > Upstream (0.67)
Leisure & Entertainment > Games (0.48)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(4 more...)

Differentially Private Bayesian Learning on Distributed Data

Heikkilä, Mikko, Lagerspetz, Eemil, Kaski, Samuel, Shimizu, Kana, Tarkoma, Sasu, Honkela, Antti

Many applications of machine learning, for example in health care, would benefit from methods that can guarantee privacy of data subjects. Differential privacy (DP) has become established as a standard for protecting learning results. The standard DP algorithms require a single trusted party to have access to the entire data, which is a clear weakness. We consider DP Bayesian learning in a distributed setting, where each party only holds a single sample or a few samples of the data. We propose a learning strategy based on a secure multi-party sum function for aggregating summaries from data holders and the Gaussian mechanism for DP. Our method builds on an asymptotically optimal and practically efficient DP Bayesian inference with rapidly diminishing extra cost.

artificial intelligence, machine learning, mechanism, (16 more...)

1703.01106

Country: North America > United States (0.68)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Ghosh, Soumya, Doshi-Velez, Finale

Model Selection in Bayesian Neural Networks via Horseshoe Priors

Bayesian Neural Networks (BNNs) have recently received increasing attention for their ability to provide well-calibrated posterior uncertainties. However, model selection---even choosing the number of nodes---remains an open question. In this work, we apply a horseshoe prior over node pre-activations of a Bayesian neural network, which effectively turns off nodes that do not help explain the data. We demonstrate that our prior prevents the BNN from under-fitting even when the number of nodes required is grossly over-estimated. Moreover, this model selection over the number of nodes doesn't come at the expense of predictive or computational performance; in fact, we learn smaller networks with comparable predictive performance to current approaches.

artificial intelligence, machine learning, neural network, (16 more...)

1705.10388

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Fast learning rate of deep learning via a kernel perspective

Suzuki, Taiji

We develop a new theoretical framework to analyze the generalization error of deep learning, and derive a new fast learning rate for two representative algorithms: empirical risk minimization and Bayesian deep learning. The series of theoretical analyses of deep learning has revealed its high expressive power and universal approximation capability. Although these analyses are highly nonparametric, existing generalization error analyses have been developed mainly in a fixed dimensional parametric model. To compensate this gap, we develop an infinite dimensional model that is based on an integral form as performed in the analysis of the universal approximation capability. This allows us to define a reproducing kernel Hilbert space corresponding to each layer. Our point of view is to deal with the ordinary finite dimensional deep neural network as a finite approximation of the infinite dimensional one. The approximation error is evaluated by the degree of freedom of the reproducing kernel Hilbert space in each layer. To estimate a good finite dimensional model, we consider both of empirical risk minimization and Bayesian deep learning. We derive its generalization error bound and it is shown that there appears bias-variance trade-off in terms of the number of parameters of the finite dimensional approximation. We show that the optimal width of the internal layers can be determined through the degree of freedom and the convergence rate can be faster than $O(1/\sqrt{n})$ rate which has been shown in the existing studies.

artificial intelligence, machine learning, neural network, (16 more...)

1705.10182

Country: Asia > Japan (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

#artificialintelligenceMay-28-2017, 14:31:27 GMT

Essentials of Machine Learning Algorithms (with Python and R Codes)

KNN can easily be mapped to our real lives. If you want to learn about a person, of whom you have no information, you might like to find out about his close friends and the circles he moves in and gain access to his/her information!

algorithm, artificial intelligence, machine learning, (15 more...)

#artificialintelligence

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)

@machinelearnbotMay-27-2017, 14:21:50 GMT

Need for DYNAMICAL Machine Learning: Bayesian exact recursive estimation

In my recent blog, Marrying Kalman Filtering & Machine Learning, we saw the merger of Bayesian exact recursive estimation (algorithm for which is Kalman Filter/Smoother in the linear, Gaussian case) and Machine Learning. We developed a solution called Kernel Projection Kalman Filter for business applications that require static or dynamical, dynamical or time-varying dynamical, linear or non-linear Machine Learning, i.e., pretty much all applications - therefore, Kernel Projection Kalman Filter is a "universal" solution . . . But who needs anything more than STATIC Machine Learning (ML)? Indeed, university courses in ML largely teach static ML. Given a set of inputs and outputs, find a static map between the two during supervised "Training" and use this static map for business purposes during "Operation" (which is called "Testing" during pre-operation evaluation).

artificial intelligence, learning, machine learning, (10 more...)

@machinelearnbot

Industry: Health & Medicine > Therapeutic Area (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Motzek, Alexander, Möller, Ralf

Indirect Causes in Dynamic Bayesian Networks Revisited

Journal of Artificial Intelligence ResearchMay-27-2017

Modeling causal dependencies often demands cycles at a coarse-grained temporal scale. If Bayesian networks are to be used for modeling uncertainties, cycles are eliminated with dynamic Bayesian networks, spreading indirect dependencies over time and enforcing an infinitesimal resolution of time. Without a ``causal design,'' i.e., without anticipating indirect influences appropriately in time, we argue that such networks return spurious results. By identifying activator random variables, we propose activator dynamic Bayesian networks (ADBNs) which are able to rapidly adapt to contexts under a causal use of time, anticipating indirect influences on a solid mathematical basis using familiar Bayesian network semantics. ADBNs are well-defined dynamic probabilistic graphical models allowing one to model cyclic dependencies from local and causal perspectives while preserving a classical, familiar calculus and classically known algorithms, without introducing any overhead in modeling or inference.

adbn, bayesian network, random variable, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.5361

AI Access Foundation

11061

Journal of Artificial Intelligence Research

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(17 more...)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sriperumbudur, Bharath, Fukumizu, Kenji, Gretton, Arthur, Hyvärinen, Aapo, Kumar, Revant

Density Estimation in Infinite Dimensional Exponential Families

arXiv.org Machine LearningMay-26-2017

In this paper, we consider an infinite dimensional exponential family, $\mathcal{P}$ of probability densities, which are parametrized by functions in a reproducing kernel Hilbert space, $H$ and show it to be quite rich in the sense that a broad class of densities on $\mathbb{R}^d$ can be approximated arbitrarily well in Kullback-Leibler (KL) divergence by elements in $\mathcal{P}$. The main goal of the paper is to estimate an unknown density, $p_0$ through an element in $\mathcal{P}$. Standard techniques like maximum likelihood estimation (MLE) or pseudo MLE (based on the method of sieves), which are based on minimizing the KL divergence between $p_0$ and $\mathcal{P}$, do not yield practically useful estimators because of their inability to efficiently handle the log-partition function. Instead, we propose an estimator, $\hat{p}_n$ based on minimizing the \emph{Fisher divergence}, $J(p_0\Vert p)$ between $p_0$ and $p\in \mathcal{P}$, which involves solving a simple finite-dimensional linear system. When $p_0\in\mathcal{P}$, we show that the proposed estimator is consistent, and provide a convergence rate of $n^{-\min\left\{\frac{2}{3},\frac{2\beta+1}{2\beta+2}\right\}}$ in Fisher divergence under the smoothness assumption that $\log p_0\in\mathcal{R}(C^\beta)$ for some $\beta\ge 0$, where $C$ is a certain Hilbert-Schmidt operator on $H$ and $\mathcal{R}(C^\beta)$ denotes the image of $C^\beta$. We also investigate the misspecified case of $p_0\notin\mathcal{P}$ and show that $J(p_0\Vert\hat{p}_n)\rightarrow \inf_{p\in\mathcal{P}}J(p_0\Vert p)$ as $n\rightarrow\infty$, and provide a rate for this convergence under a similar smoothness condition as above. Through numerical simulations we demonstrate that the proposed estimator outperforms the non-parametric kernel density estimator, and that the advantage with the proposed estimator grows as $d$ increases.

artificial intelligence, estimator, machine learning, (19 more...)

1312.3516

Country:

North America > United States (1.00)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

@machinelearnbotMay-25-2017, 18:00:09 GMT

Everything that Works Works Because it's Bayesian: Why Deep Nets Generalize?

The Bayesian community should really start going to ICLR. They really should have started going years ago. For too long we Bayesians have, quite arrogantly, dismissed deep neural networks as unprincipled, dumb black boxes that lack elegance. We said that highly over-parametrised models fitted via maximum likelihood can't possibly work, they will overfit, won't generalise, etc. We touted our Bayesian nonparametric models instead: Chinese restaurants, Indian buffets, Gaussian processes. And, when things started looking really dire for us Bayesians, we even formed an alliance with kernel people, who used to be our mortal enemies just years before because they like convex optimisation.

bayesian, minima, neural network, (12 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)