AITopics

1703.10121

Country:

Europe > United Kingdom (0.29)
North America > Canada > Quebec (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)

Pan, Indranil, Bester, Dirk

Marginal likelihood based model comparison in Fuzzy Bayesian Learning

arXiv.org Machine LearningMar-29-2017

RADITIONAL rule based fuzzy systems encode expert opinion in the form of IF-THEN rules and optimise some performance metric (typically mean squared error in predicting a data-set) to obtain the hyper-parameters of the model (like the numeric values defining the shape of the membership functions etc.) [2]-[4]. The rule base is one of the core elements driving the model and slight change in the rule base would significantly affect the performance of the fuzzy inference system. Many automated methods have been proposed where the rule base structure and parameters can be automatically determined [5]-[7]. However interpretability of such models is an issue and various methods have been proposed to alleviate the issue [8]. In the present paper however, we are interested in the actual metric for comparison between different models having different rule bases derived from expert opinion. The comparison metric can nevertheless be embedded within an automated framework to evolve the best model if so required. The most straight forward way of comparing the fuzzy rule bases is to optimise the model parameters based on the prediction error (e.g.

artificial intelligence, machine learning, rule base, (19 more...)

1703.09956

Country: Europe > United Kingdom > England (0.46)

Genre: Research Report (0.40)

Industry: Banking & Finance (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
(2 more...)

Kandasamy, Kirthevasan, Dasarathy, Gautam, Oliva, Junier B., Schneider, Jeff, Poczos, Barnabas

Multi-fidelity Gaussian Process Bandit Optimisation

arXiv.org Artificial IntelligenceMar-28-2017

In many scientific and engineering applications, we are tasked with the optimisation of an expensive to evaluate black box function $f$. Traditional settings for this problem assume just the availability of this single function. However, in many cases, cheap approximations to $f$ may be obtainable. For example, the expensive real world behaviour of a robot can be approximated by a cheap computer simulation. We can use these approximations to eliminate low function value regions cheaply and use the expensive evaluations of $f$ in a small but promising region and speedily identify the optimum. We formalise this task as a \emph{multi-fidelity} bandit problem where the target function and its approximations are sampled from a Gaussian process. We develop MF-GP-UCB, a novel method based on upper confidence bound techniques. In our theoretical analysis we demonstrate that it exhibits precisely the above behaviour, and achieves better regret than strategies which ignore multi-fidelity information. Empirically, MF-GP-UCB outperforms such naive strategies and other multi-fidelity methods on several synthetic and real experiments.

data mining, fidelity, machine learning, (20 more...)

arXiv.org Artificial Intelligence

1603.06288

Country: North America > United States (0.67)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.58)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Data Science > Data Mining > Big Data (0.66)
Information Technology > Artificial Intelligence > Robots (0.66)
(2 more...)

Svensson, Andreas, Schön, Thomas B.

A flexible state space model for learning nonlinear dynamical systems

We consider a nonlinear state-space model with the state transition and observation functions expressed as basis function expansions. The coefficients in the basis function expansions are learned from data. Using a connection to Gaussian processes we also develop priors on the coefficients, for tuning the model flexibility and to prevent overfitting to data, akin to a Gaussian process state-space model. The priors can alternatively be seen as a regularization, and helps the model in generalizing the data without sacrificing the richness offered by the basis function expansion. To learn the coefficients and other unknown parameters efficiently, we tailor an algorithm using state-of-the-art sequential Monte Carlo methods, which comes with theoretical guarantees on the learning. Our approach indicates promising results when evaluated on a classical benchmark as well as real data.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1603.05486

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.67)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Particle Filtering for PLCA model with Application to Music Transcription

Cazau, D., Revillon, G., Yuancheng, W., Adam, O.

Automatic Music Transcription (AMT) consists in automatically estimating the notes in an audio recording, through three attributes: onset time, duration and pitch. Probabilistic Latent Component Analysis (PLCA) has become very popular for this task. PLCA is a spectrogram factorization method, able to model a magnitude spectrogram as a linear combination of spectral vectors from a dictionary. Such methods use the Expectation-Maximization (EM) algorithm to estimate the parameters of the acoustic model. This algorithm presents well-known inherent defaults (local convergence, initialization dependency), making EM-based systems limited in their applications to AMT, particularly in regards to the mathematical form and number of priors. To overcome such limits, we propose in this paper to employ a different estimation framework based on Particle Filtering (PF), which consists in sampling the posterior distribution over larger parameter ranges. This framework proves to be more robust in parameter estimation, more flexible and unifying in the integration of prior knowledge in the system. Note-level transcription accuracies of 61.8 $\%$ and 59.5 $\%$ were achieved on evaluation sound datasets of two different instrument repertoires, including the classical piano (from MAPS dataset) and the marovany zither, and direct comparisons to previous PLCA-based approaches are provided. Steps for further development are also outlined.

algorithm, artificial intelligence, machine learning, (18 more...)

1703.09772

Country:

Europe (1.00)
North America > United States (0.68)

Genre: Research Report (0.64)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Unifying the Stochastic Spectral Descent for Restricted Boltzmann Machines with Bernoulli or Gaussian Inputs

Fan, Kai

Stochastic gradient descent based algorithms are typically used as the general optimization tools for most deep learning models. A Restricted Boltzmann Machine (RBM) is a probabilistic generative model that can be stacked to construct deep architectures. For RBM with Bernoulli inputs, non-Euclidean algorithm such as stochastic spectral descent (SSD) has been specifically designed to speed up the convergence with improved use of the gradient estimation by sampling methods. However, the existing algorithm and corresponding theoretical justification depend on the assumption that the possible configurations of inputs are finite, like binary variables. The purpose of this paper is to generalize SSD for Gaussian RBM being capable of mod- eling continuous data, regardless of the previous assumption. We propose the gradient descent methods in non-Euclidean space of parameters, via de- riving the upper bounds of logarithmic partition function for RBMs based on Schatten-infinity norm. We empirically show that the advantage and improvement of SSD over stochastic gradient descent (SGD).

algorithm, artificial intelligence, machine learning, (19 more...)

1703.09766

Genre: Research Report (0.40)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)

Yasuda, Muneki, Kataoka, Shun

Solving Non-parametric Inverse Problem in Continuous Markov Random Field using Loopy Belief Propagation

In this paper, we address the inverse problem, or the statistical machine learning problem, in Markov random fields with a non-parametric pair-wise energy function with continuous variables. The inverse problem is formulated by maximum likelihood estimation. The exact treatment of maximum likelihood estimation is intractable because of two problems: (1) it includes the evaluation of the partition function and (2) it is formulated in the form of functional optimization. We avoid Problem (1) by using Bethe approximation. Bethe approximation is an approximation technique equivalent to the loopy belief propagation. Problem (2) can be solved by using orthonormal function expansion. Orthonormal function expansion can reduce a functional optimization problem to a function optimization problem. Our method can provide an analytic form of the solution of the inverse problem within the framework of Bethe approximation.

approximation, artificial intelligence, machine learning, (13 more...)

doi: 10.7566/JPSJ.86.084806

1703.09397

Genre: Research Report (0.64)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.86)

Abeille, Marc, Lazaric, Alessandro

Thompson Sampling for Linear-Quadratic Control Problems

arXiv.org Machine LearningMar-27-2017

We consider the exploration-exploitation tradeoff in linear quadratic (LQ) control problems, where the state dynamics is linear and the cost function is quadratic in states and controls. We analyze the regret of Thompson sampling (TS) (a.k.a. posterior-sampling for reinforcement learning) in the frequentist setting, i.e., when the parameters characterizing the LQ dynamics are fixed. Despite the empirical and theoretical success in a wide range of problems from multi-armed bandit to linear bandit, we show that when studying the frequentist regret TS in control problems, we need to trade-off the frequency of sampling optimistic parameters and the frequency of switches in the control policy. This results in an overall regret of $O(T^{2/3})$, which is significantly worse than the regret $O(\sqrt{T})$ achieved by the optimism-in-face-of-uncertainty algorithm in LQ control problems.

big data, inequality, upstream oil & gas, (18 more...)

1703.08972

Country:

Europe (0.28)
North America > United States > New York (0.14)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Information Technology > Data Science > Data Mining > Big Data (0.34)

Cheng, Li-Fang, Darnell, Gregory, Chivers, Corey, Draugelis, Michael E, Li, Kai, Engelhardt, Barbara E

Sparse Multi-Output Gaussian Processes for Medical Time Series Prediction

arXiv.org Machine LearningMar-27-2017

In real-time monitoring of hospital patients, high-quality inference of patients' health status using all information available from clinical covariates and lab tests are essential to enable successful medical interventions and improve patient outcomes. In this work, we develop and explore a Bayesian nonparametric model based on Gaussian process (GP) regression for hospital patient monitoring. Our method, MedGP, incorporates 24 clinical and lab covariates and supports a rich reference data set from which the relationships between these observed covariates may be inferred and exploited for high-quality inference of patient state over time. To do this, we develop a highly structured sparse GP kernel to enable tractable computation over tens of thousands of time points while estimating correlations among clinical covariates, patients, and periodicity in high-dimensional time series measurements of physiological signals. We apply MedGP to data from hundreds of thousands of patients treated at the Hospital of the University of Pennsylvania. MedGP has a number of benefits over current methods, including (i) not requiring an alignment of the time series data, (ii) quantifying confidence intervals in the predictions, (iii) exploiting a vast and rich database of patients, and (iv) providing interpretable relationships among clinical covariates. We evaluate and compare results from MedGP on the task of online state prediction for three different patient subgroups. Keywords: Gaussian processes, electronic health records, sparse time series analysis, spectral mixture kernel, kernel density estimation.

artificial intelligence, covariate, machine learning, (20 more...)

1703.09112

Country: North America > United States > Pennsylvania (0.34)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.54)
Health & Medicine > Diagnostic Medicine > Vital Signs (0.49)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.48)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Abeille, Marc, Lazaric, Alessandro

Linear Thompson Sampling Revisited

arXiv.org Machine LearningMar-27-2017

We derive an alternative proof for the regret of Thompson sampling (\ts) in the stochastic linear bandit setting. While we obtain a regret bound of order $\widetilde{O}(d^{3/2}\sqrt{T})$ as in previous results, the proof sheds new light on the functioning of the \ts. We leverage on the structure of the problem to show how the regret is related to the sensitivity (i.e., the gradient) of the objective function and how selecting optimal arms associated to \textit{optimistic} parameters does control it. Thus we show that \ts can be seen as a generic randomized algorithm where the sampling distribution is designed to have a fixed probability of being optimistic, at the cost of an additional $\sqrt{d}$ regret factor compared to a UCB-like approach. Furthermore, we show that our proof can be readily applied to regularized linear optimization and generalized linear model problems.

artificial intelligence, machine learning, probability, (16 more...)

1611.06534

Genre:

Research Report (0.50)
Workflow (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)