AITopics

2009.01185

Country: North America > United States > Connecticut > Tolland County > Storrs (0.14)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Hierarchical Marketing Mix Models with Sign Constraints

Chen, Hao, Zhang, Minguang, Han, Lanshan, Lim, Alvin

Marketing mix models (MMMs) are statistical models for measuring the effectiveness of various marketing activities such as promotion, media advertisement, etc. In this research, we propose a comprehensive marketing mix model that captures the hierarchical structure and the carryover, shape and scale effects of certain marketing activities, as well as sign restrictions on certain coefficients that are consistent with common business sense. In contrast to commonly adopted approaches in practice, which estimate parameters in a multi-stage process, the proposed approach estimates all the unknown parameters/coefficients simultaneously using a constrained maximum likelihood approach and solved with the Hamiltonian Monte Carlo algorithm. We present results on real datasets to illustrate the use of the proposed solution algorithm.

artificial intelligence, constraint, machine learning, (20 more...)

2008.12802

Country:

North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > British Columbia (0.04)
Asia > India > NCT > New Delhi (0.04)

Genre: Research Report (0.82)

Industry: Marketing (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Sakhi, Otmane, Bonner, Stephen, Rohde, David, Vasile, Flavian

BLOB : A Probabilistic Model for Recommendation that Combines Organic and Bandit Signals

A common task for recommender systems is to build a pro le of the interests of a user from items in their browsing history and later to recommend items to the user from the same catalog. The users' behavior consists of two parts: the sequence of items that they viewed without intervention (the organic part) and the sequences of items recommended to them and their outcome (the bandit part). In this paper, we propose Bayesian Latent Organic Bandit model (BLOB), a probabilistic approach to combine the 'or-ganic' and 'bandit' signals in order to improve the estimation of recommendation quality. The bandit signal is valuable as it gives direct feedback of recommendation performance, but the signal quality is very uneven, as it is highly concentrated on the recommendations deemed optimal by the past version of the recom-mender system. In contrast, the organic signal is typically strong and covers most items, but is not always relevant to the recommendation task. In order to leverage the organic signal to e ciently learn the bandit signal in a Bayesian model we identify three fundamental types of distances, namely action-history, action-action and history-history distances. We implement a scalable approximation of the full model using variational auto-encoders and the local re-paramerization trick. We show using extensive simulation studies that our method out-performs or matches the value of both state-of-the-art organic-based recommendation algorithms, and of bandit-based methods (both value and policy-based) both in organic and bandit-rich environments.

artificial intelligence, bayesian inference, machine learning, (18 more...)

doi: 10.1145/3394486.3403121

2008.12504

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(6 more...)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Unifying supervised learning and VAEs -- automating statistical inference in high-energy physics

Glüsenkamp, Thorsten

A KL-divergence objective of the joint distribution of data and labels allows to unify supervised learning, variational autoencoders (VAEs) and semi-supervised learning under one umbrella of variational inference. This viewpoint has several advantages. For VAEs, it clarifies the interpretation of encoder and decoder parts. For supervised learning, it re-iterates that the training procedure approximates the true posterior over labels and can always be viewed as approximate likelihood-free inference. This is typically not discussed, even though the derivation is well-known in the literature. In the context of semi-supervised learning it motivates an extended supervised scheme which allows to calculate a goodness-of-fit p-value using posterior predictive simulations. Flow-based networks with a standard normal base distribution are crucial. We discuss how they allow to rigorously define coverage for arbitrary joint posteriors on $\mathbb{R}^n \times \mathcal{S}^m$, which encompasses posteriors over directions. Finally, systematic uncertainties are naturally included in the variational viewpoint. With the three ingredients of (1) systematics, (2) coverage and (3) goodness-of-fit, flow-based neural networks have the potential to replace a large part of the statistical toolbox of the contemporary high-energy physicist.

artificial intelligence, machine learning, posterior, (18 more...)

2008.05825

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)

Genre: Research Report (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

Farooq, Adam, Raykov, Yordan P., Raykov, Petar, Little, Max A.

Latent feature sharing: an adaptive approach to linear decomposition models

Ubiquitous linear Gaussian exploratory tools such as principle component analysis (PCA) and factor analysis (FA) remain widely used as tools for: exploratory analysis, pre-processing, data visualization and related tasks. However, due to their rigid assumptions including crowding of high dimensional data, they have been replaced in many settings by more flexible and still interpretable latent feature models. The Feature allocation is usually modelled using discrete latent variables assumed to follow either parametric Beta-Bernoulli distribution or Bayesian nonparametric prior. In this work we propose a simple and tractable parametric feature allocation model which can address key limitations of current latent feature decomposition techniques. The new framework allows for explicit control over the number of features used to express each point and enables a more flexible set of allocation distributions including feature allocations with different sparsity levels. This approach is used to derive a novel adaptive Factor analysis (aFA), as well as, an adaptive probabilistic principle component analysis (aPPCA) capable of flexible structure discovery and dimensionality reduction in a wide case of scenarios. We derive both standard Gibbs sampler, as well as, an expectation-maximization inference algorithms that converge orders of magnitude faster to a reasonable point estimate solution. The utility of the proposed aPPCA model is demonstrated for standard PCA tasks such as feature learning, data visualization and data whitening. We show that aPPCA and aFA can infer interpretable high level features both when applied on raw MNIST and when applied for interpreting autoencoder features. We also demonstrate an application of the aPPCA to more robust blind source separation for functional magnetic resonance imaging (fMRI).

artificial intelligence, bayesian inference, machine learning, (18 more...)

2006.12369

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.88)
Health & Medicine > Diagnostic Medicine > Imaging (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Liao, Zhenyu A., Sharma, Charupriya, Cussens, James, van Beek, Peter

Learning All Credible Bayesian Network Structures for Model Averaging

arXiv.org Artificial IntelligenceAug-27-2020

A Bayesian network is a widely used probabilistic graphical model with applications in knowledge discovery and prediction. Learning a Bayesian network (BN) from data can be cast as an optimization problem using the well-known score-and-search approach. However, selecting a single model (i.e., the best scoring BN) can be misleading or may not achieve the best possible accuracy. An alternative to committing to a single model is to perform some form of Bayesian or frequentist model averaging, where the space of possible BNs is sampled or enumerated in some fashion. Unfortunately, existing approaches for model averaging either severely restrict the structure of the Bayesian network or have only been shown to scale to networks with fewer than 30 random variables. In this paper, we propose a novel approach to model averaging inspired by performance guarantees in approximation algorithms. Our approach has two primary advantages. First, our approach only considers credible models in that they are optimal or near-optimal in score. Second, our approach is more efficient and scales to significantly larger Bayesian networks than existing approaches.

artificial intelligence, bayesian network, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2008.13618

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Asia (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kinney, David, Watson, David

Causal Feature Learning for Utility-Maximizing Agents

arXiv.org Artificial IntelligenceAug-27-2020

Discovering high-level causal relations from low-level data is an important and challenging problem that comes up frequently in the natural and social sciences. In a series of papers, Chalupka et al. (2015, 2016a, 2016b, 2017) develop a procedure for causal feature learning (CFL) in an effort to automate this task. We argue that CFL does not recommend coarsening in cases where pragmatic considerations rule in favor of it, and recommends coarsening in cases where pragmatic considerations rule against it. We propose a new technique, pragmatic causal feature learning (PCFL), which extends the original CFL algorithm in useful and intuitive ways. We show that PCFL has the same attractive measure-theoretic properties as the original CFL algorithm. We compare the performance of both methods through theoretical analysis and experiments.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2005.08792

Country: Pacific Ocean (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Iakovleva, Ekaterina, Verbeek, Jakob, Alahari, Karteek

Meta-Learning with Shared Amortized Variational Inference

arXiv.org Machine LearningAug-27-2020

We propose a novel amortized variational inference scheme for an empirical Bayes meta-learning model, where model parameters are treated as latent variables. We learn the prior distribution over model parameters conditioned on limited training data using a variational autoencoder approach. Our framework proposes sharing the same amortized inference network between the conditional prior and variational posterior distributions over the model parameters. While the posterior leverages both the labeled support and query data, the conditional prior is based only on the labeled support data. We show that in earlier work, relying on Monte-Carlo approximation, the conditional prior collapses to a Dirac delta function. In contrast, our variational approach prevents this collapse and preserves uncertainty over the model parameters. We evaluate our approach on the miniImageNet, CIFAR-FS and FC100 datasets, and present results demonstrating its advantages over previous work.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2008.12037

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Klebanov, Ilja, Sprungk, Björn, Sullivan, T. J.

The linear conditional expectation in Hilbert space

arXiv.org Machine LearningAug-27-2020

The linear conditional expectation (LCE) provides a best linear (or rather, affine) estimate of the conditional expectation and hence plays an important r\^ole in approximate Bayesian inference, especially the Bayes linear approach. This article establishes the analytical properties of the LCE in an infinite-dimensional Hilbert space context. In addition, working in the space of affine Hilbert--Schmidt operators, we establish a regularisation procedure for this LCE. As an important application, we obtain a simple alternative derivation and intuitive justification of the conditional mean embedding formula, a concept widely used in machine learning to perform the conditioning of random variables by embedding them into reproducing kernel Hilbert spaces.

artificial intelligence, machine learning, theorem 4, (15 more...)

2008.1207

Country:

North America > United States > New York (0.05)
South America > Chile (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(6 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Park, Chiwoo, Borth, David J., Wilson, Nicholas S., Hunter, Chad N.

Variable selection for Gaussian process regression through a sparse projection

arXiv.org Machine LearningAug-24-2020

This paper presents a new variable selection approach integrated with Gaussian process (GP) regression. We consider a sparse projection of input variables and a general stationary covariance model that depends on the Euclidean distance between the projected features. The sparse projection matrix is considered as an unknown parameter. We propose a forward stagewise approach with embedded gradient descent steps to co-optimize the parameter with other covariance parameters based on the maximization of a non-convex marginal likelihood function with a concave sparsity penalty, and some convergence properties of the algorithm are provided. The proposed model covers a broader class of stationary covariance functions than the existing automatic relevance determination approaches, and the solution approach is more computationally feasible than the existing MCMC sampling procedures for the automatic relevance parameter estimation with a sparsity prior. The approach is evaluated for a large number of simulated scenarios. The choice of tuning parameters and the accuracy of the parameter estimation are evaluated with the simulation study. In the comparison to some chosen benchmark approaches, the proposed approach has provided a better accuracy in the variable selection. It is applied to an important problem of identifying environmental factors that affect an atmospheric corrosion of metal alloys.

bayesian inference, regression, upstream oil & gas, (21 more...)

2008.10769

Country: North America > United States (0.67)

Genre: Research Report (0.82)

Industry:

Government > Military (0.67)
Energy > Oil & Gas > Upstream (0.48)
Materials > Metals & Mining (0.48)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
(2 more...)