AITopics | Brubaker, Marcus

Collaborating Authors

Brubaker, Marcus

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Amortized In-Context Bayesian Posterior Estimation

Mittal, Sarthak, Bracher, Niels Leif, Lajoie, Guillaume, Jaini, Priyank, Brubaker, Marcus

arXiv.org Machine LearningFeb-10-2025

Bayesian inference provides a natural way of incorporating prior beliefs and assigning a probability measure to the space of hypotheses. Current solutions rely on iterative routines like Markov Chain Monte Carlo (MCMC) sampling and Variational Inference (VI), which need to be re-run whenever new observations are available. Amortization, through conditional estimation, is a viable strategy to alleviate such difficulties and has been the guiding principle behind simulation-based inference, neural processes and in-context methods using pre-trained models. In this work, we conduct a thorough comparative analysis of amortized in-context Bayesian posterior estimation methods from the lens of different optimization objectives and architectural choices. Such methods train an amortized estimator to perform posterior parameter inference by conditioning on a set of data examples passed as context to a sequence model such as a transformer. In contrast to language models, we leverage permutation invariant architectures as the true posterior is invariant to the ordering of context examples. Our empirical study includes generalization to out-of-distribution tasks, cases where the assumed underlying model is misspecified, and transfer from simulated to real problems. Subsequently, it highlights the superiority of the reverse KL estimator for predictive problems, especially when combined with the transformer architecture and normalizing flows.

machine learning, natural language, probabilistic model, (21 more...)

arXiv.org Machine Learning

2502.06601

Country: North America > United States (0.27)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Point Process Flows

Mehrasa, Nazanin, Deng, Ruizhi, Ahmed, Mohamed Osama, Chang, Bo, He, Jiawei, Durand, Thibaut, Brubaker, Marcus, Mori, Greg

arXiv.org Machine LearningOct-18-2019

Event sequences can be modeled by temporal point processes (TPPs) to capture their asynchronous and probabilistic nature. We contribute an intensity-free framework that directly models the point process as a non-parametric distribution by utilizing normalizing flows. This approach is capable of capturing highly complex temporal distributions and does not rely on restrictive parametric forms. Comparisons with state-of-the-art baseline models on both synthetic and challenging real-life datasets show that the proposed framework is effective at modeling the stochasticity of discrete event sequences.

dataset, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

1910.08281

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Communications > Social Media (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Tails of Triangular Flows

Jaini, Priyank, Kobyzev, Ivan, Brubaker, Marcus, Yu, Yaoliang

arXiv.org Machine LearningJul-9-2019

Triangular maps are a construct in probability theory that allows the transformation of any source density to any target density. We consider flow based models that learn these triangular transformations which we call triangular flows and study the properties of these triangular flows with the goal of capturing heavy tailed target distributions. In one dimension, we prove that the density quantile functions of the source and target density can characterize properties of the increasing push-forward transformation and show that no Lipschitz continuous increasing map can transform a light-tailed source to a heavy-tailed target density. We further precisely relate the asymptotic behavior of these density quantile functions with the existence of certain function moments of distributions. These results allow us to give a precise asymptotic rate at which an increasing transformation must grow to capture the tail properties of a target given the source distribution. In the multivariate case, we show that any increasing triangular map transforming a light-tailed source density to a heavy-tailed target density must have all eigenvalues of the Jacobian to be unbounded. Our analysis suggests the importance of source distribution in capturing heavy-tailed distributions and we discuss the implications for flow based models.

artificial intelligence, banking & finance, transformation, (19 more...)

arXiv.org Machine Learning

1907.04481

Country: North America (0.14)

Genre: Research Report (0.64)

Industry: Banking & Finance (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Diachronic Embedding for Temporal Knowledge Graph Completion

Goel, Rishab, Kazemi, Seyed Mehran, Brubaker, Marcus, Poupart, Pascal

arXiv.org Artificial IntelligenceJul-6-2019

Knowledge graphs (KGs) typically contain temporal facts indicating relationships among entities at different times. Due to their incompleteness, several approaches have been proposed to infer new facts for a KG based on the existing ones-a problem known as KG completion. KG embedding approaches have proved effective for KG completion, however, they have been developed mostly for static KGs. Developing temporal KG embedding models is an increasingly important problem. In this paper, we build novel models for temporal KG completion through equipping static models with a diachronic entity embedding function which provides the characteristics of entities at any point in time. This is in contrast to the existing temporal KG embedding approaches where only static entity features are provided. The proposed embedding function is model-agnostic and can be potentially combined with any static model. We prove that combining it with SimplE, a recent model for static KG embedding, results in a fully expressive model for temporal KG completion. Our experiments indicate the superiority of our proposal compared to existing baselines.

completion, deep learning, neural network, (21 more...)

arXiv.org Artificial Intelligence

1907.03143

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Temporal Reasoning (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.63)

Add feedback

Differentiable probabilistic models of scientific imaging with the Fourier slice theorem

Ullrich, Karen, Berg, Rianne van den, Brubaker, Marcus, Fleet, David, Welling, Max

arXiv.org Machine LearningJun-18-2019

Scientific imaging techniques such as optical and electron microscopy and computed tomography (CT) scanning are used to study the 3D structure of an object through 2D observations. These observations are related to the original 3D object through orthogonal integral projections. For common 3D reconstruction algorithms, computational efficiency requires the modeling of the 3D structures to take place in Fourier space by applying the Fourier slice theorem. At present, it is unclear how to differentiate through the projection operator, and hence current learning algorithms can not rely on gradient based methods to optimize 3D structure models. In this paper we show how back-propagation through the projection operator in Fourier space can be achieved. We demonstrate the validity of the approach with experiments on 3D reconstruction of proteins. We further extend our approach to learning probabilistic models of 3D objects. This allows us to predict regions of low sampling rates or estimate noise. A higher sample efficiency can be reached by utilizing the learned uncertainties of the 3D structure as an unsupervised estimate of the model fit. Finally, we demonstrate how the reconstruction algorithm can be extended with an amortized inference scheme on unknown attributes such as object pose. Through empirical studies we show that joint inference of the 3D structure and the object pose becomes more difficult when the ground truth object contains more symmetries. Due to the presence of for instance (approximate) rotational symmetries, the pose estimation can easily get stuck in local optima, inhibiting a fine-grained high-quality estimate of the 3D structure.

bayesian inference, neural network, projection, (21 more...)

arXiv.org Machine Learning

1906.07582

Country: North America > Canada > Ontario > Toronto (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

On the Effectiveness of Low Frequency Perturbations

Sharma, Yash, Ding, Gavin Weiguang, Brubaker, Marcus

arXiv.org Machine LearningFeb-28-2019

Carefully crafted, often imperceptible, adversarial perturbations have been shown to cause state-of-the-art models to yield extremely inaccurate outputs, rendering them unsuitable for safety-critical application domains. In addition, recent work has shown that constraining the attack space to a low frequency regime is particularly effective. Yet, it remains unclear whether this is due to generally constraining the attack search space or specifically removing high frequency components from consideration. By systematically controlling the frequency components of the perturbation, evaluating against the top-placing defense submissions in the NeurIPS 2017 competition, we empirically show that performance improvements in both optimization and generalization are yielded only when low frequency components are preserved. In fact, the defended models based on (ensemble) adversarial training are roughly as vulnerable to low frequency perturbations as undefended models, suggesting that the purported robustness of proposed defenses is reliant upon adversarial perturbations being high frequency in nature. We do find that under $\ell_\infty$ $\epsilon=16/255$, a commonly used distortion bound, low frequency perturbations are indeed perceptible. This questions the use of the $\ell_\infty$-norm, in particular, as a distortion metric, and suggests that explicitly considering the frequency space is promising for learning robust models which better align with human perception.

artificial intelligence, neural network, perturbation, (16 more...)

arXiv.org Machine Learning

1903.00073

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback