epinet
Epistemic Neural Networks
Intelligence relies on an agent's knowledge of what it does not know. This capability can be assessed based on the quality of joint predictions of labels across multiple inputs. In principle, ensemble-based approaches can produce effective joint predictions, but the computational costs of large ensembles become prohibitive. We introduce the epinet: an architecture that can supplement any conventional neural network, including large pretrained models, and can be trained with modest incremental computation to estimate uncertainty. With an epinet, conventional neural networks outperform very large ensembles, consisting of hundreds or more particles, with orders of magnitude less computation. The epinet does not fit the traditional framework of Bayesian neural networks. To accommodate development of approaches beyond BNNs, such as the epinet, we introduce the epistemic neural network (ENN) as a general interface for models that produce joint predictions.
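The epinet idea above can be sketched minimally: a base network's output is augmented by a small additive term that depends on a sampled epistemic index, with a trainable part plus a fixed random prior part. The linear layers, dimensions, and names below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: d input features, k outputs, m-dimensional epistemic index.
d, k, m = 5, 3, 7
W = rng.normal(size=(d, k))        # base network weights (trained as usual)
A = rng.normal(size=(d, k * m))    # learnable epinet weights
P = rng.normal(size=(d, k * m))    # fixed random "prior network" weights (never trained)

def enn_logits(x, z):
    """ENN interface: the prediction depends on the input x AND an index z."""
    base = x @ W                               # conventional prediction
    learn = (x @ A).reshape(-1, k, m) @ z      # trainable epinet term, linear in z
    prior = (x @ P).reshape(-1, k, m) @ z      # fixed prior term, linear in z
    return base + learn + prior

x = rng.normal(size=(4, d))                       # a batch of 4 inputs
zs = rng.normal(size=(10, m))                     # 10 sampled epistemic indices
joint = np.stack([enn_logits(x, z) for z in zs])  # (10, 4, k): one joint prediction per index
disagreement = joint.std(axis=0).mean()           # spread across indices ~ epistemic uncertainty
```

Sampling several indices yields coherent joint predictions across inputs at roughly the cost of a few extra small matrix products, which is the source of the savings relative to large ensembles.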
Pretrained Joint Predictions for Scalable Batch Bayesian Optimization of Molecular Designs
Wang-Henderson, Miles, Kaufman, Benjamin, Williams, Edward, Pederson, Ryan, Rossi, Matteo, Howell, Owen, Underkoffler, Carl, Mardirossian, Narbe, Parkhill, John
Batched synthesis and testing of molecular designs is the key bottleneck of drug development. There has been great interest in leveraging biomolecular foundation models as surrogates to accelerate this process. In this work, we show how to obtain scalable probabilistic surrogates of binding affinity for use in Batch Bayesian Optimization (Batch BO). This demands parallel acquisition functions that hedge between designs and the ability to rapidly sample from a joint predictive density to approximate them. Through the framework of Epistemic Neural Networks (ENNs), we obtain scalable joint predictive distributions of binding affinity on top of representations taken from large structure-informed models. Key to this work is an investigation into the importance of prior networks in ENNs and how to pretrain them on synthetic data to improve downstream performance in Batch BO. Their utility is demonstrated by rediscovering known potent EGFR inhibitors on a semi-synthetic benchmark in up to 5x fewer iterations, as well as potent inhibitors from a real-world small-molecule library in up to 10x fewer iterations, offering a promising solution for large-scale drug discovery applications.
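The batch acquisition described here can be caricatured as repeated Thompson-style draws from a joint predictive: one sampled model per batch slot, each slot taking the best candidate not already chosen. The random linear scorer below stands in for sampling an ENN epistemic index and is an assumption for illustration, not the paper's surrogate.

```python
import numpy as np

rng = np.random.default_rng(1)

def batch_thompson(sample_scores, candidates, batch_size):
    """Build a batch that hedges between designs: each slot draws one sample
    from the joint predictive and takes its best not-yet-chosen candidate."""
    chosen = []
    for _ in range(batch_size):
        scores = sample_scores(candidates)        # one joint predictive sample
        for i in np.argsort(-scores):
            if int(i) not in chosen:
                chosen.append(int(i))
                break
    return chosen

# Toy surrogate: each call draws a random linear "binding model" and scores all
# candidates under it -- a stand-in for sampling one ENN epistemic index.
d = 8
candidates = rng.normal(size=(50, d))

def sample_scores(X):
    w = rng.normal(size=d)
    return X @ w

batch = batch_thompson(sample_scores, candidates, batch_size=5)
```

Because each slot is scored under a different sampled model, the batch spreads across designs that are plausibly best, rather than piling onto one point estimate.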
Improved Exploration in GFlownets via Enhanced Epistemic Neural Networks
Muhammad, Sajan, Lahlou, Salem
Efficiently identifying the right trajectories for training remains an open problem in GFlowNets. To address this, it is essential to prioritize exploration in regions of the state space where the reward distribution has not been sufficiently learned. This calls for uncertainty-driven exploration: in other words, the agent should be aware of what it does not know. This attribute can be measured by joint predictions, which are particularly important for combinatorial and sequential decision problems. In this research, we integrate epistemic neural networks (ENNs) with the conventional architecture of GFlowNets to enable more efficient joint predictions and better uncertainty quantification, thereby improving exploration and the identification of optimal trajectories. Our proposed algorithm, ENN-GFN-Enhanced, is compared to the baseline GFlowNet method and evaluated in grid environments and structured sequence generation in various settings, demonstrating both its efficacy and efficiency.
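One way to realize the uncertainty-driven exploration described above is to use disagreement across sampled epistemic indices as an exploration bonus, steering the sampler toward states whose reward is poorly learned. The linear reward head below is a toy assumption, not the ENN-GFN-Enhanced architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

d, m = 6, 4                          # state-feature dim, epistemic index dim
U = rng.normal(size=(d, m))          # toy ENN reward head: estimate varies with index z

def predict_log_reward(feats, z):
    # stand-in for a GFlowNet reward/flow estimate conditioned on an index
    return feats @ U @ z

def epistemic_bonus(feats, n_indices=32):
    """Variance across sampled indices: large where the reward distribution
    has not been sufficiently learned, small where the estimates agree."""
    preds = np.stack([predict_log_reward(feats, rng.normal(size=m))
                      for _ in range(n_indices)])
    return preds.var(axis=0)

states = rng.normal(size=(10, d))
bonus = epistemic_bonus(states)      # one exploration bonus per state
```

Adding such a bonus to the sampling objective biases trajectory selection toward under-explored regions without requiring an ensemble of full models.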
SPIEDiff: robust learning of long-time macroscopic dynamics from short-time particle simulations with quantified epistemic uncertainty
The data-driven discovery of long-time macroscopic dynamics and thermodynamics of dissipative systems with particle fidelity is hampered by significant obstacles. These include the strong time-scale limitations inherent to particle simulations, the non-uniqueness of the thermodynamic potentials and operators from given macroscopic dynamics, and the need for efficient uncertainty quantification. This paper introduces Statistical-Physics Informed Epistemic Diffusion Models (SPIEDiff), a machine learning framework designed to overcome these limitations in the context of purely dissipative systems by leveraging statistical physics, conditional diffusion models, and epinets. We evaluate the proposed framework on stochastic Arrhenius particle processes and demonstrate that SPIEDiff can accurately uncover both thermodynamics and kinetics, while enabling reliable long-time macroscopic predictions using only short-time particle simulation data. SPIEDiff can deliver accurate predictions with quantified uncertainty in minutes, drastically reducing the computational demand compared to direct particle simulations, which would take days or years in the examples considered. Overall, SPIEDiff offers a robust and trustworthy pathway for the data-driven discovery of thermodynamic models.
Epinet for Content Cold Start
Jeon, Hong Jun, Liu, Songbin, Li, Yuantong, Lyu, Jie, Song, Hunter, Liu, Ji, Wu, Peng, Zhu, Zheqing
The exploding popularity of online content and its user base poses an evermore challenging matching problem for modern recommendation systems. Unlike other frontiers of machine learning such as natural language, recommendation systems are responsible for collecting their own data. Simply exploiting current knowledge can lead to pernicious feedback loops but naive exploration can detract from user experience and lead to reduced engagement. This exploration-exploitation trade-off is exemplified in the classic multi-armed bandit problem for which algorithms such as upper confidence bounds (UCB) and Thompson sampling (TS) demonstrate effective performance. However, there have been many challenges to scaling these approaches to settings which do not exhibit a conjugate prior structure. Recent scalable approaches to uncertainty quantification via epinets have enabled efficient approximations of Thompson sampling even when the learning model is a complex neural network. In this paper, we demonstrate the first application of epinets to an online recommendation system. Our experiments demonstrate improvements in both user traffic and engagement efficiency on the Facebook Reels online video platform.
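The approximate Thompson sampling the authors describe can be sketched as: per request, draw one epistemic index, score every item under that sampled model, and serve the argmax, so that repeated requests naturally explore. The linear engagement model and epinet-style head below are illustrative assumptions, not the production system.

```python
import numpy as np

rng = np.random.default_rng(5)

n_items, d, m = 100, 8, 4
item_feats = rng.normal(size=(n_items, d))
w = rng.normal(size=d)               # point-estimate engagement model
B = rng.normal(size=(d, m))          # epinet-style head: score shifts with index z

def serve_one():
    z = rng.normal(size=m)                         # one draw ~ one plausible model
    scores = item_feats @ w + (item_feats @ B) @ z
    return int(np.argmax(scores))                  # exploit under the sampled model

served = [serve_one() for _ in range(200)]         # sampling across requests explores
```

Unlike epsilon-greedy exploration, items are over-served only in proportion to how plausibly they are best, which limits damage to user experience.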
Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks
Guilhoto, Leonardo Ferreira, Perdikaris, Paris
High-dimensional problems are prominent across all corners of science and industrial applications. Within this realm, optimizing black-box functions and operators can be computationally expensive and require large amounts of hard-to-obtain data for training surrogate models. Uncertainty quantification becomes a key element in this setting, as the ability to quantify what a surrogate model does not know offers a guiding principle for new data acquisition. However, existing methods for surrogate modeling with built-in uncertainty quantification, such as Gaussian Processes (GPs) [1], have demonstrated difficulty in modeling problems that exist in high dimensions. While other methods such as Bayesian neural networks (BNNs) [2] and deep ensembles [3] are able to mitigate this issue, their computational cost can still be prohibitive for some applications. This problem becomes more prominent in Operator Learning, where either inputs or outputs of a model are functions residing in infinite-dimensional function spaces. The field of Operator Learning has had many advances in recent years [4, 5, 6, 7, 8, 9], with applications across many domains in the natural sciences and engineering, but so far its integration with uncertainty quantification is limited [10, 11]. In addition to safety-critical problems using deep learning, such as ones in medicine [12, 13] and autonomous driving [14], the generation of uncertainty measures can also be important for decision making when collecting new data in the physical sciences. Total uncertainty is often made up of two distinct parts: epistemic and aleatoric uncertainty.
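The epistemic/aleatoric split mentioned at the end admits a standard decomposition: when a surrogate returns, for each sampled epistemic index, a Gaussian predictive (mean, variance) at a query point, epistemic uncertainty is the variance of the means across indices and aleatoric uncertainty is the mean of the per-index variances. The numbers below are toy values, not NEON outputs.

```python
import numpy as np

rng = np.random.default_rng(3)

# For each of 20 sampled epistemic indices, the surrogate returns a Gaussian
# predictive (mean_i, var_i) at one query point.
means = rng.normal(loc=2.0, scale=0.5, size=20)   # predictions under each index
variances = np.full(20, 0.1)                      # per-index noise estimates

epistemic = means.var()        # disagreement between indices; shrinks with data
aleatoric = variances.mean()   # irreducible noise; does not shrink with data
total = epistemic + aleatoric
```

Only the epistemic term is reducible by acquiring more data, which is why it, rather than the total, is the natural signal for guiding new data acquisition.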
Reducing LLM Hallucinations using Epistemic Neural Networks
Verma, Shreyas, Tran, Kien, Ali, Yusuf, Min, Guangyu
Reducing and detecting hallucinations in large language models is an open research problem. In this project, we attempt to leverage recent advances in the field of uncertainty estimation to reduce hallucinations in frozen large language models. Epistemic neural networks have recently been proposed to improve output joint distributions for large pre-trained models. ENNs are small networks attached to large, frozen models to improve the model's joint distributions and uncertainty estimates. In this work, we train an epistemic neural network on top of the Llama-2 7B model combined with a contrastive decoding feature enhancement technique. We are the first to train an ENN for the next-token prediction task and explore the efficacy of this method in reducing hallucinations on the TruthfulQA dataset. In essence, we provide a method that leverages a pre-trained model's latent embeddings to reduce hallucinations. Recently, large language models have become increasingly capable across a wide range of language processing tasks such as summarization, sentiment analysis (Scaria et al., 2023), event detection (Anantheswaran et al., 2023), finance (Gupta et al., 2021), and synthetic data generation (Gupta et al., 2023b).
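A minimal picture of an ENN over a frozen model's next-token logits: a small head on the frozen latent embedding perturbs the logits as a function of a sampled index, and disagreement between the resulting token distributions can flag likely hallucination. The dimensions, the linear head, and the flagging rule below are assumptions for illustration, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(4)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

vocab, m, h = 12, 4, 16
frozen_logits = rng.normal(size=vocab)        # next-token logits from the frozen LLM
hidden = rng.normal(size=h)                   # frozen model's latent embedding
E = rng.normal(size=(h, vocab * m)) * 0.1     # small ENN head on the embedding

def enn_token_dist(z):
    delta = (hidden @ E).reshape(vocab, m) @ z
    return softmax(frozen_logits + delta)     # token distribution under index z

dists = np.stack([enn_token_dist(rng.normal(size=m)) for _ in range(30)])
mean_dist = dists.mean(axis=0)                # averaged next-token prediction
disagreement = dists.std(axis=0).sum()        # high values ~ likely hallucination
```

Decoding from the averaged distribution, or abstaining when disagreement is high, are the two natural ways to turn this signal into fewer hallucinated tokens.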