AITopics

2310.02694

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kolossov, Germain, Montanari, Andrea, Tandon, Pulkit

Towards a statistical theory of data selection under weak supervision

arXiv.org Machine LearningOct-4-2023

Given a sample of size $N$, it is often useful to select a subsample of smaller size $n

artificial intelligence, data selection, machine learning, (15 more...)

2309.14563

Country:

North America (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

arXiv.org Artificial IntelligenceOct-3-2023

Fill in the Blank: Exploring and Enhancing LLM Capabilities for Backward Reasoning in Math Word Problems

Deb, Aniruddha, Oza, Neeva, Singla, Sarthak, Khandelwal, Dinesh, Garg, Dinesh, Singla, Parag

While forward reasoning (i.e. find the answer given the question) has been explored extensively in the recent literature, backward reasoning is relatively unexplored. We examine the backward reasoning capabilities of LLMs on Math Word Problems (MWPs): given a mathematical question and its answer, with some details omitted from the question, can LLMs effectively retrieve the missing information? In this paper, we formally define the backward reasoning task on math word problems and modify three datasets to evaluate this task: GSM8k, SVAMP and MultiArith. Our findings show a significant drop in the accuracy of models on backward reasoning compared to forward reasoning across four SOTA LLMs (GPT4, GPT3.5, PaLM-2, and LLaMa-2). Utilizing the specific format of this task, we propose three novel techniques that improve performance: Rephrase reformulates the given problem into a forward reasoning problem, PAL-Tools combines the idea of Program-Aided LLMs to produce a set of equations that can be solved by an external solver, and Check your Work exploits the availability of natural verifier of high accuracy in the forward direction, interleaving solving and verification steps. Finally, realizing that each of our base methods correctly solves a different set of problems, we propose a novel Bayesian formulation for creating an ensemble over these base methods aided by a verifier to further boost the accuracy by a significant margin. Extensive experimentation demonstrates that our techniques successively improve the performance of LLMs on the backward reasoning task, with the final ensemble-based method resulting in a substantial performance gain compared to the raw LLMs with standard prompting techniques such as chain-of-thought.

accuracy, equation, reasoning, (16 more...)

2310.01991

Genre: Research Report > New Finding (0.86)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Wicker, Matthew, Laurenti, Luca, Patane, Andrea, Paoletti, Nicola, Abate, Alessandro, Kwiatkowska, Marta

Probabilistic Reach-Avoid for Bayesian Neural Networks

arXiv.org Artificial IntelligenceOct-3-2023

Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: first, computing reach-avoid probabilities for iterative predictions made with dynamical models, with dynamics described by Bayesian neural network (BNN); second, synthesising control policies that are optimal with respect to a given reach-avoid specification (reaching a "target" state, while avoiding a set of "unsafe" states) and a learned BNN model. Our solution leverages interval propagation and backward recursion techniques to compute lower bounds for the probability that a policy's sequence of actions leads to satisfying the reach-avoid specification. Such computed lower bounds provide safety certification for the given policy and BNN model. We then introduce control synthesis algorithms to derive policies maximizing said lower bounds on the safety probability. We demonstrate the effectiveness of our method on a series of control benchmarks characterized by learned BNN dynamics models. On our most challenging benchmark, compared to purely data-driven policies the optimal synthesis algorithm is able to provide more than a four-fold increase in the number of certifiable states and more than a three-fold increase in the average guaranteed reach-avoid probability.

international conference, neural network, probability, (17 more...)

2310.01951

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Austria > Vienna (0.14)
Europe > Netherlands > South Holland > Delft (0.04)
(2 more...)

Genre:

Workflow (0.67)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceOct-3-2023

Bayesian Personalized Federated Learning with Shared and Personalized Uncertainty Representations

Chen, Hui, Liu, Hengyu, Cao, Longbing, Zhang, Tiancheng

Bayesian personalized federated learning (BPFL) addresses challenges in existing personalized FL (PFL). BPFL aims to quantify the uncertainty and heterogeneity within and across clients towards uncertainty representations by addressing the statistical heterogeneity of client data. In PFL, some recent preliminary work proposes to decompose hidden neural representations into shared and local components and demonstrates interesting results. However, most of them do not address client uncertainty and heterogeneity in FL systems, while appropriately decoupling neural representations is challenging and often ad hoc. In this paper, we make the first attempt to introduce a general BPFL framework to decompose and jointly learn shared and personalized uncertainty representations on statistically heterogeneous client data over time. A Bayesian federated neural network BPFed instantiates BPFL by jointly learning cross-client shared uncertainty and client-specific personalized uncertainty over statistically heterogeneous and randomly participating clients. We further involve continual updating of prior distribution in BPFed to speed up the convergence and avoid catastrophic forgetting. Theoretical analysis and guarantees are provided in addition to the experimental evaluation of BPFed against the diversified baselines.

bpfed, federated learning, representation, (15 more...)

2309.15499

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Machine LearningOct-3-2023

Delta-AI: Local objectives for amortized inference in sparse graphical models

Falet, Jean-Pierre, Lee, Hae Beom, Malkin, Nikolay, Sun, Chen, Secrieru, Dragos, Zhang, Dinghuai, Lajoie, Guillaume, Bengio, Yoshua

We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call $\Delta$-amortized inference ($\Delta$-AI). Our approach is based on the observation that when the sampling of variables in a PGM is seen as a sequence of actions taken by an agent, sparsity of the PGM enables local credit assignment in the agent's policy learning objective. This yields a local constraint that can be turned into a local loss in the style of generative flow networks (GFlowNets) that enables off-policy training but avoids the need to instantiate all the random variables for each parameter update, thus speeding up training considerably. The $\Delta$-AI objective matches the conditional distribution of a variable given its Markov blanket in a tractable learned sampler, which has the structure of a Bayesian network, with the same conditional distribution under the target PGM. As such, the trained sampler recovers marginals and conditional distributions of interest and enables inference of partial subsets of variables. We illustrate $\Delta$-AI's effectiveness for sampling from synthetic PGMs and training latent variable models with sparse factor structure.

artificial intelligence, bayesian network, machine learning, (18 more...)

2310.02423

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Lambert, Marc, Bonnabel, Silvère, Bach, Francis

Variational Gaussian approximation of the Kushner optimal filter

arXiv.org Machine LearningOct-3-2023

In estimation theory, the Kushner equation provides the evolution of the probability density of the state of a dynamical system given continuous-time observations. Building upon our recent work, we propose a new way to approximate the solution of the Kushner equation through tractable variational Gaussian approximations of two proximal losses associated with the propagation and Bayesian update of the probability density. The first is a proximal loss based on the Wasserstein metric and the second is a proximal loss based on the Fisher metric. The solution to this last proximal loss is given by implicit updates on the mean and covariance that we proposed earlier. These two variational updates can be fused and shown to satisfy a set of stochastic differential equations on the Gaussian's mean and covariance matrix. This Gaussian flow is consistent with the Kalman-Bucy and Riccati flows in the linear case and generalize them in the nonlinear one.

approximation, artificial intelligence, machine learning, (18 more...)

doi: 10.1007/978-3-031-38271-0_39

2310.01859

Country:

Asia > Middle East > Jordan (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

arXiv.org Machine LearningOct-3-2023

Conditional Generative Modeling for High-dimensional Marked Temporal Point Processes

Dong, Zheng, Fan, Zekai, Zhu, Shixiang

Point processes offer a versatile framework for sequential event modeling. However, the computational challenges and constrained representational power of the existing point process models have impeded their potential for wider applications. This limitation becomes especially pronounced when dealing with event data that is associated with multi-dimensional or high-dimensional marks such as texts or images. To address this challenge, this study proposes a novel event generative framework for modeling point processes with high-dimensional marks. We aim to capture the distribution of events without explicitly specifying the conditional intensity or probability density function. Instead, we use a conditional generator that takes the history of events as input and generates the high-quality subsequent event that is likely to occur given the prior observations. The proposed framework offers a host of benefits, including considerable representational power to capture intricate dynamics in multi- or even high-dimensional event space, as well as exceptional efficiency in learning the model and generating samples. Our numerical results demonstrate superior performance compared to other state-of-the-art baselines.

data mining, machine learning, point process, (21 more...)

2305.12569

Country:

North America > United States > California (0.15)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
(2 more...)

arXiv.org Artificial IntelligenceOct-2-2023

Efficient Bayesian inference using physics-informed invertible neural networks for inverse problems

Guan, Xiaofei, Wang, Xintong, Wu, Hao, Yang, Zihao, Yu, Peng

In this paper, we introduce an innovative approach for addressing Bayesian inverse problems through the utilization of physics-informed invertible neural networks (PI-INN). The PI-INN framework encompasses two sub-networks: an invertible neural network (INN) and a neural basis network (NB-Net). The primary role of the NB-Net lies in modeling the spatial basis functions characterizing the solution to the forward problem dictated by the underlying partial differential equation. Simultaneously, the INN is designed to partition the parameter vector linked to the input physical field into two distinct components: the expansion coefficients representing the forward problem solution and the Gaussian latent noise. If the forward mapping is precisely estimated, and the statistical independence between expansion coefficients and latent noise is well-maintained, the PI-INN offers a precise and efficient generative model for Bayesian inverse problems, yielding tractable posterior density estimates. As a particular physics-informed deep learning model, the primary training challenge for PI-INN centers on enforcing the independence constraint, which we tackle by introducing a novel independence loss based on estimated density. We support the efficacy and precision of the proposed PI-INN through a series of numerical experiments, including inverse kinematics, 1-dimensional and 2-dimensional diffusion equations, and seismic traveltime tomography. Specifically, our experimental results showcase the superior performance of the proposed independence loss in comparison to the commonly used but computationally demanding kernel-based maximum mean discrepancy loss.

artificial intelligence, machine learning, pi-inn, (20 more...)

2304.12541

Country:

Asia > China (0.29)
Europe (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)
Overview > Innovation (0.34)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Ramezani, Maryam, Ahadinia, Aryan, Farhadi, Erfan, Rabiee, Hamid R.

DANI: Fast Diffusion Aware Network Inference with Preserving Topological Structure Property

arXiv.org Artificial IntelligenceOct-2-2023

The fast growth of social networks and their data access limitations in recent years has led to increasing difficulty in obtaining the complete topology of these networks. However, diffusion information over these networks is available, and many algorithms have been proposed to infer the underlying networks using this information. The previously proposed algorithms only focus on inferring more links and ignore preserving the critical topological characteristics of the underlying social networks. In this paper, we propose a novel method called DANI to infer the underlying network while preserving its structural properties. It is based on the Markov transition matrix derived from time series cascades, as well as the node-node similarity that can be observed in the cascade behavior from a structural point of view. In addition, the presented method has linear time complexity (increases linearly with the number of nodes, number of cascades, and square of the average length of cascades), and its distributed version in the MapReduce framework is also scalable. We applied the proposed approach to both real and synthetic networks. The experimental results showed that DANI has higher accuracy and lower run time while maintaining structural properties, including modular structure, degree distribution, connected components, density, and clustering coefficients, than well-known network inference methods.

artificial intelligence, data mining, machine learning, (20 more...)

2310.01696

Country:

North America > United States (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report > New Finding (0.54)
Research Report > Promising Solution (0.34)

Industry:

Information Technology > Services (0.87)
Energy > Oil & Gas > Upstream (0.40)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
(2 more...)