AITopics

2502.00921

Country:

North America > United States (0.14)
Asia > China (0.04)
Asia > Japan (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Mildner, Terje, Hamelijnck, Oliver, Giampouras, Paris, Damoulas, Theodoros

Federated Generalised Variational Inference: A Robust Probabilistic Federated Learning Framework

arXiv.org Machine LearningFeb-2-2025

We introduce FedGVI, a probabilistic Federated Learning (FL) framework that is provably robust to both prior and likelihood misspecification. FedGVI addresses limitations in both frequentist and Bayesian FL by providing unbiased predictions under model misspecification, with calibrated uncertainty quantification. Our approach generalises previous FL approaches, specifically Partitioned Variational Inference (Ashman et al., 2022), by allowing robust and conjugate updates, decreasing computational complexity at the clients. We offer theoretical analysis in terms of fixed-point convergence, optimality of the cavity distribution, and provable robustness. Additionally, we empirically demonstrate the effectiveness of FedGVI in terms of improved robustness and predictive performance on multiple synthetic and real world classification data sets.

artificial intelligence, machine learning, posterior, (13 more...)

2502.00846

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > West Midlands > Coventry (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

arXiv.org Machine LearningFeb-1-2025

Stochastic Linear Bandits with Latent Heterogeneity

Chen, Elynn, Chen, Xi, Jing, Wenbo, Liu, Xiao

This paper addresses the critical challenge of latent heterogeneity in online decision-making, where individual responses to business actions vary due to unobserved characteristics. While existing approaches in data-driven decision-making have focused on observable heterogeneity through contextual features, they fall short when heterogeneity stems from unobservable factors such as lifestyle preferences and personal experiences. We propose a novel latent heterogeneous bandit framework that explicitly models this unobserved heterogeneity in customer responses, with promotion targeting as our primary example. Our methodology introduces an innovative algorithm that simultaneously learns latent group memberships and group-specific reward functions. Through theoretical analysis and empirical validation using data from a mobile commerce platform, we establish high-probability bounds for parameter estimation, convergence rates for group classification, and comprehensive regret bounds. Notably, our theoretical analysis reveals two distinct types of regret measures: a ``strong regret'' against an oracle with perfect knowledge of customer memberships, which remains non-sub-linear due to inherent classification uncertainty, and a ``regular regret'' against an oracle aware only of deterministic components, for which our algorithm achieves a sub-linear rate that is minimax optimal in horizon length and dimension. We further demonstrate that existing bandit algorithms ignoring latent heterogeneity incur constant average regret that accumulates linearly over time. Our framework provides practitioners with new tools for decision-making under latent heterogeneity and extends to various business applications, including personalized pricing, resource allocation, and inventory management.

data mining, heterogeneity, machine learning, (18 more...)

2502.00423

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Chang, Yingshan, Bisk, Yonatan

Model Successor Functions

arXiv.org Machine LearningJan-31-2025

The notion of generalization has moved away from the classical one defined in statistical learning theory towards an emphasis on out-of-domain generalization (OODG). Recently, there is a growing focus on inductive generalization, where a progression of difficulty implicitly governs the direction of domain shifts. In inductive generalization, it is often assumed that the training data lie in the easier side, while the testing data lie in the harder side. The challenge is that training data are always finite, but a learner is expected to infer an inductive principle that could be applied in an unbounded manner. This emerging regime has appeared in the literature under different names, such as length/logical/algorithmic extrapolation, but a formal definition is lacking. This work provides such a formalization that centers on the concept of model successors. Then we outline directions to adapt well-established techniques towards the learning of model successors. This work calls for restructuring of the research discussion around inductive generalization from fragmented task-centric communities to a more unified effort, focused on universal properties of learning and computation.

artificial intelligence, generalization, machine learning, (16 more...)

2502.00197

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(12 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.67)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
(2 more...)

Quarteroni, Alfio, Gervasio, Paola, Regazzoni, Francesco

Combining physics-based and data-driven models: advancing the frontiers of research with Scientific Machine Learning

Scientific Machine Learning (SciML) is a recently emerged research field which combines physics-based and data-driven models for the numerical approximation of differential problems. Physics-based models rely on the physical understanding of the problem at hand, subsequent mathematical formulation, and numerical approximation. Data-driven models instead aim to extract relations between input and output data without arguing any causality principle underlining the available data distribution. In recent years, data-driven models have been rapidly developed and popularized. Such a diffusion has been triggered by a huge availability of data (the so-called big data), an increasingly cheap computing power, and the development of powerful machine learning algorithms. SciML leverages the physical awareness of physics-based models and, at the same time, the efficiency of data-driven algorithms. With SciML, we can inject physics and mathematical knowledge into machine learning algorithms. Yet, we can rely on data-driven algorithms' capability to discover complex and non-linear patterns from data and improve the descriptive capacity of physics-based models. After recalling the mathematical foundations of digital modelling and machine learning algorithms, and presenting the most popular machine learning architectures, we discuss the great potential of a broad variety of SciML strategies in solving complex problems governed by partial differential equations. Finally, we illustrate the successful application of SciML to the simulation of the human cardiac function, a field of significant socio-economic importance that poses numerous challenges on both the mathematical and computational fronts. The corresponding mathematical model is a complex system of non-linear ordinary and partial differential equations describing the electromechanics, valve dynamics, blood circulation, perfusion in the coronary tree, and torso potential. Despite the robustness and accuracy of physics-based models, certain aspects, such as unveiling constitutive laws for cardiac cells and myocardial material properties, as well as devising efficient reduced order models to dominate the extraordinary computational complexity, have been successfully tackled by leveraging data-driven models.

artificial intelligence, cardiac electromechanical model, machine learning, (21 more...)

2501.18708

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningJan-30-2025

Beyond Prior Limits: Addressing Distribution Misalignment in Particle Filtering

Shi, Yiwei, Hu, Jingyu, Zhang, Yu, Yang, Mengyue, Zhang, Weinan, Liu, Cunjia, Liu, Weiru

Particle filtering is a Bayesian inference method and a fundamental tool in state estimation for dynamic systems, but its effectiveness is often limited by the constraints of the initial prior distribution, a phenomenon we define as the Prior Boundary Phenomenon. This challenge arises when target states lie outside the prior's support, rendering traditional particle filtering methods inadequate for accurate estimation. Although techniques like unbounded priors and larger particle sets have been proposed, they remain computationally prohibitive and lack adaptability in dynamic scenarios. To systematically overcome these limitations, we propose the Diffusion-Enhanced Particle Filtering Framework, which introduces three key innovations: adaptive diffusion through exploratory particles, entropy-driven regularisation to prevent weight collapse, and kernel-based perturbations for dynamic support expansion. These mechanisms collectively enable particle filtering to explore beyond prior boundaries, ensuring robust state estimation for out-of-boundary targets.

artificial intelligence, machine learning, particle, (18 more...)

2501.18501

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Leicestershire > Loughborough (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

BARNN: A Bayesian Autoregressive and Recurrent Neural Network

Coscia, Dario, Welling, Max, Demo, Nicola, Rozza, Gianluigi

Autoregressive and recurrent networks have achieved remarkable progress across various fields, from weather forecasting to molecular generation and Large Language Models. Despite their strong predictive capabilities, these models lack a rigorous framework for addressing uncertainty, which is key in scientific applications such as PDE solving, molecular generation and Machine Learning Force Fields. To address this shortcoming we present BARNN: a variational Bayesian Autoregressive and Recurrent Neural Network. BARNNs aim to provide a principled way to turn any autoregressive or recurrent model into its Bayesian version. BARNN is based on the variational dropout method, allowing to apply it to large recurrent neural networks as well. We also introduce a temporal version of the "Variational Mixtures of Posteriors" prior (tVAMP-prior) to make Bayesian inference efficient and well-calibrated. Extensive experiments on PDE modelling and molecular generation demonstrate that BARNN not only achieves comparable or superior accuracy compared to existing methods, but also excels in uncertainty quantification and modelling long-range dependencies.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2501.18665

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Huang, Haiyang, Meng, Tianhui, Jia, Weijia

Joint Optimization of Prompt Security and System Performance in Edge-Cloud LLM Systems

Large language models (LLMs) have significantly facilitated human life, and prompt engineering has improved the efficiency of these models. However, recent years have witnessed a rise in prompt engineering-empowered attacks, leading to issues such as privacy leaks, increased latency, and system resource wastage. Though safety fine-tuning based methods with Reinforcement Learning from Human Feedback (RLHF) are proposed to align the LLMs, existing security mechanisms fail to cope with fickle prompt attacks, highlighting the necessity of performing security detection on prompts. In this paper, we jointly consider prompt security, service latency, and system resource optimization in Edge-Cloud LLM (EC-LLM) systems under various prompt attacks. To enhance prompt security, a vector-database-enabled lightweight attack detector is proposed. We formalize the problem of joint prompt detection, latency, and resource optimization into a multi-stage dynamic Bayesian game model. The equilibrium strategy is determined by predicting the number of malicious tasks and updating beliefs at each stage through Bayesian updates. The proposed scheme is evaluated on a real implemented EC-LLM system, and the results demonstrate that our approach offers enhanced security, reduces the service latency for benign users, and decreases system resource consumption compared to state-of-the-art algorithms.

large language model, latency, machine learning, (22 more...)

2501.18663

Country:

Asia > China > Guangdong Province > Zhuhai (0.05)
Asia > China > Beijing > Beijing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.84)

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)

Scherlis, Adam, Belrose, Nora

Estimating the Probability of Sampling a Trained Neural Network at Random

They evaluate simple mass, under a Gaussian or uniform prior, gradient-free learning algorithms, such as the "Guess & of a region in neural network parameter space Check" optimizer which randomly samples parameters until corresponding to a particular behavior, such as it stumbles upon a network that achieves training loss achieving test loss below some threshold. When under some threshold, and find that these methods have the prior is uniform, this problem is equivalent similar generalization behavior to gradient descent, at least to measuring the volume of a region. We show on the very simple tasks they tested. Teney et al. (2024) empirically and theoretically that existing algorithms find that randomly initialized networks represent very simple for estimating volumes in parameter space functions, which would explain the simplicity bias of underestimate the true volume by millions of orders deep learning if SGD behaves similarly to Guess & Check. of magnitude. We find that this error can be dramatically reduced, but not entirely eliminated, Additionally, Mingard et al. (2021) provide evidence that with an importance sampling method using SGD may be an approximate Bayesian sampler, where the gradient information that is already provided prior distribution over functions is equal to the distribution by popular optimizers. The negative logarithm of over functions represented by randomly initialized networks.

artificial intelligence, machine learning, neighborhood, (19 more...)

2501.18812

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Wang, Hanyang, Branke, Juergen, Poloczek, Matthias

Bayesian Optimization with Preference Exploration by Monotonic Neural Network Ensemble

arXiv.org Machine LearningJan-30-2025

In MOO, there is usually not a single optimal solution, but a range of so-called Pareto optimal or non-dominated Many real-world black-box optimization problems solutions with different trade-offs. A widely adopted approach have multiple conflicting objectives. Rather aims to search for a good representation of these than attempting to approximate the entire set of Pareto-optimal solutions by maximizing their hypervolume. Pareto-optimal solutions, interactive preference Two prominent methods stand out in this regard: ParEGO learning, i.e., optimization with a decision maker (Knowles, 2006), which employs random augmented Chebyshev in the loop, allows to focus the search on the scalarizations for optimization in each iteration, and most relevant subset. However, few previous studies expected hypervolume maximization (Yang et al., 2019; have exploited the fact that utility functions Daulton et al., 2020), which directly maximizes the hypervolume are usually monotonic.

artificial intelligence, bayesian optimization, machine learning, (14 more...)

2501.18792

Country:

North America > United States > California > San Mateo County > Menlo Park (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > West Midlands > Coventry (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)