AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.96)

Neural Information Processing Systems, Dec-25-2025, 01:42:00 GMT

Joint Inference for Neural Network Depth and Dropout Regularization

Dropout regularization methods prune a neural network's pre-determined backbone structure to avoid overfitting. However, a deep model still tends to be poorly calibrated, with high confidence on incorrect predictions. We propose a unified Bayesian model selection method to jointly infer the most plausible network depth warranted by data and perform dropout regularization simultaneously. In particular, to infer network depth we define a beta process over the number of hidden layers, which allows it to go to infinity. Layer-wise activation probabilities induced by the beta process modulate neuron activation via binary vectors of a conjugate Bernoulli process. Experiments across domains show that by adapting network depth and dropout regularization to data, our method achieves superior performance compared to state-of-the-art methods, with well-calibrated uncertainty estimates. In continual learning, our method enables neural networks to dynamically evolve their depths to accommodate incrementally available data beyond their initial structures, and alleviates catastrophic forgetting.
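A minimal sketch of how layer-wise activation probabilities could gate neurons, assuming a truncated stick-breaking construction for the beta process (the abstract does not say which construction the paper uses). The names `sample_layer_masks` and `effective_depth`, the truncation level `max_layers`, and the hyperparameters `alpha` and `beta` are hypothetical and chosen only for illustration.

```python
import random


def sample_layer_masks(alpha, beta, max_layers, width, rng):
    """Sample per-layer activation probabilities and binary neuron masks.

    Assumed construction (not taken from the paper): nu_l ~ Beta(alpha, beta),
    pi_l = nu_1 * ... * nu_l, and each neuron in layer l is kept with
    probability pi_l (one Bernoulli draw per neuron).
    """
    pi, masks = [], []
    stick = 1.0
    for _ in range(max_layers):
        stick *= rng.betavariate(alpha, beta)  # pi_l shrinks with depth
        pi.append(stick)
        masks.append([1 if rng.random() < stick else 0 for _ in range(width)])
    return pi, masks


def effective_depth(masks):
    """Count leading layers that still have at least one active neuron."""
    depth = 0
    for row in masks:
        if not any(row):
            break  # a fully inactive layer ends the usable stack
        depth += 1
    return depth


rng = random.Random(0)
pi, masks = sample_layer_masks(alpha=2.0, beta=1.0, max_layers=20, width=16, rng=rng)
print("activation probabilities:", [round(p, 3) for p in pi[:5]])
print("effective depth:", effective_depth(masks))
```

Because the activation probabilities decay multiplicatively, deep layers are switched off with increasing probability, so the truncation level acts only as an upper bound on depth in this sketch.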

joint inference, network depth and dropout regularization, neural network depth, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Neural Information Processing Systems, Dec-24-2025, 05:38:48 GMT

A Probabilistic State Space Model for Joint Inference from Differential Equations and Data

Mechanistic models with differential equations are a key component of scientific applications of machine learning. Inference in such models is usually computationally demanding because it involves repeatedly solving the differential equation. The main problem here is that the numerical solver is hard to combine with standard inference techniques. Recent work in probabilistic numerics has developed a new class of solvers for ordinary differential equations (ODEs) that phrase the solution process directly in terms of Bayesian filtering. We here show that this allows such methods to be combined very directly, with conceptual and numerical ease, with latent force models in the ODE itself. It then becomes possible to perform approximate Bayesian inference on the latent force as well as the ODE solution in a single, linear complexity pass of an extended Kalman filter / smoother -- that is, at the cost of computing a single ODE solution. We demonstrate the expressiveness and performance of the algorithm by training, among others, a non-parametric SIRD model on data from the COVID-19 outbreak.
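The idea of phrasing an ODE solve as Bayesian filtering can be sketched for a scalar ODE x' = f(x). The sketch assumes a once-integrated Wiener process prior on the state (x, x') and a zeroth-order linearization of the ODE residual (often called EK0); both are illustrative choices, not necessarily the paper's configuration. It covers only the ODE-solver building block and omits the latent force and the data updates. The function name `ode_filter` and the diffusion parameter `sigma2` are hypothetical.

```python
import math


def ode_filter(f, x0, t_end, h, sigma2=1.0):
    """Filtering-based solve of x' = f(x) on [0, t_end] with step h.

    Returns a list of (t, posterior mean of x, posterior variance of x).
    """
    m = [x0, f(x0)]                   # initial mean satisfies the ODE exactly
    P = [[0.0, 0.0], [0.0, 0.0]]      # no initial uncertainty
    # Process noise of the integrated Wiener process over one step.
    Q = [[sigma2 * h**3 / 3, sigma2 * h**2 / 2],
         [sigma2 * h**2 / 2, sigma2 * h]]
    out = [(0.0, m[0], P[0][0])]
    n_steps = round(t_end / h)
    for k in range(1, n_steps + 1):
        # Predict with transition matrix A = [[1, h], [0, 1]].
        mp = [m[0] + h * m[1], m[1]]
        p00 = P[0][0] + 2 * h * P[0][1] + h * h * P[1][1] + Q[0][0]
        p01 = P[0][1] + h * P[1][1] + Q[0][1]
        p11 = P[1][1] + Q[1][1]
        # Update on the residual x' - f(x), "observed" to be zero.
        # EK0: the Jacobian of f is ignored, so H = [0, 1].
        r = mp[1] - f(mp[0])
        S = p11
        K = [p01 / S, 1.0]
        m = [mp[0] - K[0] * r, mp[1] - K[1] * r]
        P = [[p00 - K[0] * S * K[0], p01 - K[0] * S * K[1]],
             [p01 - K[1] * S * K[0], p11 - K[1] * S * K[1]]]
        out.append((k * h, m[0], P[0][0]))
    return out


solution = ode_filter(lambda x: -x, x0=1.0, t_end=1.0, h=0.01)
t, mean, var = solution[-1]
print(f"t={t:.2f}  mean={mean:.5f}  exact={math.exp(-1):.5f}  var={var:.2e}")
```

Each step costs a fixed amount of work, so the whole pass is linear in the number of time steps; the returned variance is the solver's own estimate of its numerical uncertainty.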

differential equation, joint inference, probabilistic state space model, (7 more...)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.60)
Health & Medicine > Therapeutic Area > Immunology (0.60)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.78)
Information Technology > Artificial Intelligence > Machine Learning (0.78)

Neural Information Processing Systems, Oct-9-2025, 15:48:38 GMT

Appendix: A Probabilistic State Space Model for Joint Inference from Differential Equations and Data

Appendix A.1 defines the augmented state-space model that formalizes the dynamics of the Gauss-Markov processes introduced in Section 3.1. Appendix A.2 provides the equations for prediction and update steps of the extended Kalman filter in such a setup, which is ... The block-diagonal structure is due to the independent dynamics of the prior processes. In the experiments presented in Sections 5.2 and 5.3 we model the latent contact rate ... This section is concerned with the exact steps that make up the algorithm summarized in Section 3.4. The stochastic differential equation defined in Eq. ... As detailed in Section 3, two different update steps are defined for two kinds of observations.

bundesregierung, differential equation, section 5, (12 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.17)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Nguyen, Duc-An, Colombatto, Clara, Fleming, Steve, Posner, Ingmar, Hawes, Nick, Bhattacharyya, Raunak

Enhancing Joint Human-AI Inference in Robot Missions: A Confidence-Based Approach

arXiv.org Artificial Intelligence, Aug-6-2025

Joint human-AI inference holds immense potential to improve outcomes in human-supervised robot missions. Current-day missions are generally in the AI-assisted setting, where the human operator makes the final inference based on the AI recommendation. However, due to failures in human judgement on when to accept or reject the AI recommendation, complementarity is rarely achieved. We investigate joint human-AI inference where the inference made with higher confidence is selected. Through a user study with N = 100 participants on a representative simulated robot teleoperation task, specifically studying the inference of robots' control delays, we show that: a) joint inference accuracy is higher and its extent is regulated by the confidence calibration of the AI agent, and b) humans change their inferences based on AI recommendations, and the extent and direction of this change is also regulated by the confidence calibration of the AI agent. Interestingly, our results show that pairing a poorly calibrated AI decision support system (AI-DSS) with humans hurts performance instead of helping the team, reiterating the need for AI-based decision support systems with good metacognitive sensitivity. To the best of our knowledge, our study presents the first application of a maximum-confidence-based heuristic for joint human-AI inference within a simulated robot teleoperation task.
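The maximum-confidence heuristic described above can be sketched in a few lines. The `Inference` type, the `joint_inference` function, the example labels, and the tie-breaking rule (ties go to the human) are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Inference:
    label: str         # the inferred answer, e.g. whether a control delay is present
    confidence: float  # self-reported confidence in [0, 1]


def joint_inference(human: Inference, ai: Inference) -> Inference:
    """Select whichever inference was made with higher confidence.

    Ties are resolved in favour of the human (an assumption of this sketch).
    """
    return ai if ai.confidence > human.confidence else human


human = Inference(label="delay", confidence=0.55)
ai = Inference(label="no delay", confidence=0.80)
print(joint_inference(human, ai).label)
```

The rule only helps when confidence tracks correctness: if the AI reports high confidence on wrong answers, it overrides correct human inferences, which is consistent with the abstract's finding that poorly calibrated support hurts team performance.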

ai-dss, artificial intelligence, inference, (14 more...)

2508.03293

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.94)
Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.54)

Neural Information Processing Systems, Jan-19-2025, 10:25:45 GMT

Joint Inference for Neural Network Depth and Dropout Regularization

joint inference, network depth and dropout regularization, neural network depth, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, Oct-10-2024, 21:12:15 GMT

A Probabilistic State Space Model for Joint Inference from Differential Equations and Data

differential equation and data, joint inference, probabilistic state space model, (2 more...)

Industry: Health & Medicine > Therapeutic Area (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning (0.63)

Gul, Mustafa Omer, Artzi, Yoav

CoGen: Learning from Feedback with Coupled Comprehension and Generation

arXiv.org Artificial Intelligence, Aug-28-2024

Systems with both language comprehension and generation capabilities can benefit from the tight connection between the two. This work studies coupling comprehension and generation with a focus on continually learning from interaction with users. We propose techniques to tightly integrate the two capabilities for both learning and inference. We situate our studies in two-player reference games, and deploy various models for thousands of interactions with human users, while learning from interaction feedback signals. We show dramatic improvements in performance over time, with comprehension-generation coupling leading to performance improvements of up to 26% in absolute terms and up to 17% higher accuracies compared to a non-coupled system. Our analysis also shows that coupling has a substantial qualitative impact on the system's language, making it significantly more human-like.

comprehension and generation, interaction, utterance, (15 more...)

2408.15992

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Singapore (0.04)
Oceania > Australia (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment > Games (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Ahmed, Eltayeb, Mincu, Diana, Harrell, Lauren, Heller, Katherine, Roy, Subhrajit

STUDY: Socially Aware Temporally Causal Decoder Recommender Systems

arXiv.org Artificial Intelligence, Sep-5-2023

Recommender systems are widely used to help people find items that are tailored to their interests. These interests are often influenced by social networks, making it important to use social network information effectively in recommender systems. This is especially true for demographic groups with interests that differ from the majority. This paper introduces STUDY, a Socially-aware Temporally caUsal Decoder recommender sYstem. STUDY introduces a new socially-aware recommender system architecture that is significantly more efficient to learn and train than existing methods. STUDY performs joint inference over socially connected groups in a single forward pass of a modified transformer decoder network. We demonstrate the benefits of STUDY in the recommendation of books for students who are dyslexic, or struggling readers. Dyslexic students often have difficulty engaging with reading material, making it critical to recommend books that are tailored to their interests. We worked with our non-profit partner Learning Ally to evaluate STUDY on a dataset of struggling readers. STUDY was able to generate recommendations that more accurately predicted student engagement, when compared with existing methods.

interaction, recommendation, student, (16 more...)

2306.07946

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre:

Instructional Material (1.00)
Overview (0.68)
Research Report > New Finding (0.46)

Industry:

Information Technology > Services (0.87)
Education > Educational Setting > Online (0.68)
Health & Medicine > Therapeutic Area > Neurology (0.54)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence, Apr-11-2023

ADI: Adversarial Dominating Inputs in Vertical Federated Learning Systems

Pang, Qi, Yuan, Yuanyuan, Wang, Shuai, Zheng, Wenting

Vertical federated learning (VFL) has recently become prominent as a way to process data distributed across many individual sources without the need to centralize it. Multiple participants collaboratively train models based on their local data in a privacy-aware manner. To date, VFL has become a de facto solution to securely learn a model among organizations, allowing knowledge to be shared without compromising the privacy of any individual. Despite the prosperous development of VFL systems, we find that certain inputs of a participant, named adversarial dominating inputs (ADIs), can dominate the joint inference towards the direction of the adversary's will and force other (victim) participants to make negligible contributions, losing rewards that are usually offered based on the importance of their contributions in federated learning scenarios. We conduct a systematic study on ADIs by first proving their existence in typical VFL systems. We then propose gradient-based methods to synthesize ADIs of various formats and exploit common VFL systems. We further launch greybox fuzz testing, guided by the saliency score of "victim" participants, to perturb adversary-controlled inputs and systematically explore the VFL attack surface in a privacy-preserving manner. We conduct an in-depth study on the influence of critical parameters and settings in synthesizing ADIs. Our study reveals new VFL attack opportunities, promoting the identification of unknown threats before breaches and building more secure VFL systems.

artificial intelligence, machine learning, participant, (17 more...)

2201.02775

Country:

North America > United States > Virginia (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)