AITopics | belief distribution

belief distribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Best of Both Worlds in Network Population Games: Reaching Consensus & Convergence to Equilibrium

Neural Information Processing Systems, Feb-17-2026, 23:01:28 GMT

Reaching consensus and convergence to equilibrium are two major challenges of multi-agent systems. Although each has attracted significant attention, relatively few studies address both challenges at the same time. This paper examines the connection between the notions of consensus and equilibrium in a multi-agent system where multiple interacting sub-populations coexist. We argue that consensus can be seen as an intricate component of intra-population stability, whereas equilibrium can be seen as encoding inter-population stability. We show that smooth fictitious play, a well-known learning model in game theory, can achieve both consensus and convergence to equilibrium in diverse multi-agent settings. Moreover, we show that the consensus formation process plays a crucial role in the seminal thorny problem of equilibrium selection in multi-agent learning.
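The abstract names smooth fictitious play as the learning model. As a rough illustration only, here is a minimal two-player sketch in which each player tracks the empirical frequency of the opponent's actions (its belief) and plays a softmax (logit) response to it. The network population setting of the paper is not modeled here; the coordination game, the temperature, the step count, and all function names are assumptions made for this example.

```python
import math
import random


def logit_response(payoffs, belief, temperature):
    """Smoothed best response: softmax over expected payoffs under `belief`."""
    expected = [sum(u * b for u, b in zip(row, belief)) for row in payoffs]
    top = max(expected)  # subtract the max for numerical stability
    weights = [math.exp((u - top) / temperature) for u in expected]
    total = sum(weights)
    return [w / total for w in weights]


def normalize(counts):
    """Turn action counts into a probability vector (the belief)."""
    total = sum(counts)
    return [c / total for c in counts]


def smooth_fictitious_play(payoff_a, payoff_b, steps=2000, temperature=0.1, seed=0):
    """payoff_x[i][j]: payoff to player x for own action i against opponent action j."""
    rng = random.Random(seed)
    # Pseudo-counts of 1 give each player a uniform initial belief.
    seen_by_a = [1.0] * len(payoff_b)  # A's counts of B's past actions
    seen_by_b = [1.0] * len(payoff_a)  # B's counts of A's past actions
    for _ in range(steps):
        strategy_a = logit_response(payoff_a, normalize(seen_by_a), temperature)
        strategy_b = logit_response(payoff_b, normalize(seen_by_b), temperature)
        action_a = rng.choices(range(len(strategy_a)), weights=strategy_a)[0]
        action_b = rng.choices(range(len(strategy_b)), weights=strategy_b)[0]
        seen_by_b[action_a] += 1.0
        seen_by_a[action_b] += 1.0
    return normalize(seen_by_a), normalize(seen_by_b)


# Hypothetical 2x2 coordination game: matching on action 0 pays most.
COORDINATION = [[2.0, 0.0], [0.0, 1.0]]
belief_a, belief_b = smooth_fictitious_play(COORDINATION, COORDINATION)
```

In this toy game both beliefs concentrate on the same action, which gives a small-scale picture of beliefs agreeing while play settles near an equilibrium.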

agent, artificial intelligence, equilibrium, (16 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
North America > United States > Massachusetts (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.66)

Industry: Leisure & Entertainment > Games (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Heuristic Transformer: Belief Augmented In-Context Reinforcement Learning

Dippel, Oliver, Lisitsa, Alexei, Peng, Bei

arXiv.org Artificial Intelligence, Nov-14-2025

Transformers have demonstrated exceptional in-context learning (ICL) capabilities, enabling applications across natural language processing, computer vision, and sequential decision-making. In reinforcement learning, ICL reframes learning as a supervised problem, facilitating task adaptation without parameter updates. Building on prior work leveraging transformers for sequential decision-making, we propose Heuristic Transformer (HT), an in-context reinforcement learning (ICRL) approach that augments the in-context dataset with a belief distribution over rewards to achieve better decision-making. Using a variational auto-encoder (VAE), a low-dimensional stochastic variable is learned to represent the posterior distribution over rewards, which is incorporated alongside an in-context dataset and query states as prompt to the transformer policy. We assess the performance of HT across the Darkroom, Miniworld, and MuJoCo environments, showing that it consistently surpasses comparable baselines in terms of both effectiveness and generalization. Our method presents a promising direction to bridge the gap between belief-based augmentations and transformer-based decision-making.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2511.10251

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

MDPs with a State Sensing Cost

Kapoor, Vansh, Nair, Jayakrishnan

arXiv.org Artificial Intelligence, Oct-30-2025

In many practical sequential decision-making problems, tracking the state of the environment incurs a sensing/communication/computation cost. In these settings, the agent's interaction with its environment includes the additional component of deciding when to sense the state, in a manner that balances the value associated with optimal (state-specific) actions and the cost of sensing. We formulate this as an expected discounted cost Markov Decision Process (MDP), wherein the agent incurs an additional cost for sensing its next state, but has the option to take actions while remaining 'blind' to the system state. We pose this problem as a classical discounted cost MDP with an expanded (countably infinite) state space. While computing the optimal policy for this MDP is intractable in general, we derive lower bounds on the optimal value function, which allow us to bound the suboptimality gap of any policy. We also propose a computationally efficient algorithm, SPI, based on policy improvement, which in practice performs close to the optimal policy. Finally, we benchmark against the state-of-the-art via a numerical case study.
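To make the 'blind' mode concrete: while the agent does not sense, what it knows about the state is a distribution determined by the last sensed state and the actions taken since. The sketch below propagates that distribution through the transition matrices. It is only an illustration of this standard Markov-chain fact, not the paper's construction or its SPI algorithm; the two-state chain, the action names, and the function names are hypothetical.

```python
def propagate_belief(belief, transition):
    """One blind step: new_belief[t] = sum over s of belief[s] * transition[s][t]."""
    n_next = len(transition[0])
    return [
        sum(belief[s] * transition[s][t] for s in range(len(belief)))
        for t in range(n_next)
    ]


def blind_belief(last_sensed_state, actions, transitions):
    """Belief over states after taking `actions` without sensing."""
    n_states = len(next(iter(transitions.values())))
    belief = [0.0] * n_states
    belief[last_sensed_state] = 1.0  # the state was known exactly when last sensed
    for action in actions:
        belief = propagate_belief(belief, transitions[action])
    return belief


# Hypothetical two-state chain with two actions.
TRANSITIONS = {
    "stay": [[0.9, 0.1], [0.2, 0.8]],
    "flip": [[0.1, 0.9], [0.9, 0.1]],
}
belief = blind_belief(0, ["stay", "stay"], TRANSITIONS)
```

Here two blind "stay" steps from state 0 give a belief of about [0.83, 0.17]. The pair (last sensed state, actions since) grows without bound the longer the agent stays blind, which is one way to read the countably infinite expanded state space mentioned in the abstract.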

artificial intelligence, machine learning, reinforcement learning, (21 more...)

arXiv.org Artificial Intelligence

2505.0328

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)

ESCORT: Efficient Stein-variational and Sliced Consistency-Optimized Temporal Belief Representation for POMDPs

Zhang, Yunuo, Luo, Baiting, Mukhopadhyay, Ayan, Karsai, Gabor, Dubey, Abhishek

arXiv.org Artificial Intelligence, Oct-27-2025

In Partially Observable Markov Decision Processes (POMDPs), maintaining and updating belief distributions over possible underlying states provides a principled way to summarize action-observation history for effective decision-making under uncertainty. As environments grow more realistic, belief distributions develop complexity that standard mathematical models cannot accurately capture, creating a fundamental challenge in maintaining representational accuracy. Despite advances in deep learning and probabilistic modeling, existing POMDP belief approximation methods fail to accurately represent complex uncertainty structures such as high-dimensional, multi-modal belief distributions, resulting in estimation errors that lead to suboptimal agent behaviors. To address this challenge, we present ESCORT (Efficient Stein-variational and sliced Consistency-Optimized Representation for Temporal beliefs), a particle-based framework for capturing complex, multi-modal distributions in high-dimensional belief spaces. ESCORT extends Stein Variational Gradient Descent (SVGD) with two key innovations: correlation-aware projections that model dependencies between state dimensions, and temporal consistency constraints that stabilize updates while preserving correlation structures. This approach retains SVGD's attractive-repulsive particle dynamics while enabling accurate modeling of intricate correlation patterns. Unlike particle filters prone to degeneracy or parametric methods with fixed representational capacity, ESCORT dynamically adapts to belief landscape complexity without resampling or restrictive distributional assumptions. We demonstrate ESCORT's effectiveness through extensive evaluations on both POMDP domains and synthetic multi-modal distributions of varying dimensionality, where it consistently outperforms state-of-the-art methods in terms of belief approximation accuracy and downstream decision quality.
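The attractive-repulsive particle dynamics mentioned in the abstract come from plain SVGD, which can be sketched in one dimension as follows. This shows only the base SVGD update with an RBF kernel; ESCORT's correlation-aware projections and temporal consistency constraints are not reproduced. The standard normal target, the fixed bandwidth, the step size, and the function names are assumptions chosen for the example.

```python
import math


def svgd_step(particles, grad_log_p, step_size=0.1, bandwidth=1.0):
    """One SVGD update for 1-D particles with an RBF kernel."""
    n = len(particles)
    updated = []
    for x_i in particles:
        phi = 0.0
        for x_j in particles:
            k = math.exp(-((x_j - x_i) ** 2) / (2.0 * bandwidth ** 2))
            attract = k * grad_log_p(x_j)             # pulls toward high density
            repel = k * (x_i - x_j) / bandwidth ** 2  # gradient of k w.r.t. x_j
            phi += attract + repel
        updated.append(x_i + step_size * phi / n)
    return updated


def standard_normal_score(x):
    """Gradient of the log-density of N(0, 1)."""
    return -x


# Start all particles far from the target, bunched between 4.0 and 4.9.
particles = [4.0 + 0.1 * i for i in range(10)]
for _ in range(1000):
    particles = svgd_step(particles, standard_normal_score)

mean = sum(particles) / len(particles)
spread = max(particles) - min(particles)
```

The attraction term moves the particles toward the mode, while the repulsion term keeps them from collapsing onto it, so no resampling step is needed to maintain diversity.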

artificial intelligence, correlation structure, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2510.21107

Country: North America > United States (0.67)

Genre: Research Report > Promising Solution (0.34)

Industry:

Government (0.46)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Probabilistic Modeling of Intentions in Socially Intelligent LLM Agents

Xia, Feifan, Fang, Yuyang, Li, Defang, Xie, Yantong, Li, Weikang, Li, Yang, Xia, Deguo, Huang, Jizhou

arXiv.org Artificial Intelligence, Oct-22-2025

We present a probabilistic intent modeling framework for large language model (LLM) agents in multi-turn social dialogue. The framework maintains a belief distribution over a partner's latent intentions, initialized from contextual priors and dynamically updated through likelihood estimation after each utterance. The evolving distribution provides additional contextual grounding for the policy, enabling adaptive dialogue strategies under uncertainty. Preliminary experiments in the SOTOPIA environment show consistent improvements: the proposed framework increases the Overall score by 9.0% on SOTOPIA-All and 4.1% on SOTOPIA-Hard compared with the Qwen2.5-7B baseline, and slightly surpasses an oracle agent that directly observes partner intentions. These early results suggest that probabilistic intent modeling can contribute to the development of socially intelligent LLM agents.
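The belief update described in the abstract (a prior over latent intentions, revised by a likelihood after each utterance) follows the usual Bayes rule. A minimal sketch is given below. The intent labels and likelihood values are invented for illustration, and how the paper actually estimates likelihoods is not stated in the abstract.

```python
def update_belief(prior, likelihood):
    """Bayes rule over a discrete set of intents: posterior is prior * likelihood, renormalized."""
    unnormalized = {
        intent: p * likelihood.get(intent, 0.0) for intent, p in prior.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        # The utterance is impossible under every intent: keep the prior unchanged.
        return dict(prior)
    return {intent: value / total for intent, value in unnormalized.items()}


# Hypothetical intents and per-utterance likelihoods P(utterance | intent).
prior = {"cooperate": 0.5, "compete": 0.3, "deceive": 0.2}
utterance_likelihood = {"cooperate": 0.8, "compete": 0.1, "deceive": 0.1}
posterior = update_belief(prior, utterance_likelihood)
```

Calling `update_belief` once per utterance, with the previous posterior as the new prior, yields the evolving distribution that the abstract describes as extra context for the policy.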

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2510.18476

Genre: Research Report > New Finding (0.49)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Active Tactile Exploration for Rigid Body Pose and Shape Estimation

Gordon, Ethan K., Baraki, Bruke, Bui, Hien, Posa, Michael

arXiv.org Artificial Intelligence, Oct-16-2025

General robot manipulation requires the handling of previously unseen objects. Learning a physically accurate model at test time can provide significant benefits in data efficiency, predictability, and reuse between tasks. Tactile sensing can complement vision with its robustness to occlusion, but its temporal sparsity necessitates careful online exploration to maintain data efficiency. Direct contact can also cause an unrestrained object to move, requiring both shape and location estimation. In this work, we propose a learning and exploration framework that uses only tactile data to simultaneously determine the shape and location of rigid objects with minimal robot motion. We build on recent advances in contact-rich system identification to formulate a loss function that penalizes physical constraint violation without introducing the numerical stiffness inherent in rigid-body contact. Optimizing this loss, we can learn cuboid and convex polyhedral geometries with less than 10s of randomly collected data after first contact. Our exploration scheme seeks to maximize Expected Information Gain and results in significantly faster learning in both simulated and real-robot experiments. More information can be found at https://dairlab.github.io/activetactile
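Expected Information Gain can be illustrated with a small discrete example: for a candidate probing action, it is the prior entropy minus the expected posterior entropy over the possible observations. The paper works with continuous pose and shape parameters, so the sketch below is only a simplified analogue; the two hypotheses, the probe names, and the likelihood tables are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def expected_information_gain(prior, likelihoods):
    """likelihoods[o][h] = P(observation o | hypothesis h) for one candidate action."""
    prior_entropy = entropy(prior)
    gain = 0.0
    for obs_given_h in likelihoods:
        joint = [l * p for l, p in zip(obs_given_h, prior)]
        p_obs = sum(joint)
        if p_obs == 0.0:
            continue  # this observation can never occur
        posterior = [j / p_obs for j in joint]
        gain += p_obs * (prior_entropy - entropy(posterior))
    return gain


# Two equally likely object hypotheses and two candidate probes.
prior = [0.5, 0.5]
probe_edge = [[1.0, 0.0], [0.0, 1.0]]    # contact happens only under hypothesis 0
probe_center = [[0.5, 0.5], [0.5, 0.5]]  # outcome is independent of the hypothesis

eig_edge = expected_information_gain(prior, probe_edge)
eig_center = expected_information_gain(prior, probe_center)
best_probe = "edge" if eig_edge > eig_center else "center"
```

An exploration scheme of this kind would pick the probe with the larger value, here the edge probe, because its outcome fully separates the two hypotheses.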

artificial intelligence, constraint-based reasoning, information, (15 more...)

arXiv.org Artificial Intelligence

2510.13595

Country: North America > United States (0.46)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.34)

The Best of Both Worlds in Network Population Games: Reaching Consensus & Convergence to Equilibrium

Neural Information Processing Systems, Oct-9-2025, 11:46:03 GMT

agent, artificial intelligence, equilibrium, (16 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
North America > United States > Massachusetts (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.66)

Industry: Leisure & Entertainment > Games (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Persuasiveness and Bias in LLM: Investigating the Impact of Persuasiveness and Reinforcement of Bias in Language Models

Roy, Saumya

arXiv.org Artificial Intelligence, Aug-25-2025

Warning: This research studies AI persuasion and bias amplification that could be misused; all experiments are for safety evaluation. Large Language Models (LLMs) now generate convincing, human-like text and are widely used in content creation, decision support, and user interactions. Yet the same systems can spread information or misinformation at scale and reflect social biases that arise from data, architecture, or training choices. This work examines how persuasion and bias interact in LLMs, focusing on how imperfect or skewed outputs affect persuasive impact. Specifically, we test whether persona-based models can persuade with fact-based claims while also, unintentionally, promoting misinformation or biased narratives. We introduce a convincer-skeptic framework: LLMs adopt personas to simulate realistic attitudes. Skeptic models serve as human proxies; we compare their beliefs before and after exposure to arguments from convincer models. Persuasion is quantified with Jensen-Shannon divergence over belief distributions. We then ask how much persuaded entities go on to reinforce and amplify biased beliefs across race, gender, and religion. Strong persuaders are further probed for bias using sycophantic adversarial prompts and judged with additional models. Our findings show both promise and risk. LLMs can shape narratives, adapt tone, and mirror audience values across domains such as psychology, marketing, and legal assistance. But the same capacity can be weaponized to automate misinformation or craft messages that exploit cognitive biases, reinforcing stereotypes and widening inequities. The core danger lies in misuse more than in occasional model mistakes. By measuring persuasive power and bias reinforcement, we argue for guardrails and policies that penalize deceptive use and support alignment, value-sensitive design, and trustworthy deployment.
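The abstract quantifies persuasion with the Jensen-Shannon divergence between belief distributions before and after exposure to an argument. A minimal sketch of that measure is shown below, using base-2 logarithms so that the value lies between 0 and 1. The stance categories and the before/after numbers are made up for illustration and do not come from the paper.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p[i] == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric, bounded divergence between two discrete distributions."""
    mixture = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Hypothetical skeptic beliefs over (disagree, neutral, agree).
belief_before = [0.7, 0.2, 0.1]
belief_after = [0.2, 0.3, 0.5]
persuasion_score = jensen_shannon_divergence(belief_before, belief_after)
```

A score of 0 means the skeptic's beliefs did not move, and a score of 1 means the before and after distributions share no support.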

belief distribution, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2508.15798

Country:

Asia (1.00)
Europe (0.92)
North America > United States (0.45)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government (1.00)
Banking & Finance (1.00)
Media > News (0.74)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Additional Experimental Results

Neural Information Processing Systems, Aug-15-2025, 00:11:51 GMT

Reward curves for TOP-RAD and RAD on pixel-based tasks from the DM Control Suite are shown in Figure 7 (results across 10 seeds for DM Control tasks). Each individual run was performed on a single GPU and lasted between 3 and 18 hours, depending on the task and GPU model. The procedures for updating the critics and the actor for TOP-TD3 are described in detail in Algorithm 2 (UpdateCritics) and Algorithm 3. In order to enable adaptation, we make use of an approach inspired by recent results in the model selection for contextual bandits literature. Unlike in standard bandit problems, the "arm" choices in the model selection setting are not stationary arms, but learning algorithms. The objective is to choose, in an online manner, the best algorithm for the task at hand. In Figure 5 (Ant-v2), we show this to be the case.

additional experimental result, algorithm, parameter 1, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.05)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)
