user model
AdaptSSR: Pre-training User Model with Augmentation-Adaptive Self-Supervised Ranking
User modeling, which aims to capture users' characteristics or interests, heavily relies on task-specific labeled data and suffers from the data sparsity issue. Several recent studies tackled this problem by pre-training the user model on massive user behavior sequences with a contrastive learning task. Generally, these methods assume that different views of the same behavior sequence constructed via data augmentation are semantically consistent, i.e., that they reflect similar characteristics or interests of the user, and thus maximize their agreement in the feature space. However, due to the diverse interests and heavy noise in user behaviors, existing augmentation methods tend to lose certain characteristics of the user or introduce noisy behaviors. Thus, forcing the user model to directly maximize the similarity between the augmented views may result in negative transfer.
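The objective the abstract critiques is the standard agreement-maximization setup from contrastive pre-training. Below is a minimal sketch of that conventional objective (an InfoNCE-style loss over two augmented views), not the AdaptSSR method itself; the encoder, dimensions, and temperature are illustrative assumptions.

```python
# Sketch of the conventional contrastive objective: representations of two
# augmented views of the same behavior sequence are pulled together, with
# other users in the batch serving as negatives. Illustrative only.
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z1, z2: (batch, dim) user representations of two augmented views."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature      # pairwise cosine similarities
    labels = torch.arange(z1.size(0))       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage: a stand-in "user encoder" maps behavior features to vectors.
encoder = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(), torch.nn.Linear(64, 64))
view1, view2 = torch.randn(8, 32), torch.randn(8, 32)   # two augmented views per user
loss = info_nce_loss(encoder(view1), encoder(view2))
loss.backward()
```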
Online Clustering of Bandits with Misspecified User Models
The contextual linear bandit is an important online learning problem in which, given arm features, a learning agent selects an arm at each round to maximize the cumulative reward in the long run. A line of work, called clustering of bandits (CB), utilizes the collaborative effect over user preferences and has shown significant improvements over classic linear bandit algorithms. However, existing CB algorithms require well-specified linear user models and can fail when this critical assumption does not hold. Whether robust CB algorithms can be designed for more practical scenarios with misspecified user models remains an open problem. In this paper, we are the first to present the important problem of clustering of bandits with misspecified user models (CBMUM), where the expected rewards in user models can be perturbed away from perfect linear models.
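For context, here is a minimal sketch of the well-specified linear user model that classic contextual bandit algorithms assume and that this paper relaxes: expected reward is theta^T x for arm features x, with LinUCB-style optimistic arm selection. The class name, constants, and update rule are illustrative, not the CBMUM algorithm.

```python
# Illustrative LinUCB-style agent under a perfectly linear reward model.
import numpy as np

class LinUCB:
    def __init__(self, dim: int, alpha: float = 1.0, reg: float = 1.0):
        self.A = reg * np.eye(dim)      # ridge-regularized Gram matrix
        self.b = np.zeros(dim)          # accumulated reward-weighted features
        self.alpha = alpha              # exploration width

    def select(self, arm_features: np.ndarray) -> int:
        """arm_features: (n_arms, dim). Pick the arm with the largest UCB."""
        A_inv = np.linalg.inv(self.A)
        theta_hat = A_inv @ self.b
        bonus = np.sqrt(np.einsum("ai,ij,aj->a", arm_features, A_inv, arm_features))
        return int(np.argmax(arm_features @ theta_hat + self.alpha * bonus))

    def update(self, x: np.ndarray, reward: float) -> None:
        self.A += np.outer(x, x)
        self.b += reward * x

# Toy usage: pick among 5 arms with 4-dimensional features, then update.
agent = LinUCB(dim=4)
arms = np.random.randn(5, 4)
a = agent.select(arms)
agent.update(arms[a], reward=1.0)
```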
More Than Irrational: Modeling Belief-Biased Agents
Zhu, Yifan, Katt, Sammie, Kaski, Samuel
Despite the explosive growth of AI and the technologies built upon it, predicting and inferring the sub-optimal behavior of users or human collaborators remains a critical challenge. In many cases, such behaviors are not a result of irrationality, but rather rational decisions made given inherent cognitive bounds and biased beliefs about the world. In this paper, we formally introduce a class of computational-rational (CR) user models for cognitively-bounded agents acting optimally under biased beliefs. The key novelty lies in explicitly modeling how a bounded memory process leads to a dynamically inconsistent and biased belief state and, consequently, sub-optimal sequential decision-making. We address the challenge of identifying the latent user-specific bound and inferring biased belief states from passive observations on the fly. We argue that for our formalized CR model family with an explicit and parameterized cognitive process, this challenge is tractable. To support our claim, we propose an efficient online inference method based on nested particle filtering that simultaneously tracks the user's latent belief state and estimates the unknown cognitive bound from a stream of observed actions. We validate our approach in a representative navigation task using memory decay as an example of a cognitive bound. With simulations, we show that (1) our CR model generates intuitively plausible behaviors corresponding to different levels of memory capacity, and (2) our inference method accurately and efficiently recovers the ground-truth cognitive bounds from limited observations ($\le 100$ steps). We further demonstrate how this approach provides a principled foundation for developing adaptive AI assistants, enabling assistance that accounts for the user's memory limitations.
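To make the inference idea concrete, here is a simplified, single-level particle-filter sketch: maintain weighted hypotheses over a latent user parameter (here, a memory-decay rate) and reweight them by how well they explain each observed action. The paper's method is a nested particle filter over both belief states and bounds; the observation model and constants below are illustrative stand-ins.

```python
# Simplified particle filter for estimating a latent cognitive bound
# (memory-decay rate) from a stream of observed actions. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def action_likelihood(action: int, decay: float) -> float:
    """Toy observation model p(action | decay): more decay -> less optimal behavior.
    A real CR model would compute the policy under the decayed belief state."""
    p_optimal = 1.0 - 0.5 * decay
    return p_optimal if action == 1 else 1.0 - p_optimal

def estimate_decay(actions, n_particles: int = 500) -> float:
    particles = rng.uniform(0.0, 1.0, size=n_particles)        # hypotheses for the decay rate
    weights = np.full(n_particles, 1.0 / n_particles)
    for a in actions:
        weights *= np.array([action_likelihood(a, d) for d in particles])
        weights /= weights.sum()
        if 1.0 / np.sum(weights ** 2) < n_particles / 2:        # resample when ESS drops
            idx = rng.choice(n_particles, size=n_particles, p=weights)
            particles = particles[idx]
            weights = np.full(n_particles, 1.0 / n_particles)
    return float(np.sum(weights * particles))                   # posterior-mean estimate

# Toy usage: 100 actions simulated from a user with true decay 0.3.
obs = rng.binomial(1, 1.0 - 0.5 * 0.3, size=100)
print(estimate_decay(obs))   # should land near 0.3
```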
- Europe > Finland (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.75)
- (2 more...)
SAGE: A Generic Framework for LLM Safety Evaluation
Jindal, Madhur, Shrawgi, Hari, Agrawal, Parag, Dandapat, Sandipan
As Large Language Models are rapidly deployed across diverse applications, from healthcare to financial advice, safety evaluation struggles to keep pace. Current benchmarks focus on single-turn interactions with generic policies, failing to capture the conversational dynamics of real-world usage and the application-specific harms that emerge in context. Such oversights allow harms to go unnoticed by standard safety benchmarks and other current evaluation methodologies. To address the need for robust AI safety evaluation, we introduce SAGE (Safety AI Generic Evaluation), an automated modular framework designed for customized and dynamic harm evaluations. SAGE employs prompted adversarial agents with diverse personalities based on the Big Five model, enabling system-aware multi-turn conversations that adapt to target applications and harm policies. We evaluate seven state-of-the-art LLMs across three applications and harm policies. Multi-turn experiments show that harm increases with conversation length, that model behavior varies significantly when exposed to different user personalities and scenarios, and that some models minimize harm via high refusal rates that reduce usefulness. We also demonstrate policy sensitivity within a harm category: tightening a child-focused sexual policy substantially increases measured defects across applications. These results motivate adaptive, policy-aware, and context-specific testing for safer real-world deployment.
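The skeleton below illustrates the kind of persona-driven, multi-turn adversarial evaluation loop the abstract describes. It is not the SAGE implementation: `call_llm`, the persona fields, and the judging step are hypothetical placeholders standing in for real model or API calls.

```python
# Illustrative multi-turn adversarial safety probe with a Big Five persona.
from dataclasses import dataclass

@dataclass
class Persona:
    # Big Five traits in [0, 1]; used to condition the adversarial user agent.
    openness: float = 0.5
    conscientiousness: float = 0.5
    extraversion: float = 0.5
    agreeableness: float = 0.5
    neuroticism: float = 0.5

def call_llm(system_prompt: str, history: list[dict]) -> str:
    raise NotImplementedError("placeholder for a real LLM API call")

def run_episode(persona: Persona, harm_policy: str, app_context: str, turns: int = 5) -> list[dict]:
    """Alternate an adversarial 'user' agent and the target system for several turns,
    judging each target reply against the application-specific harm policy."""
    attacker_prompt = f"You are a user with traits {persona}. Probe the assistant for: {harm_policy}"
    target_prompt = f"You are an assistant for: {app_context}"
    history, results = [], []
    for turn in range(1, turns + 1):
        user_msg = call_llm(attacker_prompt, history)
        history.append({"role": "user", "content": user_msg})
        reply = call_llm(target_prompt, history)
        history.append({"role": "assistant", "content": reply})
        verdict = call_llm(f"Judge against policy: {harm_policy}", history)  # per-turn harm judgment
        results.append({"turn": turn, "reply": reply, "verdict": verdict})
    return results
```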
- North America > United States (0.14)
- Asia > India (0.04)
- Asia > China (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Government (1.00)
- Banking & Finance (1.00)
- (3 more...)
A. More Examples of the Semantic Inconsistency Problem
[Figure: augmented behavior sequences of (a) User A and (b) User B. Each picture represents a news article clicked by the user; dashed borders indicate behaviors replaced by the augmentation method. The data augmentation proportion is set to 0.6.]
We also find that the behavior sequence augmented by masking well preserves the user's ...
The pseudo-code of the pre-training procedure with our AdaptSSR is shown in Algorithm 1: randomly select two augmentation operators f and g from A, then encode each augmented view with independently sampled dropout masks.
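A minimal sketch of the augmentation-selection step referenced above: two operators f and g are drawn from an augmentation set A and applied to the same behavior sequence, and each view is encoded with an independently sampled dropout mask. The operators, the masking proportion, and the toy encoder are illustrative stand-ins, not the paper's exact implementation.

```python
# Illustrative augmentation-pair generation with independent dropout masks.
import random
import torch

def mask_behaviors(seq: list[int], proportion: float = 0.6) -> list[int]:
    # Replace a fraction of behaviors with a [MASK] placeholder id (0 here).
    return [0 if random.random() < proportion else b for b in seq]

def crop_behaviors(seq: list[int], proportion: float = 0.6) -> list[int]:
    # Keep a contiguous sub-sequence covering (1 - proportion) of the behaviors.
    keep = max(1, int(len(seq) * (1.0 - proportion)))
    start = random.randrange(0, len(seq) - keep + 1)
    return seq[start:start + keep]

A = [mask_behaviors, crop_behaviors]             # the augmentation operator set
f, g = random.sample(A, 2)                       # randomly select two operators

seq = [3, 17, 42, 8, 25, 11, 30, 5]              # one user's behavior sequence (item ids)
view_f, view_g = f(seq), g(seq)

encoder = torch.nn.Sequential(torch.nn.Embedding(100, 16), torch.nn.Dropout(p=0.1))
encoder.train()                                   # keep dropout active so each forward pass
h_f = encoder(torch.tensor(view_f)).mean(dim=0)   # uses an independently sampled dropout mask
h_g = encoder(torch.tensor(view_g)).mean(dim=0)
```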
The Art of Tool Interface Design
Wu, Yunnan, Chen, Paul, Baranwal, Deshank, Zhou, Jinlong, Yuan, Jian
We present an agentic framework, Thinker, which achieves state-of-the-art performance on challenging reasoning tasks for realistic customer service scenarios that involve complex business logic and human interactions over long horizons. On the $\tau$-bench retail dataset, Thinker achieves an 82.6\% success rate with GPT-4o (version 2024-06-01) (baseline: 68.3\%) and an 81.9\% success rate with Llama-3.1 405B (baseline: 49.6\%), without any fine-tuning. Thinker effectively closes the reasoning-capability gap between the base models by introducing proper structure. The key features of the Thinker framework are: (1) State-Machine Augmented Generation (SMAG), which represents business logic as state machines that the LLM uses as tools; (2) delegation of tasks from the main reasoning loop to LLM-powered tools; and (3) adaptive context management. Our prompting-only solution achieves significant gains while still maintaining a standard agentic architecture with a ReAct-style reasoning loop. The key is to innovate on the tool interface design, as exemplified by SMAG and the LLM-powered tools.
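The sketch below illustrates the SMAG idea of exposing business logic to the model as a state machine tool: the agent can only advance through legal transitions, and the allowed actions are surfaced each turn. It is not the Thinker implementation; the states, transitions, and tool interface are hypothetical examples.

```python
# Illustrative "state machine as a tool" for a toy retail-return flow.
class StateMachineTool:
    def __init__(self):
        self.state = "start"
        # Allowed (state, action) -> next_state transitions.
        self.transitions = {
            ("start", "verify_identity"): "identified",
            ("identified", "locate_order"): "order_found",
            ("order_found", "check_return_policy"): "eligible",
            ("eligible", "issue_refund"): "refunded",
        }

    def available_actions(self) -> list[str]:
        """What the agent may legally do next; surfaced to the LLM each turn."""
        return [action for (state, action) in self.transitions if state == self.state]

    def step(self, action: str) -> str:
        """Tool call: apply an action only if it is legal in the current state."""
        key = (self.state, action)
        if key not in self.transitions:
            return f"Rejected: '{action}' is not allowed in state '{self.state}'."
        self.state = self.transitions[key]
        return f"OK: now in state '{self.state}'."

# The reasoning loop would surface available_actions() to the model and route its
# chosen tool call through step(), keeping the business logic enforceable.
sm = StateMachineTool()
print(sm.available_actions())      # ['verify_identity']
print(sm.step("issue_refund"))     # rejected: not legal yet
print(sm.step("verify_identity"))  # OK: now in state 'identified'
```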
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > Texas > Tarrant County > Fort Worth (0.04)
- North America > United States > Kansas > Cowley County (0.04)
- (3 more...)
- Workflow (0.68)
- Research Report (0.50)
LLM Safety for Children
Rath, Prasanjit, Shrawgi, Hari, Agrawal, Parag, Dandapat, Sandipan
This paper analyzes the safety of Large Language Models (LLMs) in interactions with children below the age of 18. Despite the transformative applications of LLMs in various aspects of children's lives, such as education and therapy, there remains a significant gap in understanding and mitigating potential content harms specific to this demographic. The study acknowledges the diverse nature of children, often overlooked by standard safety evaluations, and proposes a comprehensive approach to evaluating LLM safety specifically for children. We list potential risks that children may encounter when using LLM-powered applications. Additionally, we develop Child User Models that reflect the varied personalities and interests of children, informed by literature in child care and psychology. These user models aim to bridge the existing gap in child safety literature across various fields. We utilize the Child User Models to evaluate the safety of six state-of-the-art LLMs. Our observations reveal significant safety gaps in LLMs, particularly in categories harmful to children but not to adults.
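As a rough illustration of how a Child User Model might be represented when driving an evaluation, here is a small, hypothetical data-structure sketch; the fields, age bands, and prompt template are assumptions, not the paper's artifact.

```python
# Hypothetical Child User Model persona used to seed safety-evaluation prompts.
from dataclasses import dataclass

@dataclass
class ChildUserModel:
    age_band: str            # e.g., "6-8", "9-12", "13-17"
    personality: str         # e.g., "curious", "impulsive", "anxious"
    interests: list[str]     # topics the simulated child asks about
    reading_level: str       # controls vocabulary of generated prompts

    def seed_prompt(self, topic: str) -> str:
        """Turn the persona into a first-turn message for a safety probe."""
        return (f"I am {self.age_band} years old and I love {', '.join(self.interests)}. "
                f"Can you tell me about {topic}?")

# Toy usage: one persona drawn from the hypothetical model space.
child = ChildUserModel("9-12", "curious", ["video games", "space"], "simple")
print(child.seed_prompt("staying up late to play online with strangers"))
```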
- North America > United States > Virginia (0.04)
- North America > United States > New York (0.04)
- North America > United States > California > Ventura County > Thousand Oaks (0.04)
- (4 more...)
- Research Report (1.00)
- Overview (1.00)
- Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.69)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)
- Education (0.68)