worldview


Teaching Inverse Reinforcement Learners via Features and Demonstrations

Luis Haug, Sebastian Tschiatschek, Adish Singla

Neural Information Processing Systems

Learning near-optimal behaviour from an expert's demonstrations typically relies on the assumption that the learner knows the features that the true reward function depends on. In this paper, we study the problem of learning from demonstrations in the setting where this is not the case, i.e., where there is a mismatch between the worldviews of the learner and the expert. We introduce a natural quantity, the teaching risk, which measures the potential suboptimality of policies that look optimal to the learner in this setting. We show that bounds on the teaching risk guarantee that the learner is able to find a near-optimal policy using standard algorithms based on inverse reinforcement learning. Based on these findings, we suggest a teaching scheme in which the expert can decrease the teaching risk by updating the learner's worldview, and thus ultimately enable her to find a near-optimal policy.
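
The abstract does not spell out how the teaching risk is computed. A minimal sketch, assuming the learner observes the true features only through a linear worldview map A and the true reward is linear with weights w_true (both names illustrative, not the paper's notation), measures how much of the reward direction falls in the learner's blind subspace:

```python
import numpy as np

def teaching_risk(A: np.ndarray, w_true: np.ndarray) -> float:
    """Illustrative teaching-risk proxy (an assumption, not the paper's exact definition).

    A      : (k, d) learner worldview map; the learner only observes A @ x.
    w_true : (d,)  true reward weights over the full feature space.

    Returns the fraction of w_true that is invisible to the learner, i.e. the
    relative norm of its projection onto the null space of A. Zero means the
    learner's features capture the reward direction exactly; values near one
    mean most reward-relevant structure is unobservable.
    """
    _, s, vt = np.linalg.svd(A, full_matrices=False)
    row_basis = vt[s > 1e-10]                      # orthonormal basis of the row space of A
    visible = row_basis.T @ (row_basis @ w_true)   # component the learner can perceive
    blind = w_true - visible                       # component in the learner's blind subspace
    return float(np.linalg.norm(blind) / np.linalg.norm(w_true))


# Example: three true features, but the learner's worldview only exposes the first two.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
w_true = np.array([0.6, 0.2, 0.77])
print(teaching_risk(A, w_true))   # ~0.77 of the reward direction is invisible
```

Under this reading, the teaching scheme described in the abstract amounts to the expert appending informative rows to A, which can only shrink the blind component and hence the risk.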


Self-evolving expertise in complex non-verifiable subject domains: dialogue as implicit meta-RL

Bailey, Richard M.

arXiv.org Artificial Intelligence

So-called `wicked problems', those involving complex multi-dimensional settings, non-verifiable outcomes, heterogeneous impacts and a lack of single objectively correct answers, have plagued humans throughout history. Modern examples include decisions over justice frameworks, solving environmental pollution, planning for pandemic resilience and food security. The use of state-of-the-art artificial intelligence systems (notably Large Language Model-based agents) collaborating with humans on solving such problems is being actively explored. While the abilities of LLMs can be improved by, for example, fine-tuning, hand-crafted system prompts and scaffolding with external tools, LLMs lack endogenous mechanisms to develop expertise through experience in such settings. This work addresses this gap with Dialectica, a framework where agents engage in structured dialogue on defined topics, augmented by memory, self-reflection, and policy-constrained context editing. Formally, discussion is viewed as an implicit meta-reinforcement learning process. The `dialogue-trained' agents are evaluated post-hoc using judged pairwise comparisons of elicited responses. Across two model architectures (locally run Qwen3:30b and OpenAI's o4-mini) results show that enabling reflection-based context editing during discussion produces agents which dominate their baseline counterparts on Elo scores, normalized Bradley-Terry-Davidson ability, and AlphaRank mass. The predicted signatures of learning are observed qualitatively in statement and reflection logs, where reflections identify weaknesses and reliably shape subsequent statements. Agreement between quantitative and qualitative evidence supports dialogue-driven context evolution as a practical path to targeted expertise amplification in open non-verifiable domains.
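
The Elo scores referenced above come from judged pairwise comparisons. As a rough illustration of that evaluation step, a standard Elo update over judge verdicts is sketched below; the paper's actual K-factor, initial rating, and pairing schedule are not given here and are assumptions.

```python
from collections import defaultdict

def run_elo(matches, k=32.0, base=1000.0):
    """Standard Elo over judged pairwise comparisons (illustrative sketch only).

    matches: iterable of (agent_a, agent_b, score_a), where score_a is
             1.0 if the judge preferred a's response, 0.0 if b's, 0.5 for a tie.
    """
    rating = defaultdict(lambda: base)
    for a, b, score_a in matches:
        expected_a = 1.0 / (1.0 + 10 ** ((rating[b] - rating[a]) / 400.0))
        rating[a] += k * (score_a - expected_a)
        rating[b] += k * ((1.0 - score_a) - (1.0 - expected_a))
    return dict(rating)


# Example: a dialogue-trained agent judged against its frozen baseline.
matches = [("dialogue_trained", "baseline", 1.0),
           ("dialogue_trained", "baseline", 1.0),
           ("dialogue_trained", "baseline", 0.5),
           ("baseline", "dialogue_trained", 0.0)]
print(run_elo(matches))
```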


PrimeX: A Dataset of Worldview, Opinion, and Explanation

Koncel-Kedziorski, Rik, Joshi, Brihi, Paek, Tim

arXiv.org Artificial Intelligence

As the adoption of language models advances, so does the need to better represent individual users to the model. Are there aspects of an individual's belief system that a language model can utilize for improved alignment? Following prior research, we investigate this question in the domain of opinion prediction by developing PrimeX, a dataset of public opinion survey data from 858 US residents with two additional sources of belief information: written explanations from the respondents for why they hold specific opinions, and the Primal World Belief survey for assessing respondent worldview. We provide an extensive initial analysis of our data and show the value of belief explanations and worldview for personalizing language models. Our results demonstrate how the additional belief information in PrimeX can benefit both the NLP and psychological research communities, opening up avenues for further study.
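
As an illustration of how the two extra belief signals might be used to personalize an opinion-prediction prompt, the sketch below assembles a prompt from a hypothetical respondent record. The field names, the Respondent class, and the prompting protocol are assumptions for illustration, not the released PrimeX schema or the paper's experimental setup.

```python
from dataclasses import dataclass

@dataclass
class Respondent:
    # Hypothetical record layout; the released PrimeX schema may differ.
    opinions: dict[str, str]          # survey question id -> chosen answer
    explanations: dict[str, str]      # question id -> free-text rationale
    primals: dict[str, float]         # Primal World Belief scores, e.g. {"good": 3.8}

def build_personalization_prompt(r: Respondent, target_question: str, k: int = 3) -> str:
    """Condition a language model on a respondent's worldview and past explanations
    before asking it to predict their opinion on a held-out question
    (a minimal sketch of one possible use of such data, not the paper's protocol)."""
    primal_summary = ", ".join(f"{name}: {score:.1f}/5" for name, score in r.primals.items())
    shots = list(r.explanations.items())[:k]
    examples = "\n".join(f"- On '{q}': {why}" for q, why in shots)
    return (
        f"Primal world beliefs of this person: {primal_summary}\n"
        f"How they explained some of their opinions:\n{examples}\n\n"
        f"Question: {target_question}\n"
        f"Predict the answer this person would give."
    )


# Example usage with made-up survey content.
r = Respondent(
    opinions={"q1": "agree"},
    explanations={"q1": "I think communities should decide this locally."},
    primals={"good": 3.8, "safe": 2.9},
)
print(build_personalization_prompt(r, "Should voting be mandatory?"))
```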


Autoregressive End-to-End Planning with Time-Invariant Spatial Alignment and Multi-Objective Policy Refinement

Zhao, Jianbo, Ban, Taiyu, Li, Xiangjie, Gui, Xingtai, Zhou, Hangning, Liu, Lei, Zhao, Hongwei, Li, Bin

arXiv.org Artificial Intelligence

The inherent sequential modeling capabilities of autoregressive models make them a formidable baseline for end-to-end planning in autonomous driving. Nevertheless, their performance is constrained by a spatio-temporal misalignment, as the planner must condition future actions on past sensory data. This creates an inconsistent worldview, limiting the upper bound of performance for an otherwise powerful approach. To address this, we propose a Time-Invariant Spatial Alignment (TISA) module that learns to project initial environmental features into a consistent ego-centric frame for each future time step, effectively correcting the agent's worldview without explicit future scene prediction. In addition, we employ a kinematic action prediction head (i.e., acceleration and yaw rate) to ensure physically feasible trajectories. Finally, we introduce a multi-objective post-training stage using Direct Preference Optimization (DPO) to move beyond pure imitation. Our approach provides targeted feedback on specific driving behaviors, offering a more fine-grained learning signal than the single, overall objective used in standard DPO. Our model achieves a state-of-the-art 89.8 PDMS on the NAVSIM dataset among autoregressive models. The video document is available at https://tisa-dpo-e2e.github.io/.
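
To see why predicting kinematic actions (acceleration and yaw rate) yields physically feasible trajectories, the sketch below integrates such actions with a simple unicycle model. The paper's actual vehicle model, time step, and integration scheme are not specified here; all of them are assumptions.

```python
import math

def unroll_kinematic_actions(actions, x=0.0, y=0.0, heading=0.0, speed=0.0, dt=0.1):
    """Integrate predicted (acceleration, yaw_rate) actions into an ego trajectory
    with a simple unicycle model (illustrative sketch, not the paper's head).

    actions: list of (acceleration [m/s^2], yaw_rate [rad/s]) per future step.
    Returns a list of (x, y, heading, speed) states.
    """
    states = []
    for accel, yaw_rate in actions:
        speed = max(0.0, speed + accel * dt)   # no reversing in this sketch
        heading += yaw_rate * dt
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        states.append((x, y, heading, speed))
    return states


# Example: accelerate gently while turning left over a 2-second horizon.
plan = unroll_kinematic_actions([(1.0, 0.1)] * 20, speed=5.0)
print(plan[-1])   # final pose and speed after the 20-step rollout
```

Because every state follows from the previous one through this kind of motion model, the planner cannot emit kinematically impossible jumps, which is the point of predicting accelerations and yaw rates rather than raw waypoints.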


Don't Change My View: Ideological Bias Auditing in Large Language Models

Kröger, Paul, Barkett, Emilio

arXiv.org Artificial Intelligence

As large language models (LLMs) become increasingly embedded in products used by millions, their outputs may influence individual beliefs and, cumulatively, shape public opinion. If the behavior of LLMs can be intentionally steered toward specific ideological positions, such as political or religious views, then those who control these systems could gain disproportionate influence over public discourse. Although it remains an open question whether LLMs can reliably be guided toward coherent ideological stances and whether such steering can be effectively prevented, a crucial first step is to develop methods for detecting when such steering attempts occur. In this work, we adapt a previously proposed statistical method to the new context of ideological bias auditing. Our approach carries over the model-agnostic design of the original framework, which does not require access to the internals of the language model. Instead, it identifies potential ideological steering by analyzing distributional shifts in model outputs across prompts that are thematically related to a chosen topic. This design makes the method particularly suitable for auditing proprietary black-box systems. We validate our approach through a series of experiments, demonstrating its practical applicability and its potential to support independent post hoc audits of LLM behavior.
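
One way to operationalize "distributional shifts in model outputs across thematically related prompts", assuming each response is mapped to a scalar stance score and that outputs from an unsteered reference are available for comparison, is a two-sample permutation test. The paper adapts a different, previously proposed statistical method, so the sketch below only illustrates the general idea; both assumptions are mine, not the authors'.

```python
import numpy as np

def permutation_shift_test(ref_scores, audit_scores, n_perm=10_000, seed=0):
    """Two-sample permutation test on mean stance scores (illustrative sketch).

    ref_scores   : stance scores of an unsteered reference on topic-related prompts.
    audit_scores : stance scores of the audited model on the same prompts.
    Returns the observed shift in means and an approximate p-value.
    """
    rng = np.random.default_rng(seed)
    ref = np.asarray(ref_scores, dtype=float)
    aud = np.asarray(audit_scores, dtype=float)
    observed = abs(aud.mean() - ref.mean())
    pooled = np.concatenate([ref, aud])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(pooled[len(ref):].mean() - pooled[:len(ref)].mean())
        count += diff >= observed
    return observed, (count + 1) / (n_perm + 1)


# Example: stance scores in [-1, 1] for 40 thematically related prompts each.
ref = np.random.default_rng(1).normal(0.0, 0.3, 40)
aud = np.random.default_rng(2).normal(0.4, 0.3, 40)   # looks systematically steered
print(permutation_shift_test(ref, aud))
```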


Why Trump's order targeting 'woke' AI may be impossible to follow

New Scientist

President Donald Trump wants to ensure the US government only gives federal contracts to artificial intelligence developers whose systems are "free from ideological bias". But the new requirements could allow his administration to impose its own worldview on tech companies' AI models – and companies may face significant challenges and risks in trying to modify their models to comply. "The suggestion that government contracts should be structured to ensure AI systems are 'objective' and 'free from top-down ideological bias' prompts the question: objective according to whom?" says Becca Branum at the Center for Democracy & Technology, a public policy non-profit in Washington DC. The Trump White House's AI Action Plan, released on 23 July, recommends updating federal guidelines "to ensure that the government only contracts with frontier large language model (LLM) developers who ensure that their systems are objective and free from top-down ideological bias". Trump signed a related executive order titled "Preventing Woke AI in the Federal Government" on the same day.


AI for Just Work: Constructing Diverse Imaginations of AI beyond "Replacing Humans"

Jin, Weina, Vincent, Nicholas, Hamarneh, Ghassan

arXiv.org Artificial Intelligence

The AI community usually focuses on "how" to develop AI techniques, but lacks thorough open discussions on "why" we develop AI. Lacking critical reflections on the general visions and purposes of AI may make the community vulnerable to manipulation. In this position paper, we explore the "why" question of AI. We denote answers to the "why" question the imaginations of AI, which depict our general visions, frames, and mindsets for the prospects of AI. We identify that the prevailing vision in the AI community is largely a monoculture that emphasizes objectives such as replacing humans and improving productivity. Our critical examination of this mainstream imagination highlights its underpinning and potentially unjust assumptions. We then call to diversify our collective imaginations of AI, embedding ethical assumptions from the outset in the imaginations of AI. To facilitate the community's pursuit of diverse imaginations, we demonstrate one process for constructing a new imagination of "AI for just work," and showcase its application in the medical image synthesis task to make it more ethical. We hope this work will help the AI community to open dialogues with civil society on the visions and purposes of AI, and inspire more technical works and advocacy in pursuit of diverse and ethical imaginations to restore the value of AI for the public good.