AITopics

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Neural Information Processing Systems, Feb-9-2026, 16:55:25 GMT

Planning with General Objective Functions: Going Beyond Total Rewards

Note that in this simple example, the state transition function T and the reward function r still satisfy the Markov property.

artificial intelligence, machine learning, reward value, (17 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-9-2026, 16:55:18 GMT

Planning with General Objective Functions: Going Beyond Total Rewards

O((|S||A| + T) H (log(1/ε)/ε)). It is also easy V( , ) and Q( , , ) obtained algorithm.

artificial intelligence, machine learning, neural information processing system, (11 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
Asia > Middle East > Jordan (0.05)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.32)

Neural Information Processing Systems, Oct-9-2025, 16:02:38 GMT

7b24015f3af598e1d9179f6e06353780-Paper-Conference.pdf

large language model, logic & formal reasoning, machine learning, (22 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Robots (0.68)
(4 more...)

arXiv.org Artificial Intelligence, May-8-2025

Proceedings of 1st Workshop on Advancing Artificial Intelligence through Theory of Mind

Abrini, Mouad, Abend, Omri, Acklin, Dina, Admoni, Henny, Aichinger, Gregor, Alon, Nitay, Ashktorab, Zahra, Atreja, Ashish, Auron, Moises, Aufreiter, Alexander, Awasthi, Raghav, Banerjee, Soumya, Barnby, Joe M., Basappa, Rhea, Bergsmann, Severin, Bouneffouf, Djallel, Callaghan, Patrick, Cavazza, Marc, Chaminade, Thierry, Chernova, Sonia, Chetouan, Mohamed, Choudhury, Moumita, Cleeremans, Axel, Cywinski, Jacek B., Cuzzolin, Fabio, Deng, Hokin, Diamond, N'yoma, Di Pasquasio, Camilla, Dumas, Guillaume, van Duijn, Max, Dwarikanath, Mahapatra, Gao, Qingying, Goel, Ashok, Goldstein, Rebecca, Gombolay, Matthew, Gonzalez, Gabriel Enrique, Halilovic, Amar, Halmdienst, Tobias, Islam, Mahimul, Jara-Ettinger, Julian, Kastel, Natalie, Keydar, Renana, Khanna, Ashish K., Khoramshahi, Mahdi, Kim, JiHyun, Kim, MiHyeon, Kim, YoungBin, Krivic, Senka, Krasnytskyi, Nikita, Kumar, Arun, Kwon, JuneHyoung, Lee, Eunju, Lee, Shane, Lewis, Peter R., Li, Xue, Li, Yijiang, Lewandowski, Michal, Lloyd, Nathan, Luebbers, Matthew B., Luo, Dezhi, Lyu, Haiyun, Mahapatra, Dwarikanath, Maheshwari, Kamal, Mainali, Mallika, Mathur, Piyush, Mederitsch, Patrick, Miura, Shuwa, de Miranda, Manuel Preston, Mirsky, Reuth, Mishra, Shreya, Moorman, Nina, Morrison, Katelyn, Muchovej, John, Nessler, Bernhard, Nessler, Felix, Nguyen, Hieu Minh Jord, Ortego, Abby, Papay, Francis A., Pasquali, Antoine, Rahimi, Hamed, Raghu, Charumathi, Royka, Amanda, Sarkadi, Stefan, Scheuerman, Jaelle, Schmid, Simon, Schrater, Paul, Sen, Anik, Sheikhbahaee, Zahra, Shi, Ke, Simmons, Reid, Singh, Nishant, Smith, Mason O., van der Meulen, Ramira, Solaki, Anthia, Sun, Haoran, Szolga, Viktor, Taylor, Matthew E., Taylor, Travis, Van Waveren, Sanne, Vargas, Juan David, Verbrugge, Rineke, Wagner, Eitan, Weisz, Justin D., Wen, Ximing, Yeoh, William, Zhang, Wenlong, Zhao, Michelle, Zilberstein, Shlomo

The ability to attribute mental states, such as beliefs, intentions, desires, and emotions, to oneself and others is essential for predicting behavior. Theory of Mind (ToM) principles are therefore crucial for better interpreting and responding to human actions and intentions as AI systems evolve towards greater interactivity. The purpose of this volume is to provide an open-access, curated anthology for the ToM and AI research community. The first Theory of Mind for AI (ToM4AI) workshop took place on March 3, 2025, as part of the AAAI workshop series. It brought together researchers from diverse fields, including psychology, cognitive science, neuroscience, robotics, and AI, to explore the implications of ToM in developing advanced AI systems.

large language model, machine learning, reinforcement learning, (17 more...)

2505.0377

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Instructional Material > Course Syllabus & Notes (0.49)

Industry: Health & Medicine (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.33)

Klassen, Toryn Q., Alamdari, Parand A., McIlraith, Sheila A.

Pluralistic Alignment Over Time

arXiv.org Artificial Intelligence, Nov-15-2024

If an AI system makes decisions over time, how should we evaluate how aligned it is with a group of stakeholders (who may have conflicting values and preferences)? In this position paper, we advocate for consideration of temporal aspects including stakeholders' changing levels of satisfaction and their possibly temporally extended preferences. We suggest how a recent approach to evaluating fairness over time could be applied to a new form of pluralistic alignment: temporal pluralism, where the AI system reflects different stakeholders' values at different times.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2411.10654

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

Hyde, Gregory, Santos, Eugene Jr

Detecting Hidden Triggers: Mapping Non-Markov Reward Functions to Markov

arXiv.org Artificial Intelligence, Jan-20-2024

Many Reinforcement Learning algorithms assume a Markov reward function to guarantee optimality. However, not all reward functions are known to be Markov. In this paper, we propose a framework for mapping non-Markov reward functions into equivalent Markov ones by learning a Reward Machine - a specialized reward automaton. Unlike the general practice of learning Reward Machines, we do not require a set of high-level propositional symbols from which to learn. Rather, we learn hidden triggers directly from data that encode them. We demonstrate the importance of learning Reward Machines versus their Deterministic Finite-State Automata counterparts, for this task, given their ability to model reward dependencies in a single automaton. We formalize this distinction in our learning objective. Our mapping process is constructed as an Integer Linear Programming problem. We prove that our mappings provide consistent expectations for the underlying process. We empirically validate our approach by learning black-box non-Markov reward functions in the Officeworld Domain. Additionally, we demonstrate the effectiveness of learning dependencies between rewards in a new domain, Breakfastworld.

amdp, equation, trajectory, (17 more...)

2401.11325

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Indiana (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Muise, Christian, McIlraith, Sheila A., Beck, J. Christopher

PRP Rebooted: Advancing the State of the Art in FOND Planning

arXiv.org Artificial Intelligence, Dec-19-2023

Fully Observable Non-Deterministic (FOND) planning is a variant of classical symbolic planning in which actions are nondeterministic, with an action's outcome known only upon execution. It is a popular planning paradigm with applications ranging from robot planning to dialogue-agent design and reactive synthesis. Over the last 20 years, a number of approaches to FOND planning have emerged. In this work, we establish a new state of the art, following in the footsteps of some of the most powerful FOND planners to date. Our planner, PR2, decisively outperforms the four leading FOND planners, at times by a large margin, in 17 of 18 domains that represent a comprehensive benchmark suite. Ablation studies demonstrate the impact of various techniques we introduce, with the largest improvement coming from our novel FOND-aware heuristic.

reachable, node, controller, (15 more...)

2312.11675

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)

arXiv.org Artificial Intelligence, Jan-30-2023

Team Plan Recognition: A Review of the State of the Art

Rieffer-Champlin, Loren

There is an increasing need to develop artificial intelligence systems that assist groups of humans working on coordinated tasks. These systems must recognize and understand the plans and relationships between actions for a team of humans working toward a common objective. This article reviews the literature on team plan recognition and surveys the most recent logic-based approaches for implementing it. First, we provide some background knowledge, including a general definition of plan recognition in a team setting and a discussion of implementation challenges. Next, we explain our reasoning for focusing on logic-based methods. Finally, we survey recent approaches from two primary classes of logic-based methods (plan library-based and domain theory-based). We aim to bring more attention to this sparse but vital topic and inspire new directions for implementing team plan recognition.

artificial intelligence, planning & scheduling, recognition, (14 more...)

doi: 10.54941/ahfe1003557

2301.13288

Country:

North America > United States > Arizona > Pima County > Tucson (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Overview (1.00)

Industry: Government (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (1.00)

Christoffersen, Phillip J. K., Li, Andrew C., Icarte, Rodrigo Toro, McIlraith, Sheila A.

Learning Symbolic Representations for Reinforcement Learning of Non-Markovian Behavior

arXiv.org Artificial Intelligence, Jan-7-2023

Many real-world reinforcement learning (RL) problems necessitate learning complex, temporally extended behavior that may only receive reward signal when the behavior is completed. If the reward-worthy behavior is known, it can be specified in terms of a non-Markovian reward function - a function that depends on aspects of the state-action history, rather than just the current state and action. Such reward functions yield sparse rewards, necessitating an inordinate number of experiences to find a policy that captures the reward-worthy pattern of behavior. Recent work has leveraged Knowledge Representation (KR) to provide a symbolic abstraction of aspects of the state that summarize reward-relevant properties of the state-action history and support learning a Markovian decomposition of the problem in terms of an automaton over the KR. Providing such a decomposition has been shown to vastly improve learning rates, especially when coupled with algorithms that exploit automaton structure. Nevertheless, such techniques rely on a priori knowledge of the KR. In this work, we explore how to automatically discover useful state abstractions that support learning automata over the state-action history. The result is an end-to-end algorithm that can learn optimal policies with significantly fewer environment samples than state-of-the-art RL on simple non-Markovian domains.
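As a rough illustration of the idea in the abstract above, the following minimal sketch shows how a small automaton over symbolic labels can make a history-dependent reward Markovian once the automaton state is tracked alongside the environment state. It is not the paper's algorithm. The `RewardMachine` class, the labels "c" (coffee picked up) and "o" (office reached), and the reward values are all hypothetical, chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical symbolic labels (assumptions, not from the paper):
# "c" = coffee picked up, "o" = office reached.
Label = FrozenSet[str]


class RewardMachine:
    """Tiny reward automaton: (machine state, symbol) -> (next state, reward)."""

    def __init__(self, initial: int,
                 transitions: Dict[int, Dict[str, Tuple[int, float]]]) -> None:
        self.initial = initial
        self.transitions = transitions

    def step(self, u: int, label: Label) -> Tuple[int, float]:
        # Take the first outgoing edge whose symbol holds in the current label.
        for symbol, (next_u, reward) in self.transitions.get(u, {}).items():
            if symbol in label:
                return next_u, reward
        # No edge fires: stay in the same machine state with zero reward.
        return u, 0.0


def history_rewards(rm: RewardMachine, labels: List[Label]) -> List[float]:
    """Replay a label sequence and return the reward emitted at each step."""
    u = rm.initial
    rewards = []
    for label in labels:
        u, r = rm.step(u, label)
        rewards.append(r)
    return rewards


# Reward 1.0 only for reaching the office *after* picking up coffee.
COFFEE_THEN_OFFICE = RewardMachine(
    initial=0,
    transitions={
        0: {"c": (1, 0.0)},  # waiting for coffee
        1: {"o": (2, 1.0)},  # coffee in hand, waiting for office
    },
)

trace = [frozenset({"o"}), frozenset({"c"}), frozenset(),
         frozenset({"o"}), frozenset({"o"})]
rewards = history_rewards(COFFEE_THEN_OFFICE, trace)
print(rewards)
```

In this trace the label "o" appears three times but is rewarded only once, so the reward is not a function of the current observation alone. It is, however, a function of the pair (observation, machine state), which is the kind of Markovian decomposition the abstract refers to.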

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2301.02952

Country:

North America > Canada > Ontario > Toronto (0.15)
South America > Chile (0.04)

Genre: Research Report (0.83)

Industry: Government (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)