Interpretable policy


Information Templates: A New Paradigm for Intelligent Active Feature Acquisition

Huang, Hung-Tien, Dinh, Dzung, Oliva, Junier B.

arXiv.org Artificial Intelligence

Active feature acquisition (AFA) is an instance-adaptive paradigm in which, at test time, a policy sequentially chooses which features to acquire (at a cost) before predicting. Existing approaches either train reinforcement learning (RL) policies, which must contend with a difficult MDP, or rely on greedy policies that cannot account for the joint informativeness of features or that require knowledge of the underlying data distribution. To overcome this, we propose Template-based AFA (TAFA), a non-greedy framework that learns a small library of feature templates--sets of features that are jointly informative--and uses this library to guide subsequent feature acquisitions. By identifying feature templates, the proposed framework not only significantly reduces the action space considered by the policy but also alleviates the need to estimate the underlying data distribution. Extensive experiments on synthetic and real-world datasets show that TAFA outperforms state-of-the-art baselines while incurring lower overall acquisition cost and computation.
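
As a rough illustration of the template idea, the sketch below greedily selects whole templates (index sets) under a budget, scoring each by a cost-normalized informativeness heuristic. The templates, costs, and the placeholder scoring function are invented for illustration; this is not the TAFA algorithm from the paper, which would use a learned template library and a trained predictor.

```python
# A minimal, hypothetical sketch of template-guided feature acquisition.
# Template contents, costs, and the scoring rule are illustrative inventions,
# not the TAFA algorithm from the paper.
import numpy as np

rng = np.random.default_rng(0)

n_features = 8
templates = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5}), frozenset({6, 7})]
cost = {t: len(t) * 1.0 for t in templates}           # toy per-feature cost

def template_score(observed, t, x):
    """Stand-in informativeness score for acquiring template t given the
    currently observed index set. A real system would use a trained
    predictor's expected confidence gain here."""
    new = t - observed
    return rng.random() * len(new)                    # placeholder heuristic

def acquire(x, budget=5.0):
    observed, spent = set(), 0.0
    while True:
        candidates = [t for t in templates
                      if (t - observed) and spent + cost[t] <= budget]
        if not candidates:
            break
        best = max(candidates,
                   key=lambda t: template_score(observed, t, x) / cost[t])
        observed |= best
        spent += cost[best]
    return observed, spent

x = rng.normal(size=n_features)                       # one test instance
obs, spent = acquire(x)
print(f"acquired features {sorted(obs)} at cost {spent:.1f}")
```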


"So, Tell Me About Your Policy...": Distillation of interpretable policies from Deep Reinforcement Learning agents

Dispoto, Giovanni, Bonetti, Paolo, Restelli, Marcello

arXiv.org Artificial Intelligence

Recent advances in Reinforcement Learning (RL) have benefited greatly from the inclusion of deep neural networks, driving a surge of novel approaches in the field of Deep Reinforcement Learning (DRL). These techniques can tackle complex games such as Atari and Go, as well as real-world applications including financial trading. Nevertheless, a significant challenge arises from the lack of interpretability, particularly when attempting to understand the underlying patterns learned, the relative importance of the state features, and how they are combined to generate the policy's output. For this reason, in mission-critical and real-world settings, it is often preferable to deploy a simpler, more interpretable algorithm, albeit at some cost in performance. In this paper, we propose a novel algorithm, supported by theoretical guarantees, that can extract an interpretable policy (e.g., a linear policy) without disregarding the peculiarities of the expert's behavior. This is achieved by considering the advantage function, which encodes why an action is superior to the alternatives. In contrast to previous works, our approach enables the training of an interpretable policy from previously collected experience. The proposed algorithm is empirically evaluated on classic control environments and on a financial trading scenario, demonstrating its ability to extract meaningful information from complex expert policies.
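
A common way to realize this kind of distillation is advantage-weighted imitation: transitions whose actions the expert's advantage function rates highly receive more weight when fitting the interpretable policy. The sketch below shows that idea with a linear (logistic) policy on synthetic offline data; the exponential weighting scheme and the stand-in advantages are assumptions, not the paper's exact objective.

```python
# A minimal sketch of advantage-weighted distillation into a linear policy.
# The exponential weighting and synthetic data are illustrative; the paper's
# actual objective and guarantees are not reproduced here.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d, n_actions = 5000, 4, 3

states = rng.normal(size=(n, d))
actions = rng.integers(0, n_actions, size=n)          # previously collected actions
# Stand-in advantages A(s, a); a real pipeline would estimate these
# from the expert's critic on the offline dataset.
advantages = (states @ rng.normal(size=d)) * (actions - 1)

# Upweight transitions whose actions the expert judged advantageous.
weights = np.exp(np.clip(advantages, -5, 5))
linear_policy = LogisticRegression(max_iter=1000)
linear_policy.fit(states, actions, sample_weight=weights)

print("per-action feature weights:\n", linear_policy.coef_.round(2))
```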


Evaluating Interpretable Reinforcement Learning by Distilling Policies into Programs

Kohler, Hector, Delfosse, Quentin, Radji, Waris, Akrour, Riad, Preux, Philippe

arXiv.org Artificial Intelligence

There exist applications of reinforcement learning, such as medicine, where policies need to be "interpretable" by humans. User studies have shown that some policy classes may be more interpretable than others, but conducting human studies of policy interpretability is costly. Furthermore, there is no clear definition of policy interpretability, i.e., no agreed-upon metrics, so claims depend on the chosen definition. We tackle the problem of empirically evaluating policy interpretability without humans. Despite this lack of a clear definition, researchers agree on the notion of "simulatability": policy interpretability should relate to how well humans understand policy actions given states. To advance research in interpretable reinforcement learning, we contribute a new methodology for evaluating policy interpretability, based on proxies for simulatability, which we use to conduct a large-scale empirical evaluation. We use imitation learning to compute baseline policies by distilling expert neural networks into small programs. We then show that evaluating the baselines' interpretability with our methodology leads to conclusions similar to those of user studies. We show that increasing interpretability does not necessarily reduce performance and can sometimes increase it. We also show that no policy class best trades off interpretability and performance across all tasks, making it necessary for researchers to have methodologies for comparing policy interpretability.
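
The pipeline the abstract describes, distilling a neural expert into a small program by imitation and then measuring simulatability through proxies, can be sketched as follows. Here a depth-limited tree stands in for a "small program", and node count and depth are illustrative proxies, not necessarily those used in the paper.

```python
# A minimal sketch: distill an "expert" into a small tree by imitation and
# report simple simulatability proxies (program size, simulation depth).
# The proxies here are illustrative stand-ins for the paper's methodology.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 6))
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)          # toy task

expert = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300).fit(X, y)

# Imitation learning: relabel states with the expert's actions, then fit
# a depth-limited tree standing in for a small program.
X_states = rng.normal(size=(4000, 6))
program = DecisionTreeClassifier(max_depth=3).fit(X_states, expert.predict(X_states))

n_nodes = program.tree_.node_count                    # proxy: program size
depth = program.get_depth()                           # proxy: steps to simulate a decision
fidelity = (program.predict(X) == expert.predict(X)).mean()
print(f"nodes={n_nodes}, depth={depth}, fidelity={fidelity:.2%}")
```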


From Explainability to Interpretability: Interpretable Policies in Reinforcement Learning Via Model Explanation

Li, Peilang, Siddique, Umer, Cao, Yongcan

arXiv.org Artificial Intelligence

Deep reinforcement learning (RL) has shown remarkable success in complex domains; however, the inherent black-box nature of deep neural network policies raises significant challenges in understanding and trusting the decision-making process. While existing explainable RL methods provide local insights, they fail to deliver a global understanding of the model, particularly in high-stakes applications. To overcome this limitation, we propose a novel model-agnostic approach that bridges the gap between explainability and interpretability by leveraging Shapley values to transform complex deep RL policies into transparent representations. The proposed approach offers two key contributions: an application of Shapley values to policy interpretation that goes beyond local explanations, and a general framework applicable to both off-policy and on-policy algorithms. We evaluate our approach with three existing deep RL algorithms and validate its performance in two classic control environments. The results demonstrate that our approach not only preserves the original model's performance but also generates more stable interpretable policies.
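
To make the Shapley-value idea concrete, the sketch below uses permutation sampling to estimate each state feature's contribution to a toy policy's action score, then averages absolute contributions over states for a global view. The toy policy, the mean-state baseline, and the aggregation are illustrative assumptions, not the paper's pipeline.

```python
# A minimal sketch of permutation-sampling Shapley values for a policy's
# action preference, aggregated over states for a global view. The toy
# policy and baseline scheme are illustrative, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
d = 4
W = rng.normal(size=d)

def policy_score(x):
    """Toy black-box policy score (e.g., logit of the greedy action)."""
    return float(np.tanh(x @ W))

def shapley(x, baseline, n_perm=200):
    phi = np.zeros(d)
    for _ in range(n_perm):
        order = rng.permutation(d)
        z = baseline.copy()
        prev = policy_score(z)
        for j in order:
            z[j] = x[j]                               # reveal feature j
            cur = policy_score(z)
            phi[j] += cur - prev                      # marginal contribution
            prev = cur
    return phi / n_perm

states = rng.normal(size=(50, d))
baseline = states.mean(axis=0)
global_importance = np.mean([np.abs(shapley(s, baseline)) for s in states], axis=0)
print("global feature importance:", global_importance.round(3))
```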


Towards a Research Community in Interpretable Reinforcement Learning: the InterpPol Workshop

Kohler, Hector, Delfosse, Quentin, Festor, Paul, Preux, Philippe

arXiv.org Artificial Intelligence

Embracing the pursuit of intrinsically explainable reinforcement learning raises crucial questions: What distinguishes explainability from interpretability? Should explainable and interpretable agents be developed outside of domains where transparency is imperative? What advantages do interpretable policies offer over neural networks? How can we rigorously define and measure interpretability in policies without user studies? Which reinforcement learning paradigms are best suited to developing interpretable agents? Can Markov Decision Processes integrate interpretable state representations? In addition to motivating an Interpretable RL community centered on these questions, we propose the first venue dedicated to Interpretable RL: the InterpPol Workshop.


Boolean Decision Rules for Reinforcement Learning Policy Summarisation

McCarthy, James, Nair, Rahul, Daly, Elizabeth, Marinescu, Radu, Dusparic, Ivana

arXiv.org Artificial Intelligence

Explainability of Reinforcement Learning (RL) policies remains a challenging research problem, particularly when considering RL in a safety context. Understanding the decisions and intentions of an RL policy offers avenues to incorporate safety into the policy by limiting undesirable actions. We propose the use of a Boolean Decision Rules model to create a post-hoc rule-based summary of an agent's policy. We evaluate our approach using a DQN agent trained on an implementation of a lava gridworld and show that, given a hand-crafted feature representation of this gridworld, simple generalised rules can be created, giving a post-hoc explainable summary of the agent's policy. We discuss how the rules generated by this model could be imposed as constraints on the agent's policy to introduce safety, and how simple rule summaries of a policy may help in debugging RL agents.
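
A minimal stand-in for this kind of post-hoc rule summary: sample (state, action) pairs from a stub gridworld policy over hand-crafted features and fit a shallow tree whose printed branches read as rules. The paper uses a Boolean Decision Rules model rather than the tree used here, and a trained DQN rather than the stub policy.

```python
# A minimal sketch of post-hoc rule summarisation: fit a shallow tree to
# (state, action) pairs from a toy "lava gridworld" policy and print its
# rules. A stand-in for the Boolean Decision Rules model in the paper.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Hand-crafted features (x, y, lava_ahead) for a 5x5 grid, in the spirit of
# the paper; the policy itself is an illustrative stub, not a trained DQN.
states = np.column_stack([rng.integers(0, 5, 2000),
                          rng.integers(0, 5, 2000),
                          rng.integers(0, 2, 2000)])
# Stub policy: move right (action 1) unless lava is ahead, then move up (0).
actions = np.where(states[:, 2] == 1, 0, 1)

summary = DecisionTreeClassifier(max_depth=2).fit(states, actions)
print(export_text(summary, feature_names=["x", "y", "lava_ahead"]))
```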


Distilling Heterogeneity: From Explanations of Heterogeneous Treatment Effect Models to Interpretable Policies

Wu, Han, Tan, Sarah, Li, Weiwei, Garrard, Mia, Obeng, Adam, Dimmery, Drew, Singh, Shaun, Wang, Hanson, Jiang, Daniel, Bakshy, Eytan

arXiv.org Machine Learning

Internet companies are increasingly using machine learning models to create personalized policies that assign each individual the treatment predicted to be best for that individual. These policies are frequently derived from black-box heterogeneous treatment effect (HTE) models that predict individual-level treatment effects. In this paper, we focus on (1) learning explanations for HTE models and (2) learning interpretable policies that prescribe treatment assignments. We also propose guidance trees, an approach to ensembling multiple interpretable policies without losing interpretability. These rule-based interpretable policies are easy to deploy and avoid the need to maintain an HTE model in a production environment.
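
One simple way to obtain such an interpretable policy is to label each individual with the treatment the HTE model predicts to be best and distill those labels into a shallow decision tree. The sketch below does this with synthetic data and a generic boosted regressor standing in for the black-box HTE model; the paper's guidance-trees ensembling step is not implemented.

```python
# A minimal sketch: turn a black-box HTE model's predictions into an
# interpretable treatment-assignment tree. The HTE model and data are
# synthetic stand-ins; the paper's guidance trees are not implemented.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))
true_effect = np.where(X[:, 0] > 0, 1.0, -0.5)        # heterogeneous effect
tau_hat_model = GradientBoostingRegressor().fit(
    X, true_effect + rng.normal(scale=0.3, size=5000))

# Interpretable policy: treat iff the predicted effect is positive,
# distilled into a depth-2 tree that is easy to deploy and audit.
treat = (tau_hat_model.predict(X) > 0).astype(int)
policy = DecisionTreeClassifier(max_depth=2).fit(X, treat)
print(export_text(policy, feature_names=["x0", "x1", "x2"]))
```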


Explainable Autonomous Robots: A Survey and Perspective

Sakai, Tatsuya, Nagai, Takayuki

arXiv.org Artificial Intelligence

It is commonly claimed that AI will replace most manual labor in the future; however, is this really the case? AI technologies achieve higher image-recognition accuracy than humans in some limited contexts and have consistently outperformed humans in classical games such as Go and chess. Nonetheless, we believe that even advanced future developments based on current technology will not lead to robots replacing humans. AI systems' fundamental inability to communicate naturally and effectively with humans is among the most significant reasons they cannot replace human labor. One might believe that such communication could be achieved through the development of natural language processing (NLP) technology [4]; however, NLP technologies are systems for estimating the content of human statements and their meanings; they do not constitute communication. That is, humans do not feel that robots using such systems truly understand and respond to them appropriately. Therefore, if effective communication is not achieved, robots will continue to function only as tools to assist humans. Advances that improve the accuracy or effectiveness of specific tasks do not make robots equivalent to human beings. Under this scenario, how can we enable robots to communicate with humans?


Differentiable Logic Machines

Zimmer, Matthieu, Feng, Xuening, Glanois, Claire, Jiang, Zhaohui, Zhang, Jianyi, Weng, Paul, Jianye, Hao, Dong, Li, Wulong, Liu

arXiv.org Artificial Intelligence

The integration of reasoning, learning, and decision-making is key to building more general AI systems. As a step in this direction, we propose a novel neural-logic architecture that can solve both inductive logic programming (ILP) and deep reinforcement learning (RL) problems. Our architecture defines a restricted but expressive continuous space of first-order logic programs by assigning weights to predicates instead of rules; it is therefore fully differentiable and can be trained efficiently with gradient descent. In addition, for the deep RL setting with actor-critic algorithms, we propose a novel, efficient critic architecture. Compared to state-of-the-art methods on both ILP and RL problems, our approach achieves excellent performance while providing a fully interpretable solution and scaling much better, especially during the testing phase.
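
The core "weights on predicates" idea can be illustrated with a single differentiable soft-logic layer: each rule softly selects its body predicates via softmax weights and combines them with a product t-norm, so gradients flow back to the selection weights. This is a toy rendering of the idea, not the paper's full architecture.

```python
# A minimal sketch of one differentiable soft-logic layer: a rule's body
# selects input predicates via softmax weights and combines them with a
# product t-norm (soft AND). Illustrative only, not the full architecture.
import torch

torch.manual_seed(0)
n_predicates, n_rules = 5, 2

# Truth values of ground predicates for one example, in [0, 1].
p = torch.rand(n_predicates)

# Learnable selection weights: each rule softly picks two body predicates.
logits = torch.randn(n_rules, 2, n_predicates, requires_grad=True)
sel = torch.softmax(logits, dim=-1)                   # (rules, 2 slots, preds)

body = sel @ p                                        # soft predicate lookup, (rules, 2)
head = body.prod(dim=-1)                              # product t-norm = soft AND

# Differentiable end to end: gradients flow back to the selection weights.
head.sum().backward()
print("rule truth values:", head.detach().numpy().round(3))
print("grad norm:", logits.grad.norm().item())
```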


Reinforcement Learning from a Mixture of Interpretable Experts

Akrour, Riad, Tateo, Davide, Peters, Jan

arXiv.org Machine Learning

Reinforcement learning (RL) has demonstrated its ability to solve high-dimensional tasks by leveraging non-linear function approximators. These successes, however, are mostly achieved by 'black-box' policies in simulated domains. When deploying RL in the real world, several concerns may be raised about the use of a 'black-box' policy. In an effort to make the policies learned by RL more transparent, we propose in this paper a policy iteration scheme that retains a complex function approximator for its internal value predictions but constrains the policy to have a concise, hierarchical, human-readable structure based on a mixture of interpretable experts. We show that our algorithm can learn compelling policies on continuous-action deep RL benchmarks, matching the performance of neural network policies while returning policies that are more amenable to human inspection than neural-network or linear-in-feature policies.
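
The structure described, a gate that routes each state to one of several simple experts, can be sketched in a few lines: a hard argmax gate chooses a linear expert, so every action is attributable to a single readable weight matrix. The gating rule and experts below are illustrative, not the paper's learned policy-iteration scheme.

```python
# A minimal sketch of a mixture-of-interpretable-experts policy: a hard
# gate routes each state to one linear expert, so every action is traceable
# to a single readable expert. Gating rule and experts are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, a_dim = 4, 3, 2

experts = [rng.normal(size=(a_dim, d)) for _ in range(n_experts)]  # linear experts
gate_W = rng.normal(size=(n_experts, d))                           # gating scores

def act(s):
    k = int(np.argmax(gate_W @ s))        # hard assignment: one expert per state
    return k, experts[k] @ s              # action is linear in the state

s = rng.normal(size=d)
k, a = act(s)
print(f"state routed to expert {k}; action = W_{k} @ s = {a.round(3)}")
```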