AITopics | multiple reward function

Collaborating Authors

multiple reward function

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Dynamic Normativity: Necessary and Sufficient Conditions for Value Alignment

Corrêa, Nicholas Kluge

arXiv.org Artificial IntelligenceJun-18-2024

The critical inquiry pervading the realm of Philosophy, and perhaps extending its influence across all Humanities disciplines, revolves around the intricacies of morality and normativity. Surprisingly, in recent years, this thematic thread has woven its way into an unexpected domain, one not conventionally associated with pondering "what ought to be": the field of artificial intelligence (AI) research. Central to morality and AI, we find "alignment", a problem related to the challenges of expressing human goals and values in a manner that artificial systems can follow without leading to unwanted adversarial effects. More explicitly and with our current paradigm of AI development in mind, we can think of alignment as teaching human values to non-anthropomorphic entities trained through opaque, gradient-based learning techniques. This work addresses alignment as a technical-philosophical problem that requires solid philosophical foundations and practical implementations that bring normative theory to AI system development. To accomplish this, we propose two sets of necessary and sufficient conditions that, we argue, should be considered in any alignment process. While necessary conditions serve as metaphysical and metaethical roots that pertain to the permissibility of alignment, sufficient conditions establish a blueprint for aligning AI systems under a learning-based paradigm. After laying such foundations, we present implementations of this approach by using state-of-the-art techniques and methods for aligning general-purpose language systems. We call this framework Dynamic Normativity. Its central thesis is that any alignment process under a learning paradigm that cannot fulfill its necessary and sufficient conditions will fail in producing aligned systems.

artificial intelligence safety and security, preference permeate human intention, round-robin head-to-head tournament, (16 more...)

arXiv.org Artificial Intelligence

2406.11039

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
Europe > Russia (0.14)
Asia > Russia (0.14)
(38 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment > Sports (1.00)
Law > Statutes (1.00)
Information Technology > Security & Privacy (1.00)
(9 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
(12 more...)

Add feedback

Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions

Neural Information Processing SystemsMar-14-2024, 03:10:55 GMT

We present a nonparametric Bayesian approach to inverse reinforcement learning (IRL) for multiple reward functions. Most previous IRL algorithms assume that the behaviour data is obtained from an agent who is optimizing a single reward function, but this assumption is hard to guarantee in practice. Our approach is based on integrating the Dirichlet process mixture model into Bayesian IRL. We provide an efficient Metropolis-Hastings sampling algorithm utilizing the gradient of the posterior to estimate the underlying reward functions, and demonstrate that our approach outperforms previous ones via experiments on a number of problem domains.

behaviour data, reward function, trajectory, (14 more...)

Neural Information Processing Systems

Country: Asia > South Korea > Daejeon > Daejeon (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Add feedback

Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions

Neural Information Processing SystemsApr-6-2023, 12:12:01 GMT

We present a nonparametric Bayesian approach to inverse reinforcement learning (IRL) for multiple reward functions. Most previous IRL algorithms assume that the behaviour data is obtained from an agent who is optimizing a single reward function, but this assumption is hard to be met in practice. Our approach is based on integrating the Dirichlet process mixture model into Bayesian IRL. We provide an efficient Metropolis-Hastings sampling algorithm utilizing the gradient of the posterior to estimate the underlying reward functions, and demonstrate that our approach outperforms the previous ones via experiments on a number of problem domains.

multiple reward function, nonparametric bayesian inverse reinforcement learning

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions

Choi, Jaedeug, Kim, Kee-eung

Neural Information Processing SystemsFeb-14-2020, 21:41:46 GMT

We present a nonparametric Bayesian approach to inverse reinforcement learning (IRL) for multiple reward functions. Most previous IRL algorithms assume that the behaviour data is obtained from an agent who is optimizing a single reward function, but this assumption is hard to be met in practice. Our approach is based on integrating the Dirichlet process mixture model into Bayesian IRL. We provide an efficient Metropolis-Hastings sampling algorithm utilizing the gradient of the posterior to estimate the underlying reward functions, and demonstrate that our approach outperforms the previous ones via experiments on a number of problem domains. Papers published at the Neural Information Processing Systems Conference.

multiple reward function, nonparametric bayesian inverse reinforcement learning

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Linear Fitted-Q Iteration with Multiple Reward Functions

Lizotte, Daniel J. (University of Waterloo) | Bowling, Michael (University of Alberta) | Murphy, Susan A. (University of Michigan)

AAAI ConferencesJun-8-2013

We present a general and detailed development of an algorithm for finite-horizon fitted-Q iteration with an arbitrary number of reward signals and linear value function approximation using an arbitrary number of state features. This includes a detailed treatment of the 3-reward function case using triangulation primitives from computational geometry and a method for identifying globally dominated actions. We also present an example of how our methods can be used to construct a real-world decision aid by considering symptom reduction, weight gain, and quality of life in sequential treatments for schizophrenia. Finally, we discuss future directions in which to take this work that will further enable our methods to make a positive impact on the field of evidence-based clinical decision support.

linear fitted-q iteration, multiple reward function

AAAI Conferences

Twenty-Third International Conference on Automated Planning and Scheduling

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.53)

Technology:

Information Technology > Decision Support Systems (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)

Add feedback

Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions

Choi, Jaedeug, Kim, Kee-eung

Neural Information Processing SystemsDec-31-2012

We present a nonparametric Bayesian approach to inverse reinforcement learning (IRL) for multiple reward functions. Most previous IRL algorithms assume that the behaviour data is obtained from an agent who is optimizing a single reward function, but this assumption is hard to be met in practice. Our approach is based on integrating the Dirichlet process mixture model into Bayesian IRL. We provide an efficient Metropolis-Hastings sampling algorithm utilizing the gradient of the posterior to estimate the underlying reward functions, and demonstrate that our approach outperforms the previous ones via experiments on a number of problem domains.

Add feedback