Röpke, Willem
Scalable Multi-Objective Reinforcement Learning with Fairness Guarantees using Lorenz Dominance
Michailidis, Dimitris, Röpke, Willem, Roijers, Diederik M., Ghebreab, Sennay, Santos, Fernando P.
Multi-Objective Reinforcement Learning (MORL) aims to learn a set of policies that optimize trade-offs between multiple, often conflicting objectives. MORL is computationally more complex than single-objective RL, particularly as the number of objectives increases. Additionally, when objectives involve the preferences of agents or groups, ensuring fairness is socially desirable. This paper introduces a principled algorithm that incorporates fairness into MORL while improving scalability to many-objective problems. We propose using Lorenz dominance to identify policies with equitable reward distributions and introduce λ-Lorenz dominance to enable flexible fairness preferences. We release a new, large-scale real-world transport planning environment and demonstrate that our method encourages the discovery of fair policies, showing improved scalability in two large cities (Xi'an and Amsterdam). Our methods outperform common multi-objective approaches, particularly in high-dimensional objective spaces.
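As a concrete illustration of the dominance criterion used here: a return vector Lorenz-dominates another when the cumulative sums of its ascending-sorted components Pareto-dominate those of the other, which favours more equitable distributions of the same total reward. A minimal sketch (function names and the strictness convention are illustrative, not taken from the paper's code):

```python
import numpy as np

def lorenz_vector(returns):
    """Cumulative sums of the ascending-sorted return vector."""
    return np.cumsum(np.sort(np.asarray(returns, dtype=float)))

def lorenz_dominates(a, b):
    """True if `a` Lorenz-dominates `b`: a's Lorenz vector weakly
    Pareto-dominates b's, with at least one strict improvement."""
    la, lb = lorenz_vector(a), lorenz_vector(b)
    return bool(np.all(la >= lb) and np.any(la > lb))
```

For example, with equal total reward, the even split [2, 2] Lorenz-dominates the uneven split [3, 1], capturing the preference for equitable reward distributions.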
Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning
Röpke, Willem, Reymond, Mathieu, Mannion, Patrick, Roijers, Diederik M., Nowé, Ann, Rădulescu, Roxana
A significant challenge in multi-objective reinforcement learning is obtaining a Pareto front of policies that attain optimal performance under different preferences. We introduce Iterated Pareto Referent Optimisation (IPRO), a principled algorithm that decomposes the task of finding the Pareto front into a sequence of single-objective problems for which various solution methods exist. This enables us to establish convergence guarantees while providing an upper bound on the distance to undiscovered Pareto optimal solutions at each step. Empirical evaluations demonstrate that IPRO matches or outperforms methods that require additional domain knowledge. By leveraging problem-specific single-objective solvers, our approach also holds promise for applications beyond multi-objective reinforcement learning, such as in pathfinding and optimisation.
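The referent-based decomposition can be illustrated with a toy single-objective subproblem: given a referent point, search for a value vector that improves on it in every objective, e.g. via a min-improvement scalarisation. This is a simplified, hypothetical stand-in for IPRO's actual subproblem, shown over a finite candidate set rather than a learned policy space:

```python
def referent_subproblem(candidates, referent):
    """Among candidate value vectors, return one that strictly improves on
    the referent in every objective (if any exists), chosen by the
    min-improvement scalarisation: max over x of min_i (x_i - r_i)."""
    best, best_score = None, float("-inf")
    for x in candidates:
        # The worst per-objective improvement over the referent.
        score = min(xi - ri for xi, ri in zip(x, referent))
        if score > best_score:
            best, best_score = x, score
    # A positive score certifies strict improvement on all objectives.
    return best if best_score > 0 else None
```

Repeatedly solving such subproblems for well-chosen referents is the intuition behind decomposing Pareto front discovery into a sequence of single-objective problems.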
Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning
Vamplew, Peter, Foale, Cameron, Hayes, Conor F., Mannion, Patrick, Howley, Enda, Dazeley, Richard, Johnson, Scott, Källström, Johan, Ramos, Gabriel, Rădulescu, Roxana, Röpke, Willem, Roijers, Diederik M.
Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards. In this paper we extend this paradigm to the context of single-objective reinforcement learning (RL), and outline multiple potential benefits including the ability to perform multi-policy learning across tasks relating to uncertain objectives, risk-aware RL, discounting, and safe RL. We also examine the algorithmic implications of adopting a utility-based approach.
Distributional Multi-Objective Decision Making
Röpke, Willem, Hayes, Conor F., Mannion, Patrick, Howley, Enda, Nowé, Ann, Roijers, Diederik M.
For effective decision support in scenarios with conflicting objectives, sets of potentially optimal solutions can be presented to the decision maker. We explore both what policies these sets should contain and how such sets can be computed efficiently. With this in mind, we take a distributional approach and introduce a novel dominance criterion relating return distributions of policies directly. Based on this criterion, we present the distributional undominated set and show that it contains optimal policies otherwise ignored by the Pareto front. In addition, we propose the convex distributional undominated set and prove that it comprises all policies that maximise expected utility for multivariate risk-averse decision makers. We propose a novel algorithm to learn the distributional undominated set and further contribute pruning operators to reduce the set to the convex distributional undominated set. Through experiments, we demonstrate the feasibility and effectiveness of these methods, making this a valuable new approach for decision support in real-world problems.
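The risk-aversion point can be made concrete: two policies with identical expected returns may differ in expected utility under a concave (risk-averse) utility function, which is why a set defined over return distributions can contain policies the expected-value Pareto front ignores. A small sketch (the utility function and sampling setup are illustrative, not from the paper):

```python
import numpy as np

def expected_utility(return_samples, utility):
    """Monte-Carlo estimate of E[u(R)] over sampled vector returns."""
    return float(np.mean([utility(np.asarray(r, dtype=float))
                          for r in return_samples]))

# A concave, hence risk-averse, multivariate utility (illustrative).
u = lambda v: v[0] ** 0.5 + v[1] ** 0.5

# Policy A: deterministic return (4, 4). Policy B: (0, 0) or (8, 8),
# each with probability 1/2. Both have expected return (4, 4).
eu_a = expected_utility([(4, 4)], u)
eu_b = expected_utility([(0, 0), (8, 8)], u)
```

Here `eu_a > eu_b` even though the expected returns coincide, so a risk-averse decision maker strictly prefers the deterministic policy; criteria defined only on expected returns cannot express this.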
On Nash Equilibria in Normal-Form Games With Vectorial Payoffs
Röpke, Willem, Roijers, Diederik M., Nowé, Ann, Rădulescu, Roxana
We provide an in-depth study of Nash equilibria in multi-objective normal form games (MONFGs), i.e., normal form games with vectorial payoffs. Taking a utility-based approach, we assume that each player's utility can be modelled with a utility function that maps a vector to a scalar utility. In the case of a mixed strategy, such a scalarisation can meaningfully be applied either before or after taking the expectation of the payoff vector. This distinction leads to two optimisation criteria. Under the first criterion, players aim to optimise the expected value of their utility function applied to the payoff vectors obtained in the game. Under the second criterion, players aim to optimise the utility of the expected payoff vector given a joint strategy. Under this latter criterion, it was shown that Nash equilibria need not exist. Our first contribution is to provide a sufficient condition under which Nash equilibria are guaranteed to exist. Secondly, we show that when Nash equilibria do exist under both criteria, no equilibrium need be shared between the two criteria, and even the number of equilibria can differ. Thirdly, we contribute a study of pure strategy Nash equilibria under both criteria. We show that when assuming quasiconvex utility functions for players, the sets of pure strategy Nash equilibria under both optimisation criteria are equivalent. This result is further extended to games in which players adhere to different optimisation criteria. Finally, given these theoretical results, we construct an algorithm to compute all pure strategy Nash equilibria in MONFGs where players have a quasiconvex utility function.
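The two criteria can be written out directly: under the first (expected scalarised returns), a player optimises E[u(payoff)]; under the second (scalarised expected returns), u(E[payoff]). A small sketch for a MONFG with enumerable joint actions (the function name and dict-based payoff encoding are illustrative assumptions):

```python
import itertools
import numpy as np

def esr_and_ser(payoffs, strategies, utility):
    """payoffs: dict mapping joint pure-action tuples to payoff vectors.
    strategies: one mixed strategy (list of probabilities) per player.
    Returns (E[u(payoff)], u(E[payoff])) for the given player's utility."""
    esr, mean = 0.0, None
    for joint in itertools.product(*[range(len(s)) for s in strategies]):
        # Probability of this joint pure action under the mixed strategies.
        p = float(np.prod([strategies[i][a] for i, a in enumerate(joint)]))
        vec = np.asarray(payoffs[joint], dtype=float)
        esr += p * utility(vec)                      # scalarise, then average
        mean = vec * p if mean is None else mean + vec * p
    return esr, utility(mean)                        # average, then scalarise

# Player 1 mixes 50/50 between payoff vectors (0, 2) and (2, 0);
# player 2 has a single action. Utility: product of the components.
payoffs = {(0, 0): (0, 2), (1, 0): (2, 0)}
esr, ser = esr_and_ser(payoffs, [[0.5, 0.5], [1.0]], lambda v: v[0] * v[1])
```

With this multiplicative utility the two criteria disagree sharply: every realised payoff vector has a zero component, so the first criterion evaluates to 0, while the expected payoff vector is (1, 1), so the second evaluates to 1.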