MetaCURL: Non-stationary Concave Utility Reinforcement Learning

Mar-22-2026, 17:01:36 GMT–Neural Information Processing Systems

We study online learning in episodic loop-free Markov decision processes in non-stationary environments, where both the losses and the transition probabilities change over time. Our focus is the Concave Utility Reinforcement Learning (CURL) problem, an extension of classical RL that handles convex performance criteria over the state-action distributions induced by agent policies. Although various machine learning problems can be written as CURL, its non-linearity invalidates the traditional Bellman equations.
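
To make the last point concrete, here is a minimal Python sketch (not taken from the paper) that computes the state-action distributions induced by a policy in a small episodic MDP and compares a linear loss with a convex one. All names (`occupancy`, `linear_loss`, `neg_entropy`), the randomly generated MDP, and the choice of negative entropy as the convex criterion are illustrative assumptions. A linear loss is additive over mixtures of state-action distributions, which is the structure Bellman equations rely on; the convex criterion is not.

```python
import numpy as np

def occupancy(P, pi, mu0):
    """State-action distributions mu[n, x, a] induced by policy pi.

    P[n, x, a, y]: probability of moving from x to y under action a at step n
    pi[n, x, a]:   probability of playing a in state x at step n
    mu0[x]:        initial state distribution
    (Hypothetical layout, chosen for this illustration only.)
    """
    state_dist = mu0
    mu = []
    for n in range(len(pi)):
        mu_n = state_dist[:, None] * pi[n]          # joint law of (state, action)
        mu.append(mu_n)
        if n < len(P):                              # push forward through the dynamics
            state_dist = np.einsum("xa,xay->y", mu_n, P[n])
    return np.stack(mu)

def linear_loss(mu, loss):
    # Classical RL objective: inner product <loss, mu>.
    return float(np.sum(loss * mu))

def neg_entropy(mu):
    # An example convex criterion (assumed here): sum of mu * log(mu).
    p = mu[mu > 0]
    return float(np.sum(p * np.log(p)))

rng = np.random.default_rng(0)
X, A, N = 3, 2, 4                                   # states, actions, steps
P = rng.dirichlet(np.ones(X), size=(N - 1, X, A))   # shape (N-1, X, A, X)
mu0 = np.full(X, 1.0 / X)
loss = rng.random((N, X, A))

pi_a = np.zeros((N, X, A)); pi_a[:, :, 0] = 1.0     # always play action 0
pi_b = np.zeros((N, X, A)); pi_b[:, :, 1] = 1.0     # always play action 1
mu_a, mu_b = occupancy(P, pi_a, mu0), occupancy(P, pi_b, mu0)
mix = 0.5 * (mu_a + mu_b)                           # mixture of the two distributions

linear_gap = linear_loss(mix, loss) - 0.5 * (linear_loss(mu_a, loss) + linear_loss(mu_b, loss))
convex_gap = neg_entropy(mix) - 0.5 * (neg_entropy(mu_a) + neg_entropy(mu_b))
print(f"linear gap: {linear_gap:.2e}, convex gap: {convex_gap:.3f}")
```

The linear gap is zero up to floating-point error, while the convex gap is strictly negative, so the value of a mixture cannot be recovered by averaging per-policy values as it can in classical RL.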

artificial intelligence, machine learning, reinforcement learning, (6 more...)

Industry:
- Education > Focused Education > Special Education (0.51)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)