Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation

Mar-21-2026, 13:14:55 GMT–Neural Information Processing Systems

We study reinforcement learning with _multinomial logistic_ (MNL) function approximation, where the transition probability kernel of the underlying _Markov decision process_ (MDP) is parametrized by an unknown transition core with features of the state and action. For the finite-horizon episodic setting with inhomogeneous state transitions, we propose provably efficient algorithms with randomized exploration that have frequentist regret guarantees.
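To make the MNL parametrization concrete, here is a minimal sketch of how a transition distribution could be computed from features and a parameter vector. It assumes the usual softmax form, in which the probability of each reachable next state is proportional to the exponential of the inner product between its feature vector and the transition core. The abstract does not state this form explicitly, and the names `mnl_transition_probs`, `features`, and `theta` are hypothetical, chosen only for illustration.

```python
import math


def mnl_transition_probs(features, theta):
    """Return a softmax distribution over candidate next states.

    features: one feature vector per reachable next state for a fixed
              (state, action) pair (assumed layout, for illustration only).
    theta:    the transition core, here a plain parameter vector.
    """
    # Linear score (logit) for each candidate next state.
    logits = [sum(f_i * t_i for f_i, t_i in zip(f, theta)) for f in features]
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits)
    weights = [math.exp(z - max_logit) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example: three reachable next states with two-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
theta = [0.5, -0.5]
probs = mnl_transition_probs(features, theta)
```

In the learning problem, `theta` is unknown and must be estimated from observed transitions; the sketch covers only the forward model, not the exploration algorithms proposed in the paper.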

artificial intelligence, machine learning, reinforcement learning
