Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning

Joseph Early, Tom Bewley, Christine Evers

Aug-17-2025, 20:21:13 GMT – Neural Information Processing Systems

In this work, we remove the assumption that rewards are Markovian, extending reward modelling (RM) to capture temporal dependencies in human assessment of trajectories.

machine learning, prediction, reinforcement learning

Country:
- Europe > United Kingdom > England
  - Hampshire > Southampton (0.04)
  - Bristol (0.04)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (0.82)
    - Reinforcement Learning (0.71)
