Non-MarkovianRewardModellingfromTrajectory LabelsviaInterpretableMultipleInstanceLearning
–Neural Information Processing Systems
There is growing consensus around the view that aligned and beneficial AI requires a reframing of objectives as being contingent, uncertain, and learnable via interaction with humans [35].
Neural Information Processing Systems
Feb-11-2026, 10:04:48 GMT
- Technology: