Expectation Alignment: Handling Reward Misspecification in the Presence of Expectation Mismatch

Neural Information Processing Systems 

Detecting and handling misspecified objectives, such as reward functions, is widely recognized as one of the central challenges in Artificial Intelligence (AI) safety research.
