Google finding ways to stop artificial intelligence from hacking its reward system

Jun-24-2016, 08:10:30 GMT–#artificialintelligence

That's just one of "five practical research problems" proposed by scientists at Google, OpenAI, Stanford and Berkeley in a paper called "Concrete Problems in AI Safety" (pdf). Others included "safe exploration" issues, or how to stop a curious cleaning robot from sticking a wet mop in an electrical socket, and "avoiding negative side effects" such as a robot breaking granny's vase when cleaning in a rush. The problems may seem a bit silly, when compared to an AI-induced doomsday, but Google researcher Chris Olah wrote, "These are all forward thinking, long-term research questions – minor issues today, but important to address for future systems." A particularly interesting portion of the paper was devoted to avoiding reward hacking, or how to stop AI from gaming its reward function. "Imagine that an agent discovers a buffer overflow in its reward function: it may then use this to get extremely high reward in an unintended way."

large language model, machine learning, natural language, (20 more...)

#artificialintelligence

Jun-24-2016, 08:10:30 GMT

News Web Page

Add feedback

Country:
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)

Genre:
- Research Report (0.57)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.40)
  - Machine Learning > Neural Networks
    - Deep Learning (0.55)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found