AI Is Using Your Likes to Get Inside Your Head

WIRED 

What is the future of the like button in the age of artificial intelligence? Max Levchin, the PayPal cofounder and Affirm CEO, sees a new and hugely valuable role for liking data: training AI to arrive at conclusions more in line with those a human decision-maker would make. It's a well-known quandary in machine learning that a computer presented with a clear reward function will engage in relentless reinforcement learning to improve its performance and maximize that reward, but that this optimization path often leads AI systems to outcomes very different from those that would result from humans exercising human judgment. To introduce a corrective force, AI developers frequently use what is called reinforcement learning from human feedback (RLHF). Essentially, they put a human thumb on the scale as the computer arrives at its model, by training it on data reflecting real people's actual preferences.
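For readers who want to see the idea in miniature, here is a rough sketch of how preference data of this kind can be turned into a reward signal. It is an illustration under stated assumptions, not a description of any production RLHF system: the two features (`helpfulness`, `clickbait`), the sample data, and the function names `fit_reward_model` and `reward` are all hypothetical. The sketch fits a simple linear reward model to pairs in which a person preferred one response over another, using the standard Bradley-Terry (logistic) formulation for pairwise comparisons.

```python
import math


def fit_reward_model(preferences, n_features, lr=0.5, epochs=200):
    """Fit linear reward weights from (preferred, rejected) feature pairs.

    Uses the Bradley-Terry model: the probability that the preferred item
    wins is sigmoid(reward(preferred) - reward(rejected)).
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in preferences:
            diff = [p - r for p, r in zip(preferred, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            p_correct = 1.0 / (1.0 + math.exp(-margin))
            # Gradient ascent on the log-likelihood of the human's choice.
            for i in range(n_features):
                w[i] += lr * (1.0 - p_correct) * diff[i]
    return w


def reward(w, features):
    """Score a response with the learned linear reward."""
    return sum(wi * fi for wi, fi in zip(w, features))


# Hypothetical data: each response is described as [helpfulness, clickbait].
# In every pair, the first response is the one a person "liked".
preferences = [
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.7, 0.3], [0.4, 0.9]),
    ([0.8, 0.2], [0.3, 0.7]),
]

weights = fit_reward_model(preferences, n_features=2)
print("learned weights:", weights)
print("liked response scores higher:",
      reward(weights, [0.9, 0.1]) > reward(weights, [0.2, 0.8]))
```

In a full RLHF pipeline, a learned reward of this sort would then stand in for the hand-written reward function during reinforcement learning, which is the "thumb on the scale" described above. Real systems use neural networks over text rather than two hand-picked features.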