The selection committee for the ACM SIGAI Industry Award for Excellence in Artificial Intelligence (AI) is pleased to announce that the Decision Service, created by the Real World Reinforcement Learning Team at Microsoft, has been chosen as the winner of the inaugural 2019 award. The committee was impressed by the identification and development of cutting-edge research on contextual-bandit learning, the evident cooperation between research and development efforts, the applicability of the decision support across a broad range of Microsoft products, and the quality of the final systems. All of these aspects made the Microsoft team a worthy recipient of this award. See the call for nominations.
About a year ago I was at a tech conference where one topic threatened to overwhelm all others: no matter how a conversation started, it always ended up being about the fear and uncertainty over what will happen when robots take over our jobs. Last month I attended O'Reilly's Artificial Intelligence conference in San Francisco and, perhaps not unexpectedly, the dominant topics were completely different.
Human beings begin to learn the difference between right and wrong before we learn to speak--and thankfully so. We owe much of our success as a species to our capacity for moral reasoning. It's the glue that holds human social groups together, the key to our fraught but effective ability to cooperate. We are (most believe) the lone moral agents on planet Earth--but this may not last. The day may come soon when we are forced to share this status with a new kind of being, one whose intelligence is of our own design. Robots are coming, that much is sure. They are coming to our streets as self-driving cars, to our military as automated drones, to our homes as elder-care robots--and that's just to name a few on the horizon. (Ten million households already enjoy cleaner floors thanks to a relatively dumb little robot called the Roomba.) What we don't know is how smart they will eventually become.
Reinforcement learning is an increasingly popular machine learning technique that is particularly well suited to problems in dynamic and adaptive environments. When paired with simulations, it is a powerful tool for training AI models that can increase automation or optimize the operational efficiency of sophisticated systems such as robotics, manufacturing, and supply chain logistics. However, moving from the games commonly used to demonstrate these techniques to real-world applications isn't always straightforward. Structuring solutions that go beyond purely data-driven training introduces new complexity: how to use simulations to target your learning objectives, what kinds of simulations are applicable, how to deal with long-running simulations, how to incorporate ongoing training refinement once deployed, how to account for scaling and performance, and ultimately how to bridge from simulation to the real world. I recently spoke about how to effectively apply reinforcement learning to real-world use cases at the O'Reilly AI conference in San Francisco.
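To make the training loop concrete, here is a minimal sketch of tabular Q-learning on a toy five-state chain standing in for a "simulation." Everything here--the environment, the reward of 1 for reaching the last cell, and all hyperparameters--is invented for illustration and has no connection to the systems discussed at the conference.

```python
import random

# Hypothetical toy environment: a chain of 5 cells. The agent starts
# at cell 0 and earns a reward of 1.0 for reaching the last cell.
N_STATES = 5
ACTIONS = [-1, +1]          # move left or right along the chain
GOAL = N_STATES - 1

def step(state, action):
    """Apply an action; clamp to the chain and reward the goal."""
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def q_learning(episodes=500, alpha=0.1, gamma=0.9, eps=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda i: q[state][i])
            nxt, r, done = step(state, ACTIONS[a])
            # Q-learning update: bootstrap from the best next action
            q[state][a] += alpha * (r + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q

q = q_learning()
# Greedy policy: the preferred action index in each non-goal state.
policy = [max((0, 1), key=lambda i: q[s][i]) for s in range(N_STATES - 1)]
```

After training, the greedy policy moves right (action index 1) from every non-goal state--the point being that the agent discovers this purely from simulated interaction, with no labeled data.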
… incorrect but also unethical decisions. Reinforcement learning (Sutton and Barto 1998) is designed to tackle intricate real-world problems in rather short time (Strehl et al. 2006; Brafman and Tennenholtz 2002) with a performance bound (Strehl, Li, and Littman 2009); however, it relies heavily on the quality of the reward functions provided as the inputs. One idea to achieve such a goal is to collect enough ethical behavior data of humans acting toward the given goal, and then apply the inverse reinforcement learning (IRL) technique (Amin and Singh 2016; Evans, Stuhlmüller, and Goodman 2016; Ng, Russell, and others 2000; Sezener 2015) to learn an ethical agent that …
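The core IRL idea in the passage--inferring a reward function from demonstrations rather than hand-specifying it--can be illustrated with a deliberately naive sketch: score each state by how often expert trajectories visit it and treat that frequency as a surrogate reward. The cited IRL methods solve an optimization problem rather than counting visits; the demonstrations and state space below are entirely hypothetical.

```python
from collections import Counter

# Hypothetical expert demonstrations over a 4-state toy task: each
# trajectory is the sequence of states a demonstrator visited.
demos = [
    [0, 1, 2, 3],
    [0, 1, 3],
    [0, 2, 3],
]

def infer_reward(trajectories, n_states):
    """Naive reward inference: reward each state in proportion to how
    often demonstrators visit it. This is only the underlying intuition;
    real IRL algorithms solve an optimization over reward functions."""
    counts = Counter(s for traj in trajectories for s in traj)
    total = sum(counts.values())
    return [counts[s] / total for s in range(n_states)]

reward = infer_reward(demos, n_states=4)
```

The inferred reward could then be handed to an ordinary reinforcement learner, so the agent optimizes behavior resembling the demonstrations instead of a hand-written reward.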