Oceania
RUDDER: Return Decomposition for Delayed Rewards
Arjona-Medina, Jose A., Gillhofer, Michael, Widrich, Michael, Unterthiner, Thomas, Hochreiter, Sepp
We propose a novel reinforcement learning approach for finite Markov decision processes (MDPs) with delayed rewards. In this work, biases of temporal difference (TD) estimates are proved to be corrected only exponentially slowly in the number of delay steps. Furthermore, variances of Monte Carlo (MC) estimates are proved to increase the variance of other estimates, the number of which can grow exponentially in the number of delay steps. We introduce RUDDER, a return decomposition method, which creates a new MDP with the same optimal policies as the original MDP but with redistributed rewards that have largely reduced delays. If the return decomposition is optimal, then the new MDP does not have delayed rewards and TD estimates are unbiased. In this case, the rewards track Q-values so that the future expected reward is always zero. We experimentally confirm our theoretical results on bias and variance of TD and MC estimates. On artificial tasks with different lengths of reward delays, we show that RUDDER is exponentially faster than TD, MC, and MC Tree Search (MCTS). RUDDER outperforms Rainbow, A3C, DDQN, Distributional DQN, Dueling DDQN, Noisy DQN, and Prioritized DDQN on the delayed reward Atari game Venture in only a fraction of the learning time. RUDDER considerably improves the state-of-the-art on the delayed reward Atari game Bowling in much less learning time. Source code is available at https://github.com/ml-jku/baselines-rudder, with demonstration videos at https://goo.gl/EQerZV.
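As a rough illustration of the return decomposition idea (not the authors' implementation, which lives in the linked repository), the sketch below redistributes a delayed episode return using differences of consecutive return predictions. The function name `redistribute_rewards`, the final-step correction, and the example predictions are assumptions made for illustration; in practice the predictions would come from a learned return predictor.

```python
def redistribute_rewards(predicted_returns, actual_return):
    """Turn per-step return predictions into per-step rewards.

    predicted_returns[t] is a (hypothetical) model's estimate of the
    final episode return after seeing the episode up to step t.
    The redistributed reward at step t is the change in that estimate,
    so reward appears at the step where the outcome becomes predictable.
    """
    if not predicted_returns:
        raise ValueError("need at least one prediction")
    rewards = []
    previous = 0.0
    for estimate in predicted_returns:
        rewards.append(estimate - previous)
        previous = estimate
    # Assumption for this sketch: put any remaining prediction error on
    # the last step so the redistributed rewards sum to the true return.
    rewards[-1] += actual_return - previous
    return rewards


# The return of 1.0 becomes predictable at step 2, long before the episode ends.
example = redistribute_rewards([0.0, 0.0, 1.0, 1.0], actual_return=1.0)
```

Because the rewards sum to the original return, the total an agent collects in an episode is unchanged; only the timing of the reward moves earlier.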
Towards a Grounded Dialog Model for Explainable Artificial Intelligence
Madumal, Prashan, Miller, Tim, Vetere, Frank, Sonenberg, Liz
To generate trust with their users, Explainable Artificial Intelligence (XAI) systems need to include an explanation model that can communicate the internal decisions, behaviours and actions to the interacting humans. Successful explanation involves both cognitive and social processes. In this paper we focus on the challenge of meaningful interaction between an explainer and an explainee and investigate the structural aspects of an explanation in order to propose a human explanation dialog model. We follow a bottom-up approach to derive the model by analysing transcripts of 398 different explanation dialog types. We use grounded theory to code and identify the key components of an explanation dialog. We carry out further analysis to identify the relationships between components, and the sequences and cycles that occur in a dialog. We present a generalized state model obtained by the analysis and compare it with an existing conceptual dialog model of explanation.
A Scalable Framework for Trajectory Prediction
Rathore, Punit, Kumar, Dheeraj, Rajasegarar, Sutharshan, Palaniswami, Marimuthu, Bezdek, James C.
Trajectory prediction (TP) is of great importance for a wide range of location-based applications in intelligent transport systems such as location-based advertising, route planning, traffic management, and early warning systems. In the last few years, the widespread use of GPS navigation systems and vehicles enabled with wireless communication technology has resulted in huge volumes of trajectory data. The task of utilizing this data employing spatio-temporal techniques for trajectory prediction in an efficient and accurate manner is an ongoing research problem. Existing TP approaches are limited to short-term predictions. Moreover, they cannot handle a large volume of trajectory data for long-term prediction. To address these limitations, we propose a scalable clustering and Markov chain based hybrid framework, called Traj-clusiVAT-based TP, for both short-term and long-term trajectory prediction, which can handle a large number of overlapping trajectories in a dense road network. In addition, Traj-clusiVAT can also determine the number of clusters, which represent different movement behaviours in input trajectory data. In our experiments, we compare our proposed approach with a mixed Markov model (MMM)-based scheme and a trajectory clustering, NETSCAN-based TP method for both short- and long-term trajectory predictions. We performed our experiments on two real vehicle trajectory datasets, including a large-scale trajectory dataset consisting of 3.28 million trajectories obtained from 15,061 taxis in Singapore over a period of one month. Experimental results on both datasets show that our proposed approach outperforms the existing approaches in terms of both short- and long-term prediction performance, based on prediction accuracy and distance error (in km).
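To make the Markov chain component more concrete, here is a minimal sketch of first-order Markov prediction over road segments. It is a generic illustration, not the Traj-clusiVAT method: the clustering step is omitted entirely, and the function names `fit_transitions` and `predict_path` as well as the toy segment labels are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(trajectories):
    """Count how often each road segment is followed by each other segment."""
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for current, following in zip(trajectory, trajectory[1:]):
            counts[current][following] += 1
    return counts


def predict_path(counts, start, steps):
    """Greedily follow the most frequent transition for up to `steps` hops.

    steps=1 corresponds to a short-term prediction; larger values chain
    the same rule for a longer horizon. Stops early at unseen segments.
    """
    path = []
    current = start
    for _ in range(steps):
        if current not in counts:
            break
        current = counts[current].most_common(1)[0][0]
        path.append(current)
    return path


# Toy trajectories expressed as sequences of road-segment labels.
history = [["a", "b", "c"], ["a", "b", "d"], ["a", "b", "c"]]
model = fit_transitions(history)
next_two = predict_path(model, "a", steps=2)
```

In a hybrid scheme like the one described, a separate transition table could be fitted per trajectory cluster, so that prediction uses only the movement behaviour the current vehicle most resembles.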
AI Weekly: Google's research center in Ghana won't be the last AI lab in Africa
This year, we have seen an acceleration of Silicon Valley tech giants opening AI research labs around the world as they seek to gain traction among researchers and fulfill their global ambitions. In the past six months or so, Google brought labs to China and France, Facebook opened labs in Pittsburgh and Seattle, and Microsoft announced plans to open labs near universities in Berkeley, California and Melbourne, Australia. This trend shows no signs of slowing down. Last month, Samsung announced labs in Cambridge, Moscow, and Toronto. This week, Nvidia announced plans to open a new lab in Toronto, while Google shared plans to open a lab in Accra, Ghana, Google's first in Africa and perhaps the first of any tech giant in Africa.
Happn is adding a 'creepy' map that reveals your recent movements
It may sound like a stalker's dream come true, but dating app Happn is adding a new 'creepy' feature that will let potential love matches revisit your past movements. Starting next month, if you remember crossing paths with someone who took your fancy, you will be able to retrace your steps to try and find them again. If they are also a user of the popular dating app, which matches people through their device's geolocation, their profile will appear on the new map tool at that spot. Budding romantics may find the feature appealing, giving them the chance to tap locations they've visited over the past week to track down lost connections. However, some may find the idea of strangers tracing their movements more than a little creepy.
Now the Computer Can Argue With You
"Fighting technology means fighting human ingenuity," an IBM software program admonished Israeli debating champion Dan Zafrir in San Francisco Monday. The program, dubbed Project Debater, and Zafrir, were debating the value of telemedicine, but the point could also apply to the future of the technology itself. Software that processes speech and language has improved enough to do more than tell you the weather forecast. You may not be ready for machines capable of conversation or arguing, but tech companies are working to find uses for them. IBM's demo of Project Debater comes a month after Google released audio of a bot called Duplex booking restaurants and haircuts over the phone.
Improving Customer Empathy With Machine Learning
In a February 2018 interview, Liz Goli, Commissioner of Queensland's Office of State Revenue (OSR), sat back in her chair: "The machine can actually improve our empathy with our customers," she reflected. Now that's interesting – the idea that an unfeeling machine could help human beings be more empathetic towards other human beings! Late last year, OSR implemented a successful machine learning prototype, and it's moving forward with a production pilot of this emerging technology. "We don't want a system where the machine is making decisions. But we do want the machine to offer up next best-action recommendations to our staff that they have the option to follow – or not – based on their experience and knowledge of how the legislation should be applied… We'd also like a system that can ingest Big Data and take action within certain parameters. For example, in case of a natural disaster, the machine might be able to find out which customers are impacted and replace debt-collection notices with proactive letters giving additional time to pay."
Fast, Robust, and Versatile Event Detection through HMM Belief State Gradient Measures
Luo, Shuangqi, Wu, Hongmin, Lin, Hongbin, Duan, Shuangda, Guan, Yisheng, Rojas, Juan
Event detection is a critical feature in data-driven systems as it assists with the identification of nominal and anomalous behavior. Event detection is increasingly relevant in robotics as robots operate with greater autonomy in increasingly unstructured environments. In this work, we present an accurate, robust, fast, and versatile measure for skill and anomaly identification. A theoretical proof establishes the link between the derivative of the log-likelihood of the HMM filtered belief state and the latest emission probabilities. The key insight is the inverse relationship in which gradient analysis is used for skill and anomaly identification. Our measure showed better performance across all metrics than related state-of-the-art works. The result is broadly applicable to domains that use HMMs for event detection.
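The link between the log-likelihood gradient and the latest emission probabilities can be seen in a standard HMM forward filter: the step-to-step change in cumulative log-likelihood equals the log of the emission probabilities weighted by the predicted belief state. The sketch below shows this for a discrete HMM. It is a textbook filter rather than the authors' measure, and the function name `loglik_increments` and the toy parameters are assumptions for illustration.

```python
import math


def loglik_increments(initial, transition, emission, observations):
    """Return log p(o_t | o_1..o_{t-1}) for each step of a discrete HMM.

    initial[i]       : prior probability of state i
    transition[i][j] : probability of moving from state i to state j
    emission[j][o]   : probability of observing symbol o in state j
    Each increment is the discrete-time derivative of the cumulative
    log-likelihood. Assumes every observation has non-zero probability.
    """
    n = len(initial)
    belief = list(initial)
    increments = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict: push the filtered belief through the transition model.
            belief = [
                sum(belief[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        # Update: weight each state by how well it explains the observation.
        joint = [belief[j] * emission[j][obs] for j in range(n)]
        evidence = sum(joint)
        increments.append(math.log(evidence))
        belief = [value / evidence for value in joint]
    return increments


# Toy model: the system stays in state 0, which mostly emits symbol 0.
steps = loglik_increments(
    initial=[1.0, 0.0],
    transition=[[1.0, 0.0], [0.0, 1.0]],
    emission=[[0.9, 0.1], [0.1, 0.9]],
    observations=[0, 0, 1],
)
```

In this toy run the increment drops sharply at the third step, where the observation is unlikely under the current belief; a threshold on such drops is one simple way an anomaly could be flagged.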
PaMpeR: Proof Method Recommendation System for Isabelle/HOL
Deciding which sub-tool to use for a given proof state requires expertise specific to each interactive theorem prover (ITP). To mitigate this problem, we present PaMpeR, a Proof Method Recommendation system for Isabelle/HOL. Given a proof state, PaMpeR recommends proof methods to discharge the proof goal and provides qualitative explanations as to why it suggests these methods. PaMpeR generates these recommendations based on existing hand-written proof corpora, thus transferring experienced users' expertise to new users. Our evaluation shows that PaMpeR correctly predicts experienced users' proof method invocations, especially when it comes to special-purpose proof methods.
Neural Code Comprehension: A Learnable Representation of Code Semantics
Ben-Nun, Tal, Jakobovits, Alice Shoshana, Hoefler, Torsten
With the recent success of embeddings in natural language processing, research has been conducted into applying similar methods to code analysis. Most works attempt to process the code directly or use a syntactic tree representation, treating it like sentences written in a natural language. However, none of the existing methods are sufficient to comprehend program semantics robustly, due to structural features such as function calls, branching, and interchangeable order of statements. In this paper, we propose a novel processing technique to learn code semantics, and apply it to a variety of program analysis tasks. In particular, we stipulate that a robust distributional hypothesis of code applies to both human- and machine-generated programs. Following this hypothesis, we define an embedding space, inst2vec, based on an Intermediate Representation (IR) of the code that is independent of the source programming language. We provide a novel definition of contextual flow for this IR, leveraging both the underlying data- and control-flow of the program. We then analyze the embeddings qualitatively using analogies and clustering, and evaluate the learned representation on three different high-level tasks. We show that with a single RNN architecture and pre-trained fixed embeddings, inst2vec outperforms specialized approaches for performance prediction (compute device mapping, optimal thread coarsening) and for algorithm classification from raw code (104 classes), where we set a new state-of-the-art.