Continual Reinforcement Learning in 3D Non-stationary Environments (Machine Learning)

High-dimensional, always-changing environments constitute a hard challenge for current reinforcement learning techniques. Artificial agents, nowadays, are often trained offline in very static and controlled simulated conditions, such that training observations can be thought of as sampled i.i.d. from the entire observation space. However, in real-world settings, the environment is often non-stationary and subject to unpredictable, frequent changes. In this paper we propose and openly release CRLMaze, a new benchmark for continual reinforcement learning in a complex 3D non-stationary task based on ViZDoom and subject to several environmental changes. We then introduce an end-to-end, model-free continual reinforcement learning strategy that shows competitive results against four different baselines without requiring access to additional supervised signals, previously encountered environmental conditions, or past observations.
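The kind of non-stationarity the benchmark describes can be illustrated with a minimal environment wrapper that perturbs a parameter every few episodes. This is a hedged sketch: the class, the `variant` parameter, and the `DummyEnv` idea are invented for illustration and are not the CRLMaze or ViZDoom API.

```python
import random

class NonStationaryWrapper:
    """Sketch of a wrapper that injects environmental changes.

    Every `change_every` episodes a hypothetical `variant` parameter
    (standing in for texture, layout, or dynamics shifts) is resampled,
    so the observation distribution the agent sees drifts over time.
    """

    def __init__(self, env, change_every=10, num_variants=4, seed=0):
        self.env = env
        self.change_every = change_every
        self.num_variants = num_variants
        self.rng = random.Random(seed)
        self.episode = 0
        self.variant = 0

    def reset(self):
        # Change the environment on an episode schedule, not per step.
        if self.episode and self.episode % self.change_every == 0:
            self.variant = self.rng.randrange(self.num_variants)
        self.episode += 1
        return self.env.reset(variant=self.variant)
```

An agent trained against such a wrapper cannot rely on i.i.d. observations, which is exactly what makes the continual setting hard.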

Q&A: Travel startup paves way for industry consolidation (Includes interview)


In May 2019, Google announced the consolidation of all its travel features: Google Maps, Trips, Hotels and Flights will combine into a single Google Travel product, easing the process of vacation planning. Travel startup VacationRenter, which launched last year, pioneered this model for vacation rentals with an artificial-intelligence-driven platform. According to VacationRenter's newly appointed COO, ex-Googler Marco del Rosario, both Google Travel and VacationRenter are early adopters of a pivotal strategy for today's travel technology: consolidation. Digital Journal: How has the world of travel changed in recent years?

Complementary Learning for Overcoming Catastrophic Forgetting Using Experience Replay (Machine Learning)

Despite huge success, deep networks are unable to learn effectively in sequential multitask settings because they forget previously learned tasks after learning new ones. Inspired by complementary learning systems theory, we address this challenge by learning a generative model that couples the current task to past learned tasks through a discriminative embedding space. We learn an abstract-level generative distribution in the embedding that allows data points to be generated to represent past experience. We sample from this distribution and use experience replay to avoid forgetting, while simultaneously accumulating new knowledge into the abstract distribution to couple the current task with past experience. We demonstrate theoretically and empirically that our framework learns a distribution in the embedding that is shared across all tasks and, as a result, tackles catastrophic forgetting.
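The replay idea above can be sketched concretely. As a simplification, the snippet below summarizes each finished task's embeddings as a diagonal Gaussian (a stand-in for the paper's learned generative distribution) and then samples pseudo-embeddings from every past task to mix into the current batch. The class and method names are illustrative assumptions, not the paper's code.

```python
import numpy as np

class EmbeddingReplay:
    """Pseudo-rehearsal via a simple generative model in embedding space."""

    def __init__(self):
        self.task_stats = []  # (mean, std) per consolidated task

    def consolidate(self, embeddings):
        # Summarize a finished task's embeddings as a diagonal Gaussian.
        e = np.asarray(embeddings, dtype=float)
        self.task_stats.append((e.mean(axis=0), e.std(axis=0) + 1e-6))

    def replay(self, n_per_task, rng=None):
        # Draw pseudo-embeddings representing all past tasks; these would
        # be interleaved with real current-task data during training.
        rng = rng or np.random.default_rng(0)
        samples = [rng.normal(m, s, size=(n_per_task, len(m)))
                   for m, s in self.task_stats]
        return np.concatenate(samples) if samples else np.empty((0, 0))
```

Because the replayed points come from a compact distribution rather than a stored buffer, memory cost stays constant in the number of tasks.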

Attention-Based Structural-Plasticity (Machine Learning)

Catastrophic forgetting/interference is a critical problem for lifelong learning machines: it prevents agents from maintaining previously learned knowledge while learning new tasks. Neural networks, in particular, suffer severely from catastrophic forgetting. Recently there have been several efforts toward overcoming catastrophic forgetting in neural networks. Here, we propose a biologically inspired method for overcoming it. Specifically, we define an attention-based selective plasticity of synapses, modeled on the cholinergic neuromodulatory system in the brain. We define synaptic importance parameters in addition to synaptic weights and use Hebbian learning in parallel with the backpropagation algorithm to learn synaptic importances in an online and seamless manner. We test the proposed method on benchmark tasks, including the Permuted MNIST and Split MNIST problems, and show competitive performance compared to state-of-the-art methods.
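The mechanism of importance parameters learned alongside weights can be sketched as follows. This is a deliberately simplified stand-in, not the paper's exact rule: importance is a running Hebbian co-activity trace, and the gradient step on each synapse is attenuated in proportion to its importance, so heavily used synapses stay plastic-resistant.

```python
import numpy as np

class SelectivePlasticity:
    """Toy layer with per-synapse importance gating its plasticity."""

    def __init__(self, n_in, n_out, lr=0.1, importance_decay=0.9):
        self.W = np.zeros((n_in, n_out))
        self.importance = np.zeros((n_in, n_out))
        self.lr = lr
        self.decay = importance_decay

    def hebbian_update(self, pre, post):
        # Running average of |pre * post| as an online importance proxy.
        co_activity = np.abs(np.outer(pre, post))
        self.importance = (self.decay * self.importance
                           + (1 - self.decay) * co_activity)

    def apply_gradient(self, grad):
        # Important synapses change less: selective plasticity.
        plasticity = 1.0 / (1.0 + self.importance)
        self.W -= self.lr * plasticity * grad
```

In a real network the Hebbian trace would run in parallel with backpropagation, as the abstract describes, rather than replace it.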

Generative Memory for Lifelong Reinforcement Learning (Artificial Intelligence)

Our research focuses on understanding biological memory transfer and applying it to new AI systems that can fundamentally improve their performance throughout their fielded lifetime. We leverage the current understanding of biological memory transfer to arrive at AI algorithms for memory consolidation and replay. In this paper, we propose a generative memory that can be recalled in batch samples to train a multi-task agent in a pseudo-rehearsal manner. We show results motivating the need for a task-agnostic separation of the latent space in the generative memory to address catastrophic forgetting in lifelong learning.
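The idea of a task-separated latent space for batch recall can be sketched minimally. In this illustration each task is anchored at a distinct offset in the latent prior so that batch recall covers every past task; the class, the offset scheme, and the absence of a real decoder are all simplifying assumptions, not the paper's architecture.

```python
import numpy as np

class GenerativeMemory:
    """Toy generative memory with per-task latent separation."""

    def __init__(self, latent_dim=8, seed=0):
        self.latent_dim = latent_dim
        self.rng = np.random.default_rng(seed)
        self.task_means = []  # one latent anchor per registered task

    def register_task(self):
        # Keep tasks separated by giving each a distinct latent offset.
        offset = np.zeros(self.latent_dim)
        offset[len(self.task_means) % self.latent_dim] = 3.0
        self.task_means.append(offset)

    def recall_batch(self, per_task=4):
        # Recall a batch spanning every past task for pseudo-rehearsal;
        # a trained decoder would map these latents to observations.
        z = [self.rng.normal(m, 1.0, size=(per_task, self.latent_dim))
             for m in self.task_means]
        return np.concatenate(z) if z else np.empty((0, self.latent_dim))
```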

Policy Consolidation for Continual Reinforcement Learning (Machine Learning)

We propose a method for tackling catastrophic forgetting in deep reinforcement learning that is agnostic to the timescale of changes in the distribution of experiences, does not require knowledge of task boundaries, and can adapt in continuously changing environments. In our policy consolidation model, the policy network interacts with a cascade of hidden networks that simultaneously remember the agent's policy at a range of timescales and regularise the current policy by its own history, thereby improving its ability to learn without forgetting. We find that the model improves continual learning relative to baselines on a number of continuous control tasks in single-task, alternating two-task, and multi-agent competitive self-play settings.
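The cascade-of-timescales idea can be sketched with categorical policies. In this simplified stand-in (not the paper's actual flow between networks), each deeper policy is an exponential moving average of the one above it, so deeper levels remember slower timescales, and a KL penalty pulls the active policy toward its own history; the coefficients are illustrative.

```python
import numpy as np

def kl_categorical(p, q, eps=1e-8):
    """KL divergence between two categorical distributions."""
    p, q = np.asarray(p) + eps, np.asarray(q) + eps
    return float(np.sum(p * np.log(p / q)))

class PolicyCascade:
    """policies[0] is active; deeper entries are slower-moving copies."""

    def __init__(self, n_actions, depth=3, ema=0.5):
        self.policies = [np.full(n_actions, 1.0 / n_actions)
                         for _ in range(depth)]
        self.ema = ema

    def consolidate(self):
        # Each hidden policy drifts toward the policy one level shallower,
        # giving the cascade progressively slower timescales.
        for i in range(1, len(self.policies)):
            self.policies[i] = (self.ema * self.policies[i]
                                + (1 - self.ema) * self.policies[i - 1])

    def regularizer(self, beta=0.1):
        # Penalty keeping the active policy close to its own history.
        return beta * sum(kl_categorical(self.policies[0], p)
                          for p in self.policies[1:])
```

Because the regularizer targets the agent's own past policies rather than task identities, no task boundaries are needed, matching the abstract's claim.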

CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison (Artificial Intelligence)

Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing the uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare its performance to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model's ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark for evaluating chest radiograph interpretation models. The dataset is freely available at .
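The "different approaches to using the uncertainty labels" can be made concrete with a small sketch. Using the common shorthand where an uncertain label is encoded as -1, "U-Ignore" masks uncertain labels out of the loss while "U-Ones"/"U-Zeros" map them to positive/negative. The function names and the -1 encoding are illustrative conventions, not an official CheXpert API.

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability and label."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def masked_loss(probs, labels, policy="U-Ignore"):
    """Average BCE under a chosen uncertainty-label policy (-1 = uncertain)."""
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if y == -1:
            if policy == "U-Ignore":
                continue                      # drop uncertain labels
            y = 1 if policy == "U-Ones" else 0  # U-Ones / U-Zeros
        total += bce(p, y)
        count += 1
    return total / max(count, 1)
```

As the abstract notes, which policy works best varies by pathology, so these variants are typically compared per observation on a held-out set.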

Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation (Neural Information Processing Systems)

Generating long and coherent reports to describe medical images poses the challenge of bridging visual patterns with informative human linguistic descriptions. We propose a novel Hybrid Retrieval-Generation Reinforced Agent (HRGR-Agent) which reconciles traditional retrieval-based approaches, populated with human prior knowledge, with modern learning-based approaches to achieve structured, robust, and diverse report generation. HRGR-Agent employs a hierarchical decision-making procedure: for each sentence, a high-level retrieval policy module chooses either to retrieve a template sentence from an off-the-shelf template database or to invoke a low-level generation module that generates a new sentence. HRGR-Agent is updated via reinforcement learning, guided by sentence-level and word-level rewards. Experiments show that our approach achieves state-of-the-art results on two medical report datasets, generating well-balanced structured sentences with robust coverage of heterogeneous medical report contents. In addition, our model achieves the highest detection precision for medical abnormality terminology and improved human-evaluation performance.
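The high-level retrieve-or-generate decision can be illustrated with a toy sketch. The scoring function, threshold, template sentences, and placeholder generator below are all invented for illustration; in HRGR-Agent these are learned modules trained with reinforcement learning, not hand-set rules.

```python
TEMPLATES = [
    "The lungs are clear.",
    "No pleural effusion is seen.",
]

def retrieval_policy(context_score, threshold=0.5):
    """Return ('retrieve', template_index) or ('generate', None)."""
    if context_score >= threshold:
        # High confidence that a stock template fits this sentence slot.
        return "retrieve", int(context_score * len(TEMPLATES)) % len(TEMPLATES)
    return "generate", None

def write_report(scores, generator=lambda i: f"<generated sentence {i}>"):
    """Compose a report sentence by sentence, mixing templates and generation."""
    sentences = []
    for i, s in enumerate(scores):
        action, idx = retrieval_policy(s)
        sentences.append(TEMPLATES[idx] if action == "retrieve" else generator(i))
    return " ".join(sentences)
```

The appeal of this hybrid design is that routine findings reuse vetted template language while unusual findings still get freely generated text.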

'AI, ML, cobots can help firms streamline their supply chains'


New Delhi: Vivekanand is the country manager for India and the South Asian Association for Regional Cooperation (Saarc) region at Singapore-based GreyOrange, a company that designs, manufactures and deploys advanced robotics systems for warehouse automation. GreyOrange's marquee solutions are the Butler (a goods-to-person system for inventory storage, picking and consolidation) and the Linear Sorter (a modular system for automating sorting processes in fulfilment centres). In an interview last week, Vivekanand spoke about how robotics can help companies improve their supply chains, among other things. How is robotics transforming warehouse processes and efficiency? With the ongoing e-commerce boom, customer expectations are at an all-time high, and supply chains need to be faster and more efficient.

What to Expect from IoT Platforms in 2018


The evolution of computing power and cost efficiency has made commercial devices capable of running full operating systems and complex algorithms right in the office. IoT platforms in 2018 continue to push for the fastest possible connectivity. That is where the concept of edge computing comes in: workloads are processed at the edge of the network, where IoT devices connect the cloud with the physical world. A key part of this progression is fast and effective integration between IoT and the cloud, locating many processes onboard the devices themselves and connecting to the cloud only for the most essential functions. As machine learning algorithms evolve and advance, there are a few things we can expect.