AITopics

1905.02662

Genre: Research Report (0.82)

Industry:

Health & Medicine > Consumer Health (1.00)
Transportation > Ground > Road (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Chaudhry, Arslan, Rohrbach, Marcus, Elhoseiny, Mohamed, Ajanthan, Thalaiyasingam, Dokania, Puneet K., Torr, Philip H. S., Ranzato, Marc'Aurelio

Continual Learning with Tiny Episodic Memories

arXiv.org Machine Learning, Mar-20-2019

Learning with less supervision is a major challenge in artificial intelligence. One sensible approach to decrease the amount of supervision is to leverage prior experience and transfer knowledge from tasks seen in the past. However, a necessary condition for a successful transfer is the ability to remember how to perform previous tasks. The Continual Learning (CL) setting, whereby an agent learns from a stream of tasks without seeing any example twice, is an ideal framework to investigate how to accrue such knowledge. In this work, we consider supervised learning tasks and methods that leverage a very small episodic memory for continual learning. Through an extensive empirical analysis across four benchmark datasets adapted to CL, we observe that a very simple baseline, which jointly trains on examples from the current task and examples stored in the memory, outperforms state-of-the-art CL approaches with and without episodic memory. Surprisingly, repeated learning over tiny episodic memories does not harm generalization on past tasks, as joint training on data from subsequent tasks acts like a data-dependent regularizer. We discuss and evaluate different approaches to write into the memory. Most notably, reservoir sampling works remarkably well across the board, except when the memory size is extremely small. In this case, writing strategies that guarantee an equal representation of all classes work better. Overall, these methods should be considered as a strong baseline candidate when benchmarking new CL approaches.
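The reservoir-sampling write strategy mentioned in the abstract can be sketched as follows. This is a minimal illustration of standard reservoir sampling (Algorithm R) over a stream of examples, not the authors' implementation; the class name `ReservoirMemory` and its methods `write` and `sample` are hypothetical names chosen for this example.

```python
import random
from typing import Any, List, Optional


class ReservoirMemory:
    """Fixed-size episodic memory filled by reservoir sampling (hypothetical helper)."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items: List[Any] = []
        self.num_seen = 0  # number of stream examples observed so far
        self._rng = random.Random(seed)

    def write(self, example: Any) -> None:
        """Offer one stream example to the memory."""
        self.num_seen += 1
        if len(self.items) < self.capacity:
            # Memory not yet full: always store the example.
            self.items.append(example)
            return
        # Memory full: keep the new example with probability capacity / num_seen,
        # overwriting a uniformly chosen slot.
        slot = self._rng.randrange(self.num_seen)
        if slot < self.capacity:
            self.items[slot] = example

    def sample(self, batch_size: int) -> List[Any]:
        """Draw a replay batch without replacement (at most the memory size)."""
        k = min(batch_size, len(self.items))
        return self._rng.sample(self.items, k)


if __name__ == "__main__":
    memory = ReservoirMemory(capacity=5, seed=0)
    for example_id in range(1000):
        memory.write(example_id)
    print(memory.items)
```

Under this scheme every example seen so far has the same probability of being in the memory, regardless of which task it came from. That also suggests why very small memories can be a problem: nothing guarantees that every class is represented.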

artificial intelligence, episodic memory, machine learning

1902.10486

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Consumer Health (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

AI Classics, Feb-14-2019, 17:54:29 GMT

Readings in Medical Artificial Intelligence: The First Decade

William J. Clancey

A survey of early work exploring how AI can be used in medicine, with somewhat more technical expositions than in the complementary volume Artificial Intelligence in Medicine. "Each chapter is preceded by a brief introduction that outlines our view of its contribution to the field, the reason it was selected for inclusion in this volume, an overview of its content, and a discussion of how the work evolved after the article appeared and how it relates to other chapters in the book."

diagnostic medicine, university of pittsburgh, university of wisconsin

Country:

North America > United States > California (1.00)
North America > Canada (0.92)
Europe > United Kingdom (0.67)

Genre:

Overview (1.20)
Research Report > Experimental Study (1.00)
Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Internal Medicine (1.02)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.01)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.01)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.06)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.03)
Information Technology > Knowledge Management > Knowledge Engineering (1.02)

arXiv.org Machine Learning, Dec-11-2018

Learning What to Remember: Long-term Episodic Memory Networks for Learning from Streaming Data

Jung, Hyunwoo, Han, Moonsu, Kang, Minki, Hwang, Sungju

The current generation of memory-augmented neural networks has limited scalability because such networks cannot efficiently process data that are too large to fit in the external memory storage. One example is the lifelong learning scenario, where the model receives as input a data stream of unlimited length, the vast majority of which consists of uninformative entries. We tackle this problem by proposing a memory network suited to the long-term lifelong learning scenario, which we refer to as Long-term Episodic Memory Networks (LEMN). It features an RNN-based retention agent that learns to replace less important memory entries based on a retention probability generated for each entry; this probability is learned so as to identify data instances of generic importance, taking into account each entry's importance relative to other memory entries as well as its historical importance. Learning the retention agent in this way allows our long-term episodic memory network to retain memory entries of generic importance for a given task. We validate our model on a path-finding task as well as synthetic and real question answering tasks, on which our model achieves significant improvements over memory-augmented networks with rule-based memory scheduling as well as an RL-based baseline that does not consider the relative or historical importance of the memory.

machine learning, memory cell, natural language

1812.04227

Country:

Asia (0.28)
North America > United States > New York (0.14)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Consumer Health (0.81)
Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.81)

Vayer, Titouan, Chapel, Laetitia, Flamary, Rémi, Tavenard, Romain, Courty, Nicolas

Fused Gromov-Wasserstein distance for structured objects: theoretical foundations and mathematical properties

arXiv.org Machine Learning, Nov-7-2018

Optimal transport theory has recently found many applications in machine learning thanks to its capacity for comparing various machine learning objects considered as distributions. The Kantorovitch formulation, leading to the Wasserstein distance, focuses on the features of the elements of the objects but treats them independently, whereas the Gromov-Wasserstein distance focuses only on the relations between the elements, depicting the structure of the object, yet discarding its features. In this paper we propose to extend these distances in order to encode simultaneously both the feature and structure information, resulting in the Fused Gromov-Wasserstein distance. We develop the mathematical framework for this novel distance, prove its metric and interpolation properties and provide a concentration result for the convergence of finite samples. We also illustrate and interpret its use in various contexts where structured objects are involved.
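For finite structured objects, one common way to write a fused objective of the kind the abstract describes is sketched below. The notation is an assumption made for illustration and does not come from this listing: a_i and b_j are node features compared by a feature distance d, C_1 and C_2 are the within-object structure matrices, \pi is a coupling between the node weight vectors h and g, q >= 1 is an exponent, and \alpha in [0, 1] is a trade-off parameter.

```latex
\mathrm{FGW}_{q,\alpha}(\mu,\nu)
  = \min_{\pi \in \Pi(h,g)}
    \sum_{i,j,k,l}
    \Big[ (1-\alpha)\, d(a_i, b_j)^q
          + \alpha\, \big| C_1(i,k) - C_2(j,l) \big|^q \Big]
    \, \pi_{i,j}\, \pi_{k,l}
```

Under this sketch, \alpha = 0 leaves only the feature term (a Wasserstein-type cost, since the entries of \pi sum to one) and \alpha = 1 leaves only the structure term (a Gromov-Wasserstein-type cost), which is presumably the sense of the interpolation property mentioned above.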

artificial intelligence, gromov-wasserstein distance, machine learning

1811.02834

Country: Europe (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.66)

Liu, Zhengzhong, Xiong, Chenyan, Mitamura, Teruko, Hovy, Eduard

Automatic Event Salience Identification

arXiv.org Artificial Intelligence, Sep-3-2018

Identifying the salience (i.e., importance) of discourse units is an important task in language understanding. While events play important roles in text documents, little research exists on analyzing their salience. This paper empirically studies the Event Salience task and proposes two salience detection models based on content similarities and discourse relations. The first is a feature-based salience model that incorporates similarities among discourse units. The second is a neural model that captures more complex relations between discourse units. Tested on our new large-scale event salience corpus, both methods significantly outperform the strong frequency baseline, while our neural model further improves on the feature-based one by a large margin. Our analyses demonstrate that our neural model captures interesting connections between salience and discourse unit relations (e.g., scripts and frame structures).

machine learning, natural language, relation

1809.00647

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.48)

Ma, Yunpu, Tresp, Volker, Daxberger, Erik

Embedding Models for Episodic Memory

arXiv.org Artificial Intelligence, Jun-30-2018

In recent years a number of large-scale triple-oriented knowledge graphs have been generated and various models have been proposed to perform learning in those graphs. Most knowledge graphs are static and reflect the world in its current state. In reality, of course, the state of the world is changing: a healthy person becomes diagnosed with a disease and a new president is inaugurated. In this paper, we extend models for static knowledge graphs to temporal knowledge graphs. This enables us to store episodic data and to generalize to new facts (inductive learning). We generalize leading learning models for static knowledge graphs (i.e., Tucker, RESCAL, HolE, ComplEx, DistMult) to temporal knowledge graphs. In particular, we introduce a new tensor model, ConT, with superior generalization performance. The performance of all proposed models is analyzed on two different datasets: the Global Database of Events, Language, and Tone (GDELT) and the database for Integrated Conflict Early Warning System (ICEWS). We argue that temporal knowledge graph embeddings might also serve as models for cognitive episodic memory (facts we remember and can recollect) and that a semantic memory (current facts we know) can be generated from episodic memory by a marginalization operation. We validate this episodic-to-semantic projection hypothesis with the ICEWS dataset.

artificial intelligence, knowledge graph, machine learning

1807.00228

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Middle East > Syria (0.04)
Asia > Middle East > Republic of Türkiye (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.84)

arXiv.org Artificial Intelligence, May-19-2018

Episodic Memory Deep Q-Networks

Lin, Zichuan, Zhao, Tianqi, Yang, Guangwen, Zhang, Lintao

Reinforcement learning (RL) algorithms have made huge progress in recent years by leveraging the power of deep neural networks (DNN). Despite the success, deep RL algorithms are known to be sample inefficient, often requiring many rounds of interaction with the environments to obtain satisfactory performance. Recently, episodic-memory-based RL has attracted attention due to its ability to latch onto good actions quickly. In this paper, we present a simple yet effective biologically inspired RL algorithm called Episodic Memory Deep Q-Networks (EMDQN), which leverages episodic memory to supervise an agent during training. Experiments show that our proposed method can lead to better sample efficiency and is more likely to find good policies. It requires only 1/5 of the interactions of DQN to achieve state-of-the-art performance on many Atari games, significantly outperforming regular DQN and other episodic-memory-based RL algorithms.

artificial intelligence, machine learning, reinforcement learning

1805.07603

Genre: Research Report (0.50)

Industry:

Health & Medicine > Consumer Health (1.00)
Leisure & Entertainment > Games > Computer Games (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Lopez-Paz, David, Ranzato, Marc'Aurelio

Gradient Episodic Memory for Continual Learning

Neural Information Processing Systems, Dec-31-2017

One major obstacle towards AI is the poor ability of models to solve new problems more quickly and without forgetting previously acquired knowledge. To better understand this issue, we study the problem of continual learning, where the model observes, once and one by one, examples concerning a sequence of tasks. First, we propose a set of metrics to evaluate models learning over a continuum of data. These metrics characterize models not only by their test accuracy, but also in terms of their ability to transfer knowledge across tasks. Second, we propose a model for continual learning, called Gradient Episodic Memory (GEM), that alleviates forgetting while allowing beneficial transfer of knowledge to previous tasks. Our experiments on variants of the MNIST and CIFAR-100 datasets demonstrate the strong performance of GEM when compared to the state-of-the-art.

artificial intelligence, learning, machine learning

Country: North America > United States > Texas (0.28)

Industry:

Education (0.94)
Health & Medicine > Consumer Health (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.72)

Nagy, David G., Orbán, Gergő

Episodic memory for continual model learning

arXiv.org Machine Learning, Dec-4-2017

Both the human brain and artificial learning agents operating in real-world or comparably complex environments are faced with the challenge of online model selection. In principle this challenge can be overcome: hierarchical Bayesian inference provides a principled method for model selection and it converges on the same posterior for both off-line (i.e., batch) and online learning. However, maintaining a parameter posterior for each model in parallel has in general an even higher memory cost than storing the entire data set and is consequently infeasible. Alternatively, maintaining only a limited set of models in memory could limit memory requirements. However, sufficient statistics for one model will usually be insufficient for fitting a different kind of model, meaning that the agent loses information with each model change. We propose that episodic memory can circumvent the challenge of limited memory-capacity online model selection by retaining a selected subset of data points. We design a method to compute the quantities necessary for model selection even when the data is discarded and only statistics of one (or few) learnt models are available. We demonstrate on a simple model that a limited-sized episodic memory buffer, when the content is optimised to retain data with statistics not matching the current representation, can resolve the fundamental challenge of online model selection.

artificial intelligence, learner, machine learning

1712.01169

Country:

Europe > Hungary (0.16)
Europe > Spain (0.14)

Genre: Research Report (0.51)

Industry: Health & Medicine > Consumer Health (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)