Mitchell, Tom
Conversational Neuro-Symbolic Commonsense Reasoning
Arabshahi, Forough, Lee, Jennifer, Gawarecki, Mikayla, Mazaitis, Kathryn, Azaria, Amos, Mitchell, Tom
One aspect of human commonsense reasoning is the ability to make presumptions about daily experiences, activities and social interactions with others. We propose a new commonsense reasoning benchmark where the task is to uncover commonsense presumptions implied by imprecisely stated natural language commands in the form of if-then-because statements. For example, in the command "If it snows at night then wake me up early because I don't want to be late for work" the speaker relies on commonsense reasoning of the listener to infer the implicit presumption that it must snow enough to cause traffic slowdowns. Such if-then-because commands are particularly important when users instruct conversational agents. We release a benchmark data set for this task, collected from humans and annotated with commonsense presumptions. We develop a neuro-symbolic theorem prover that extracts multi-hop reasoning chains and apply it to this problem. We further develop an interactive conversational framework that evokes commonsense knowledge from humans for completing reasoning chains.
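As a concrete rendering of the task's input format, the sketch below parses an if-then-because command into its three clauses and attaches an annotated presumption. The Command class, regex, and field names are illustrative assumptions, not the benchmark's actual schema.

    import re
    from dataclasses import dataclass, field

    @dataclass
    class Command:
        condition: str      # the "if" clause
        action: str         # the "then" clause
        justification: str  # the "because" clause
        presumptions: list = field(default_factory=list)  # annotated commonsense gaps

    def parse_if_then_because(text):
        """Split an if-then-because command into its three clauses."""
        m = re.match(r"if (.+?) then (.+?) because (.+)", text, re.IGNORECASE)
        if m is None:
            raise ValueError("not an if-then-because command")
        return Command(*m.groups())

    cmd = parse_if_then_because("If it snows at night then wake me up early "
                                "because I don't want to be late for work")
    # The implicit presumption from the example above, as an annotation:
    cmd.presumptions.append("it snows enough to cause traffic slowdowns")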
Jelly Bean World: A Testbed for Never-Ending Learning
Platanios, Emmanouil Antonios, Saparov, Abulhair, Mitchell, Tom
Machine learning has shown growing success in recent years. However, current machine learning systems are highly specialized, trained for particular problems or domains, and typically on a single narrow dataset. Human learning, on the other hand, is highly general and adaptable. Never-ending learning is a machine learning paradigm that aims to bridge this gap, with the goal of encouraging researchers to design machine learning systems that can learn to perform a wider variety of inter-related tasks in more complex environments. To date, there is no environment or testbed to facilitate the development and evaluation of never-ending learning systems. To this end, we propose the Jelly Bean World testbed. The Jelly Bean World allows experimentation over two-dimensional grid worlds which are filled with items and in which agents can navigate. This testbed provides environments that are sufficiently complex and where more generally intelligent algorithms ought to perform better than current state-of-the-art reinforcement learning approaches. It does so by producing non-stationary environments and facilitating experimentation with multi-task, multi-agent, multi-modal, and curriculum learning settings. We hope that this new freely-available software will prompt new research and interest in the development and evaluation of never-ending learning systems and more broadly, general intelligence systems.
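A minimal sketch of the kind of environment described: a toy two-dimensional grid world with collectible items, a toroidal map, and a simple source of non-stationarity. This is an invented stand-in for illustration, not the Jelly Bean World API.

    import random

    class GridWorld:
        """Toy 2-D grid with collectible items (illustrative, not the JBW API)."""
        def __init__(self, size=10, seed=0):
            self.size = size
            self.rng = random.Random(seed)
            self.items = {(self.rng.randrange(size), self.rng.randrange(size))
                          for _ in range(size)}
            self.agent = (0, 0)

        def step(self, move):
            dx, dy = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}[move]
            x, y = self.agent
            self.agent = ((x + dx) % self.size, (y + dy) % self.size)  # wrap around
            reward = 1.0 if self.agent in self.items else 0.0
            self.items.discard(self.agent)            # item is consumed on pickup
            if self.rng.random() < 0.1:               # non-stationarity: respawn items
                self.items.add((self.rng.randrange(self.size),
                                self.rng.randrange(self.size)))
            return self.agent, reward

    world = GridWorld()
    total = sum(world.step(random.choice("NSEW"))[1] for _ in range(1000))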
Learning Data Manipulation for Augmentation and Weighting
Hu, Zhiting, Tan, Bowen, Salakhutdinov, Ruslan, Mitchell, Tom, Xing, Eric P.
Manipulating data, such as weighting data examples or augmenting with new instances, has been increasingly used to improve model training. Previous work has studied various rule- or learning-based approaches designed for specific types of data manipulation. In this work, we propose a new method that supports learning different manipulation schemes with the same gradient-based algorithm. Our approach builds upon a recent connection between supervised learning and reinforcement learning (RL), and adapts an off-the-shelf reward learning algorithm from RL for joint data manipulation learning and model training. Different parameterizations of the "data reward" function instantiate different manipulation schemes. We showcase data augmentation that learns a text transformation network, and data weighting that dynamically adapts the data sample importance. Experiments show the resulting algorithms significantly improve image and text classification performance in low-data regimes and on class-imbalance problems.
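To make the weighting instantiation concrete, here is a one-step meta-gradient sketch (assuming PyTorch) that learns per-example weights from a validation signal. It follows the "learning to reweight" style rather than the paper's exact RL-derived update, and all shapes and hyperparameters are invented.

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    W = torch.zeros(5, 2, requires_grad=True)          # toy linear classifier
    x_tr, y_tr = torch.randn(64, 5), torch.randint(0, 2, (64,))
    x_va, y_va = torch.randn(32, 5), torch.randint(0, 2, (32,))
    lr = 0.1

    for step in range(100):
        eps = torch.zeros(64, requires_grad=True)      # per-example weights
        per_ex = F.cross_entropy(x_tr @ W, y_tr, reduction="none")
        # Differentiate a hypothetical one-step update through the weights.
        g_w = torch.autograd.grad((eps * per_ex).sum(), W, create_graph=True)[0]
        val_loss = F.cross_entropy(x_va @ (W - lr * g_w), y_va)
        g_eps = torch.autograd.grad(val_loss, eps)[0]  # each example's influence
        w = torch.clamp(-g_eps, min=0)                 # keep examples that help
        w = w / (w.sum() + 1e-8)
        loss = (w * F.cross_entropy(x_tr @ W, y_tr, reduction="none")).sum()
        W.data -= lr * torch.autograd.grad(loss, W)[0] # weighted model update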
Leveraging Knowledge Bases in LSTMs for Improving Machine Reading
Yang, Bishan, Mitchell, Tom
This paper focuses on how to take advantage of external knowledge bases (KBs) to improve recurrent neural networks for machine reading. Traditional methods that exploit knowledge from KBs encode knowledge as discrete indicator features. Not only do these features generalize poorly, but they also require task-specific feature engineering to achieve good performance. We propose KBLSTM, a novel neural model that leverages continuous representations of KBs to enhance the learning of recurrent neural networks for machine reading. To effectively integrate background knowledge with information from the currently processed text, our model employs an attention mechanism with a sentinel to adaptively decide whether to attend to background knowledge and which information from KBs is useful. Experimental results show that our model achieves accuracies that surpass the previous state-of-the-art results for both entity extraction and event extraction on the widely used ACE2005 dataset.
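The sentinel mechanism can be sketched as attention over candidate KB concept embeddings plus one extra "no knowledge needed" candidate. Dot-product scoring below stands in for the model's learned scoring function, and the shapes are illustrative.

    import torch
    import torch.nn.functional as F

    def sentinel_attention(h, kb, sentinel):
        """Mix KB concept vectors into a state vector h, with a sentinel that
        lets the model fall back on the text-only state.
        h: (d,) hidden state; kb: (k, d) KB concept embeddings;
        sentinel: (d,) learned 'no knowledge needed' vector."""
        cands = torch.cat([kb, sentinel.unsqueeze(0)], dim=0)  # (k+1, d)
        alpha = F.softmax(cands @ h, dim=0)                    # attention weights
        context = alpha[:-1] @ kb                              # weighted KB knowledge
        return context + alpha[-1] * h                         # sentinel keeps h itself

    d, k = 8, 3
    out = sentinel_attention(torch.randn(d), torch.randn(k, d), torch.randn(d))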
Contextual Parameter Generation for Universal Neural Machine Translation
Platanios, Emmanouil Antonios, Sachan, Mrinmaya, Neubig, Graham, Mitchell, Tom
We propose a simple modification to existing neural machine translation (NMT) models that enables using a single universal model to translate between multiple languages while allowing for language-specific parameterization, and that can also be used for domain adaptation. Our approach requires no changes to the model architecture of a standard NMT system, but instead introduces a new component, the contextual parameter generator (CPG), that generates the parameters of the system (e.g., weights in a neural network). This parameter generator accepts source and target language embeddings as input, and generates the parameters for the encoder and the decoder, respectively. The rest of the model remains unchanged and is shared across all languages. We show how this simple modification enables the system to use monolingual data for training and also perform zero-shot translation. We further show that it is able to surpass state-of-the-art performance on both the IWSLT-15 and IWSLT-17 datasets and that the learned language embeddings are able to uncover interesting relationships between languages.
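A minimal sketch of the CPG idea, assuming a simple linear generator: a learned language embedding is mapped to the weight matrix of a shared model component. Dimensions and names are illustrative, not the paper's configuration.

    import torch

    n_langs, emb_dim, d = 4, 8, 16
    lang_emb = torch.nn.Embedding(n_langs, emb_dim)    # learned language embeddings
    gen = torch.nn.Linear(emb_dim, d * d)              # the parameter generator

    def encoder_params(src_lang):
        """Generate the encoder's weight matrix from the source-language embedding."""
        theta = gen(lang_emb(torch.tensor(src_lang)))
        return theta.view(d, d)

    W_en = encoder_params(0)                           # e.g., source language 0
    h = torch.tanh(W_en @ torch.randn(d))              # one encoder step with
                                                       # generated parameters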
Inferring Interpersonal Relations in Narrative Summaries
Srivastava, Shashank (Carnegie Mellon University) | Chaturvedi, Snigdha (University of Maryland, College Park) | Mitchell, Tom (Carnegie Mellon University)
Characterizing relationships between people is fundamental for the understanding of narratives. In this work, we address the problem of inferring the polarity of relationships between people in narrative summaries. We formulate the problem as a joint structured prediction for each narrative, and present a general model that combines evidence from linguistic and semantic features, as well as features based on the structure of the social community in the text. We additionally provide a clustering-based approach that can exploit regularities in narrative types, e.g., learning an affinity for love triangles in romantic stories. On a dataset of movie summaries from Wikipedia, our structured models provide more than 30% error reduction over a competitive baseline that considers pairs of characters in isolation.
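As a toy rendering of the joint (rather than pairwise-in-isolation) idea, the sketch below scores each character pair from local evidence and then flips the least-confident edge of any structurally unbalanced triad. The characters, scores, and balance heuristic are invented, not the paper's structured model.

    from itertools import combinations

    chars = ["Anna", "Boris", "Clara"]
    # Local evidence scores per pair (positive = friendly, negative = hostile).
    local = {("Anna", "Boris"): 0.9, ("Anna", "Clara"): -0.4, ("Boris", "Clara"): -0.2}

    def joint_polarity(local, rounds=5):
        pol = {p: (1 if s > 0 else -1) for p, s in local.items()}
        for _ in range(rounds):
            for a, b, c in combinations(chars, 3):
                # Structural balance: a triangle with a negative sign product
                # ("my friend's friend is my enemy") is inconsistent.
                if pol[(a, b)] * pol[(a, c)] * pol[(b, c)] < 0:
                    weakest = min([(a, b), (a, c), (b, c)],
                                  key=lambda p: abs(local[p]))
                    pol[weakest] *= -1          # flip the least-confident edge
        return pol

    print(joint_polarity(local))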
Combining Vector Space Embeddings with Symbolic Logical Inference over Open-Domain Text
Gardner, Matt (Carnegie Mellon University) | Talukdar, Partha (Indian Institute of Science) | Mitchell, Tom (Carnegie Mellon University)
We have recently shown how to combine random walk inference over knowledge bases with vector space representations of surface forms, improving performance on knowledge base inference. In this paper, we formalize the connection of our prior work to logical inference rules, giving some general observations about methods for incorporating vector space representations into symbolic logic systems. Additionally, we present some promising preliminary work that extends these techniques to learning open-domain relations for the purpose of answering multiple choice questions, achieving 67% accuracy on a small test set.
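The core relaxation can be sketched as follows: when following a rule's relation path, accept edges whose relation embedding is close to the rule's relation instead of requiring an exact symbolic match. The graph, vectors, and threshold below are invented for illustration.

    import numpy as np

    rel_vec = {"plays-for": np.array([1.0, 0.0]),
               "is-a-member-of": np.array([0.9, 0.1]),  # near-synonym surface form
               "born-in": np.array([0.0, 1.0])}
    edges = [("alice", "is-a-member-of", "red-sox")]

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def follow(node, relation, threshold=0.9):
        """Return targets reachable by any edge similar enough to `relation`."""
        return [t for s, r, t in edges
                if s == node and cosine(rel_vec[r], rel_vec[relation]) >= threshold]

    print(follow("alice", "plays-for"))   # matches via embedding similarity,
                                          # not an exact symbol match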
The 2005 AAAI Classic Paper Awards
Mitchell, Tom, Levesque, Hector
Mitchell and Levesque provide commentary on the two AAAI Classic Paper awards, given at the AAAI-05 conference in Pittsburgh, Pennsylvania. The two winning papers were "Quantifying the Inductive Bias in Concept Learning," by David Haussler, and "Default Reasoning, Nonmonotonic Logics, and the Frame Problem," by Steve Hanks and Drew McDermott.
Haussler's paper, published in 1986, helped initiate a very fruitful integration of a branch of machine learning with computational learning theory; twenty years later that link is firmly established, and the two research communities have largely merged into one. Starting in the 1950s, with work like Samuels's program that learned strategies for playing checkers, AI researchers had designed and experimented with a variety of learning algorithms and had also developed a number of theoretical results, such as convergence proofs for perceptrons and "learning in the limit." What constrained learning was the "inductive bias" of the learner: the more constraining the inductive bias, the less training data needed to learn the target concept. At the same time, the theory of PAC learning was being developed, which allowed deriving quantitative bounds on the probability of successful learning as a function of the training examples and the complexity of the learner's hypothesis space (as measured by its Vapnik-Chervonenkis dimension). What Haussler's paper did was help introduce this quantitative theory to the AI community. "Default Reasoning, Nonmonotonic Logics, and the Frame Problem," by Steve Hanks and Drew McDermott, concerns the frame problem: determining whether a fact does or does not hold after a sequence of actions. The idea is this: normally, an object is unaffected by an action. If a window is open before an action, it remains open afterward. There are clear exceptions, however, such as the act of closing the window. A variety of formal systems have been proposed that would allow us to infer, in the absence of conflicting information, that the window remains open (or that a polar bear is white, or that a violin has four strings, and so on).
In Memoriam: Charles Rosen, Norman Nielsen, and Saul Amarel
Hart, Peter E., Nilsson, Nils J., Perrault, Ray, Mitchell, Tom, Kulikowski, Casimir A., Leake, David B.
In the span of a few months, the AI community lost four important figures. The fall of 2002 marked the passing of Ray Reiter, for whom a memorial article by Jack Minker appears in this issue. As the issue was going to press, AI lost Saul Amarel, Norm Nielsen, and Charles Rosen. This section of AI Magazine commemorates these friends, leaders, and AI pioneers. We thank Tom Mitchell and Casimir Kulikowski for their memorial to Saul Amarel, Ray Perrault for his remembrance of Norm Nielsen, and Peter Hart and Nils Nilsson for their tribute to Charles Rosen. The AI community mourns our lost colleagues and gratefully remembers their contributions, which meant so much to so many and to the advancement of artificial intelligence as a whole.