Talmor, Alon
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification
Talmor, Alon, Yoran, Ori, Le Bras, Ronan, Bhagavatula, Chandra, Goldberg, Yoav, Choi, Yejin, Berant, Jonathan
Constructing benchmarks that test the abilities of modern natural language understanding models is difficult: pre-trained language models exploit artifacts in benchmarks to achieve human parity, yet still fail on adversarial examples and make errors that demonstrate a lack of common sense. In this work, we propose gamification as a framework for data construction. The goal of players in the game is to compose questions that mislead a rival AI while using specific phrases for extra points. The game environment leads to enhanced user engagement and simultaneously gives the game designer control over the collected data, allowing us to collect high-quality data at scale. Using our method we create CommonsenseQA 2.0, which includes 14,343 yes/no questions, and demonstrate its difficulty for models that are orders of magnitude larger than the AI used in the game itself. Our best baseline, the T5-based Unicorn with 11B parameters, achieves an accuracy of 70.2%, substantially higher than GPT-3 (52.9%) in a few-shot inference setup. Both scores are well below human performance, which stands at 94.1%.
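As an illustration (not taken from the paper), the following minimal Python sketch shows how a few-shot evaluation over yes/no questions of this kind could be set up; the field names and toy examples are hypothetical placeholders.

```python
# Minimal sketch, not the authors' code, of a few-shot yes/no evaluation.
# Field names ("question", "answer") and the toy examples are hypothetical.

def build_few_shot_prompt(demonstrations, query_question):
    """Format labeled demonstrations followed by the unanswered query."""
    blocks = [f"Question: {ex['question']}\nAnswer: {ex['answer']}"
              for ex in demonstrations]
    blocks.append(f"Question: {query_question}\nAnswer:")
    return "\n\n".join(blocks)

def accuracy(predictions, gold_answers):
    """Fraction of yes/no predictions matching the gold labels."""
    matches = sum(p.strip().lower() == g.strip().lower()
                  for p, g in zip(predictions, gold_answers))
    return matches / len(gold_answers)

demos = [
    {"question": "Most people sleep at night.", "answer": "yes"},
    {"question": "Water is always dry.", "answer": "no"},
]
print(build_few_shot_prompt(demos, "A week has seven days."))
print(accuracy(["yes", "no"], ["yes", "yes"]))  # -> 0.5
```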
MultiModalQA: Complex Question Answering over Text, Tables and Images
Talmor, Alon, Yoran, Ori, Catav, Amnon, Lahav, Dan, Wang, Yizhong, Asai, Akari, Ilharco, Gabriel, Hajishirzi, Hannaneh, Berant, Jonathan
When answering complex questions, people can seamlessly combine information from visual, textual and tabular sources. While interest in models that reason over multiple pieces of evidence has surged in recent years, there has been relatively little work on question answering models that reason across multiple modalities. In this paper, we present MultiModalQA (MMQA): a challenging question answering dataset that requires joint reasoning over text, tables and images. We create MMQA using a new framework for generating complex multi-modal questions at scale, harvesting tables from Wikipedia and attaching images and text paragraphs using entities that appear in each table. We then define a formal language that allows us to take questions that can be answered from a single modality and combine them to generate cross-modal questions. Finally, crowdsourcing workers take these automatically generated questions and rephrase them into more fluent language. When presented with a complex question, people often do not know in advance which source(s) of information are relevant for answering it. In general scenarios, these sources can encompass multiple modalities, be they paragraphs of text, structured tables, images, or combinations thereof. For instance, a user might ponder "When was the famous painting with two touching fingers completed?" Answering this question is made possible by integrating information across the textual and visual modalities. Recently, there has been substantial interest in question answering (QA) models that reason over multiple pieces of evidence (multi-hop questions; Yang et al., 2018; Talmor & Berant, 2018; Welbl et al., 2017). In most prior work, the question is phrased in natural language and the answer appears in a given context, which may be a paragraph (Rajpurkar et al., 2016), a table (Pasupat & Liang, 2015), or an image (Antol et al., 2015). However, little of this work addresses questions that require integrating information across modalities.
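As a hedged illustration of the composition step described above (not the MMQA generation pipeline itself), the sketch below composes two single-modality questions into one cross-modal question; the dataclass, template, and example content are assumptions made purely for illustration.

```python
# Illustrative sketch of composing two single-modality questions into one
# cross-modal question. All names and example content are hypothetical.

from dataclasses import dataclass

@dataclass
class SingleModalityQuestion:
    text: str        # the question, answerable from one modality
    modality: str    # "text", "table", or "image"
    answer: str      # the bridging entity

def compose_bridge_question(inner: SingleModalityQuestion,
                            outer_template: str,
                            outer_modality: str) -> dict:
    """Substitute the inner question for its answer entity in the outer
    template, so that answering now requires both modalities."""
    return {
        "question": outer_template.format(bridge=inner.text.rstrip("?")),
        "modalities": [inner.modality, outer_modality],
        "bridge_entity": inner.answer,
    }

inner_q = SingleModalityQuestion(
    text="which famous painting shows two touching fingers?",
    modality="image",
    answer="The Creation of Adam",
)
composed = compose_bridge_question(
    inner_q,
    outer_template="When was [{bridge}] completed?",
    outer_modality="text",
)
print(composed["question"])
```

As in the pipeline above, such machine-generated compositions would still need to be rephrased by crowd workers into fluent language.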
Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
Talmor, Alon, Tafjord, Oyvind, Clark, Peter, Goldberg, Yoav, Berant, Jonathan
Evidence suggests that large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control. Recently, it has been shown that Transformer-based models succeed in consistent reasoning over explicit symbolic facts, under a "closed-world" assumption. However, in an open-domain setup, it is desirable to tap into the vast reservoir of implicit knowledge already encoded in the parameters of pre-trained LMs. In this work, we provide a first demonstration that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements. To do this, we describe a procedure for automatically generating datasets that teach a model new reasoning skills, and demonstrate that models learn to effectively perform inference which involves implicit taxonomic and world knowledge, chaining and counting. Finally, we show that "teaching" the models to reason generalizes beyond the training distribution: they successfully compose the usage of multiple reasoning skills in single examples. Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
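To make the setup concrete, here is a minimal, hypothetical sketch of the kind of automatically generated training instance described above: explicit natural language statements plus a hypothesis whose label can only be derived by also using implicit, pre-trained knowledge. The input format and example are assumptions, not the paper's actual data format.

```python
# Hypothetical sketch of a training instance combining explicit statements
# with a hypothesis that also requires implicit taxonomic knowledge
# (here, that a robin is a bird). The format is illustrative only.

def make_instance(explicit_statements, hypothesis, label):
    """Pack explicit statements and a hypothesis into one textual input,
    as is common when fine-tuning an LM for true/false classification."""
    context = " ".join(explicit_statements)
    return {"input": f"{context} [SEP] {hypothesis}", "label": label}

instance = make_instance(
    explicit_statements=["Birds can sense the magnetic field of the earth."],
    hypothesis="A robin can sense the magnetic field of the earth.",
    label=True,   # requires the implicit fact that a robin is a bird
)
print(instance["input"])
```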
MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension
Talmor, Alon, Berant, Jonathan
A large number of reading comprehension (RC) datasets have been created recently, but little analysis has been done on whether they generalize to one another, or on the extent to which existing datasets can be leveraged to improve performance on new ones. In this paper, we conduct such an investigation over ten RC datasets, training on one or more source RC datasets and evaluating both generalization and transfer to a target RC dataset. We analyze the factors that contribute to generalization, and show that training on a source RC dataset and transferring to a target dataset substantially improves performance, even in the presence of powerful contextual representations from BERT (Devlin et al., 2019). We also find that training on multiple source RC datasets leads to robust generalization and transfer, and can reduce the cost of example collection for a new RC dataset. Following our analysis, we propose MultiQA, a BERT-based model trained on multiple RC datasets, which leads to state-of-the-art performance on five RC datasets. We share our infrastructure for the benefit of the research community.
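As a rough illustration (not the MultiQA training code), the sketch below interleaves several source RC datasets into a single training stream for one shared model; dataset names and record fields are placeholders.

```python
# Illustrative sketch of mixing multiple source RC datasets into one
# training stream. Dataset names and record fields are hypothetical.

import random

def interleave(datasets, seed=0):
    """Shuffle (dataset_name, example) pairs from all sources together."""
    mixed = [(name, ex) for name, examples in datasets.items()
             for ex in examples]
    random.Random(seed).shuffle(mixed)
    return mixed

sources = {
    "source_rc_a": [{"question": "q1", "context": "c1", "answer": "a1"}],
    "source_rc_b": [{"question": "q2", "context": "c2", "answer": "a2"}],
}
for dataset_name, example in interleave(sources):
    # in practice, feed (question, context) pairs to a single BERT-style
    # reader and train it on the gold answers
    pass
```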
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Talmor, Alon, Herzig, Jonathan, Lourie, Nicholas, Berant, Jonathan
When answering a question, people often draw upon their rich world knowledge in addition to task-specific context. Recent work has focused primarily on answering questions given a relevant document or context, requiring very little general background knowledge. To investigate question answering with prior knowledge, we present CommonsenseQA: a difficult new dataset for commonsense question answering. To capture common sense beyond associations, each question discriminates between three target concepts that all share the same semantic relation to a single source concept drawn from ConceptNet (Speer et al., 2017). This constraint encourages crowd workers to author multiple-choice questions with complex semantics, in which all candidates relate to the subject in a similar way. We create 9,500 questions through this procedure and demonstrate the dataset's difficulty with a large number of strong baselines. Our best baseline, the OpenAI GPT (Radford et al., 2018), obtains 54.8% accuracy, well below human performance, which is 95.3%.
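A hypothetical sketch of the construction constraint described above: one ConceptNet source concept and three target concepts sharing the same relation to it, where each target becomes the answer of one question and the other two serve as distractors. The triples below are invented for illustration and are not from ConceptNet or the dataset.

```python
# Hypothetical sketch of the CommonsenseQA-style construction constraint.
# The source concept, relation, and targets below are invented examples.

from dataclasses import dataclass

@dataclass
class QuestionSeed:
    source: str       # source concept mentioned in the question
    relation: str     # shared ConceptNet relation
    targets: list     # three target concepts sharing that relation

def answer_distractor_sets(seed: QuestionSeed):
    """For each target, yield (answer, distractors) so that every question
    must discriminate between closely related concepts."""
    for answer in seed.targets:
        yield answer, [t for t in seed.targets if t != answer]

seed = QuestionSeed(source="river", relation="AtLocation",
                    targets=["waterfall", "bridge", "valley"])
for answer, distractors in answer_distractor_sets(seed):
    print(answer, distractors)
```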
Repartitioning of the ComplexWebQuestions Dataset
Talmor, Alon, Berant, Jonathan
Recently, Talmor and Berant (2018) introduced ComplexWebQuestions: a dataset focused on answering complex questions by decomposing them into a sequence of simpler questions and extracting the answer from retrieved web snippets. In their work, the authors used a pre-trained reading comprehension (RC) model (Salant and Berant, 2018) to extract the answer from the web snippets. In this short note, we show that training an RC model directly on the training data of ComplexWebQuestions reveals a leakage from the training set to the test set that allows models to obtain unreasonably high performance. As a solution, we construct a new partitioning of ComplexWebQuestions that does not suffer from this leakage and publicly release it. We also perform an empirical evaluation on these two partitionings and show that training an RC model on the training data substantially improves state-of-the-art performance.
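As an illustrative sketch (not the authors' repartitioning code), one way to avoid such leakage is to group examples by a shared key, for instance the underlying simple question, and keep each group entirely on one side of the split; the grouping key and field names below are assumptions.

```python
# Illustrative sketch of a leakage-free split: group examples by a shared
# key and never let a group straddle the train/test boundary. The field
# "base_question" is a hypothetical grouping key.

import random
from collections import defaultdict

def group_split(examples, key_fn, test_fraction=0.2, seed=0):
    """Split so that no group key appears in both train and test."""
    groups = defaultdict(list)
    for ex in examples:
        groups[key_fn(ex)].append(ex)
    keys = list(groups)
    random.Random(seed).shuffle(keys)
    n_test = max(1, int(len(keys) * test_fraction))
    test_keys, train_keys = keys[:n_test], keys[n_test:]
    train = [ex for k in train_keys for ex in groups[k]]
    test = [ex for k in test_keys for ex in groups[k]]
    return train, test

examples = [
    {"question": "complex q1", "base_question": "simple A"},
    {"question": "complex q2", "base_question": "simple A"},
    {"question": "complex q3", "base_question": "simple B"},
]
train, test = group_split(examples, key_fn=lambda ex: ex["base_question"])
```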