Collaborating Authors

Durrett, Greg


Flexible Operations for Natural Language Deduction

arXiv.org Artificial Intelligence

An interpretable system for complex, open-domain reasoning needs an interpretable meaning representation. Natural language is an excellent candidate -- it is both extremely expressive and easy for humans to understand. However, manipulating natural language statements in logically consistent ways is hard. Models have to be precise, yet robust enough to handle variation in how information is expressed. In this paper, we describe ParaPattern, a method for building models to generate logical transformations of diverse natural language inputs without direct human supervision. We use a BART-based model (Lewis et al., 2020) to generate the result of applying a particular logical operation to one or more premise statements. Crucially, we have a largely automated pipeline for scraping and constructing suitable training examples from Wikipedia, which are then paraphrased to give our models the ability to handle lexical variation. We evaluate our models using targeted contrast sets as well as out-of-domain sentence compositions from the QASC dataset (Khot et al., 2020). Our results demonstrate that our operation models are both accurate and flexible.
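As a rough illustration of the interface such an operation model exposes, the sketch below runs a fine-tuned BART model over a pair of premises with the Hugging Face transformers library. The checkpoint name, the premise concatenation format, and the example output are illustrative assumptions, not the paper's released artifacts.

```python
# A minimal sketch of invoking a ParaPattern-style operation model.
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
# Hypothetical fine-tuned checkpoint for one logical operation (e.g. substitution).
model = BartForConditionalGeneration.from_pretrained("parapattern-substitution")

# Premises are concatenated into one source sequence; the model is trained to
# emit the statement that follows from applying the operation to them.
premises = "All mammals are warm-blooded. A whale is a mammal."
inputs = tokenizer(premises, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# e.g. "A whale is warm-blooded."
```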


Modeling Fine-Grained Entity Types with Box Embeddings

arXiv.org Artificial Intelligence

Neural entity typing models typically represent entity types as vectors in a high-dimensional space, but such spaces are not well-suited to modeling these types' complex interdependencies. We study the ability of box embeddings, which represent entity types as d-dimensional hyperrectangles, to represent hierarchies of fine-grained entity type labels even when these relationships are not defined explicitly in the ontology. Our model represents both types and entity mentions as boxes. Each mention and its context are fed into a BERT-based model to embed that mention in our box space; essentially, this model leverages typological clues present in the surface text to hypothesize a type representation for the mention. Soft box containment can then be used to derive probabilities, both the posterior probability of a mention exhibiting a given type and the conditional probability relations between types themselves. We compare our approach with a strong vector-based typing model, and observe state-of-the-art performance on several entity typing benchmarks. In addition to competitive typing performance, our box-based model shows better performance in prediction consistency (predicting a supertype and a subtype together) and confidence (i.e., calibration), implying that the box-based model captures the latent type hierarchies better than the vector-based model does.
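To make the containment idea concrete, here is a minimal NumPy sketch of a softplus-smoothed box volume and the conditional probability it induces. The parameterization and smoothing are simplified assumptions; the paper's exact box formulation differs in detail, but the mechanism is the same: probability comes from the fraction of one box's volume covered by another.

```python
# A minimal sketch of soft box containment over d-dimensional boxes given as
# (lower_corner, upper_corner) pairs.
import numpy as np

def softplus(x, beta=1.0):
    # Smoothed side length: stays positive even when boxes barely overlap.
    return np.log1p(np.exp(beta * x)) / beta

def soft_volume(lower, upper, beta=1.0):
    return np.prod(softplus(upper - lower, beta))

def p_type_given_mention(type_box, mention_box, beta=1.0):
    # P(type | mention) = vol(type ∩ mention) / vol(mention)
    lo = np.maximum(type_box[0], mention_box[0])
    hi = np.minimum(type_box[1], mention_box[1])
    return soft_volume(lo, hi, beta) / soft_volume(*mention_box, beta)

person = (np.array([0.0, 0.0]), np.array([4.0, 4.0]))
artist = (np.array([1.0, 1.0]), np.array([3.0, 3.0]))   # nested inside person
mention = (np.array([2.5, 2.5]), np.array([3.5, 3.5]))
print(p_type_given_mention(person, mention))  # 1.0: mention fully inside person
print(p_type_given_mention(artist, mention))  # ~0.55: only partial containment
```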


Multi-hop Question Answering via Reasoning Chains

arXiv.org Artificial Intelligence

Multi-hop question answering requires models to gather information from different parts of a text to answer a question. Most current approaches learn to address this task in an end-to-end way with neural networks, without maintaining an explicit representation of the reasoning process. We propose a method to extract a discrete reasoning chain over the text, which consists of a series of sentences leading to the answer. We then feed the extracted chains to a BERT-based QA model (Devlin et al., 2018) to do final answer prediction. Critically, we do not rely on gold annotated chains or "supporting facts": at training time, we derive pseudo-gold reasoning chains using heuristics based on named entity recognition and coreference resolution. Nor do we rely on these annotations at test time, as our model learns to extract chains from raw text alone. We test our approach on two recently proposed large multi-hop question answering datasets: WikiHop (Welbl et al., 2018) and HotpotQA (Yang et al., 2018), and achieve state-of-the-art performance on WikiHop and strong performance on HotpotQA. Our analysis shows properties of chains that are crucial for high performance: in particular, modeling extraction sequentially is important, as is dealing with each candidate sentence in a context-aware way. Furthermore, human evaluation shows that our extracted chains allow humans to give answers with high confidence, indicating that these are a strong intermediate abstraction for this task.

1 Introduction

As high performance has been achieved in simple question answering settings (Rajpurkar et al., 2016), work on question answering has increasingly gravitated towards questions that require more complex reasoning to solve.
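The pseudo-gold chain construction can be pictured as a graph search over sentences linked by shared entities. The sketch below is a simplified stand-in: entity sets are supplied by hand rather than by an NER system, and coreference resolution is omitted entirely.

```python
# A minimal sketch of deriving a pseudo-gold reasoning chain via breadth-first
# search over sentences that share named entities.
from collections import deque

def extract_chain(sentences, entities, question_ents, answer):
    # Start from sentences that share an entity with the question.
    start = [i for i, ents in enumerate(entities) if ents & question_ents]
    queue = deque((i, [i]) for i in start)
    seen = set(start)
    while queue:
        i, chain = queue.popleft()
        if answer in sentences[i]:
            return chain  # first chain that reaches a sentence with the answer
        for j, ents in enumerate(entities):
            if j not in seen and ents & entities[i]:
                seen.add(j)
                queue.append((j, chain + [j]))
    return None

sentences = [
    "Alan Turing was born in London.",
    "London is the capital of England.",
    "England is part of the United Kingdom.",
]
entities = [{"Alan Turing", "London"},
            {"London", "England"},
            {"England", "United Kingdom"}]
print(extract_chain(sentences, entities, {"Alan Turing"}, "United Kingdom"))
# [0, 1, 2]
```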


Fine-Grained Entity Typing for Domain Independent Entity Linking

arXiv.org Artificial Intelligence

Neural entity linking models are very powerful, but run the risk of overfitting to the domain they are trained in. Here, a domain can be construed narrowly as a particular distribution of entities: models can even overfit by memorizing properties of specific frequent entities in a dataset. We tackle the problem of building robust entity linking models that generalize effectively and do not rely on labeled entity linking data with a specific entity distribution. Rather than predicting entities directly, our approach models fine-grained entity properties, which can help disambiguate between even closely related entities. We derive a large inventory of types (tens of thousands) from Wikipedia categories, and use hyperlinked mentions in Wikipedia to distantly label data and train an entity typing model. At test time, we classify a mention with this typing model and use soft type predictions to link the mention to the most similar candidate entity. We evaluate our entity linking system on the CoNLL-YAGO dataset (Hoffart et al., 2011) and show that our approach outperforms prior domain-independent entity linking systems. We also test our approach in a harder setting derived from the WikilinksNED dataset (Eshel et al., 2017) where all the mention-entity pairs are unseen at test time. Results indicate that our approach generalizes better than a state-of-the-art neural model on the dataset.

1 Introduction

Historically, systems for entity linking to Wikipedia relied on heuristics such as anchor text distributions (Cucerzan, 2007; Milne and Witten, 2008), tf-idf (Ratinov et al., 2011), and Wikipedia relatedness of nearby entities (Hoffart et al., 2011). These systems have few parameters, making them relatively flexible across domains. More recent systems have typically been parameter-rich neural network models (Sun et al., 2015; Yamada et al., 2016; Francis-Landau et al., 2016; Eshel et al., 2017).
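At test time, linking reduces to comparing the typing model's soft predictions against each candidate entity's category-derived type vector. The sketch below uses dot-product similarity over a toy type inventory; the inventory, entity names, and scoring function are illustrative assumptions rather than the paper's exact setup.

```python
# A minimal sketch of type-based candidate ranking for entity linking.
import numpy as np

TYPE_INVENTORY = ["musician", "band", "city", "album"]

def link(mention_type_probs, candidates):
    # candidates: {entity_name: binary type vector over TYPE_INVENTORY,
    # derived from the entity's Wikipedia categories}
    scores = {e: float(np.dot(mention_type_probs, v)) for e, v in candidates.items()}
    return max(scores, key=scores.get), scores

# Soft type predictions from the typing model for a mention of "Nirvana's frontman".
probs = np.array([0.9, 0.2, 0.05, 0.3])
candidates = {
    "Nirvana_(band)": np.array([0, 1, 0, 1]),
    "Kurt_Cobain":    np.array([1, 0, 0, 0]),
}
print(link(probs, candidates))  # Kurt_Cobain wins on the "musician" type
```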


Learning to Denoise Distantly-Labeled Data for Entity Typing

arXiv.org Artificial Intelligence

Distantly-labeled data can be used to scale up training of statistical models, but it is typically noisy and that noise can vary with the distant labeling technique. In this work, we propose a two-stage procedure for handling this type of data: denoise it with a learned model, then train our final model on clean and denoised distant data with standard supervised training. Our denoising approach consists of two parts. First, a filtering function discards examples from the distantly labeled data that are wholly unusable. Second, a relabeling function repairs noisy labels for the retained examples. Each of these components is a model trained on synthetically-noised examples generated from a small manually-labeled set. We investigate this approach on the ultra-fine entity typing task of Choi et al. (2018). Our baseline model is an extension of their model with pre-trained ELMo representations, which already achieves state-of-the-art performance. Adding distant data that has been denoised with our learned models gives further performance gains over this base model, outperforming models trained on raw distant data or heuristically-denoised distant data.
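The two-stage procedure is simple to state in code. In the sketch below the filter is a threshold on a stand-in confidence score and the relabeler defaults to the identity; in the paper, both stages are learned models trained on synthetically-noised copies of the small manually-labeled set.

```python
# A minimal sketch of two-stage denoising: filter, then relabel.
from dataclasses import dataclass

@dataclass
class Example:
    text: str
    labels: set
    confidence: float  # stand-in for a learned filter model's score

def denoise(distant, keep_threshold=0.5, relabel=lambda ex: ex.labels):
    clean = []
    for ex in distant:
        if ex.confidence < keep_threshold:   # stage 1: discard unusable examples
            continue
        ex.labels = relabel(ex)              # stage 2: repair noisy labels
        clean.append(ex)
    return clean

distant = [Example("a jazz musician", {"person", "musician"}, 0.9),
           Example("ran for office", {"building"}, 0.1)]
print([ex.labels for ex in denoise(distant)])  # only the usable example remains
# The final model is then trained normally on clean + denoised distant data.
```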


Understanding Dataset Design Choices for Multi-hop Reasoning

arXiv.org Artificial Intelligence

Learning multi-hop reasoning has been a key challenge for reading comprehension models, leading to the design of datasets that explicitly focus on it. Ideally, a model should not be able to perform well on a multi-hop question answering task without doing multi-hop reasoning. In this paper, we investigate two recently proposed datasets, WikiHop and HotpotQA. First, we explore sentence-factored models for these tasks; by design, these models cannot do multi-hop reasoning, but they are still able to solve a large number of examples in both datasets. Furthermore, we find spurious correlations in the unmasked version of WikiHop, which make it easy to achieve high performance considering only the questions and answers. Finally, we investigate one key difference between these datasets, namely span-based vs. multiple-choice formulations of the QA task. Multiple-choice versions of both datasets can be easily gamed, and two models we examine only marginally exceed a baseline in this setting. Overall, while these datasets are useful testbeds, high-performing models may not be learning as much multi-hop reasoning as previously thought.
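A sentence-factored model, in the sense used here, scores each sentence against the question in isolation and aggregates with a max, so by construction it cannot chain information across sentences. The sketch below uses a crude word-overlap scorer (an assumption; the paper evaluates neural variants) purely to make the factorization explicit.

```python
# A minimal sketch of a sentence-factored baseline for multiple-choice QA.
def overlap(a, b):
    strip = lambda s: set(s.lower().replace("?", " ").replace(".", " ").split())
    wa, wb = strip(a), strip(b)
    return len(wa & wb) / max(len(wa), 1)

def sentence_factored_answer(question, sentences, candidates):
    def score(candidate):
        # Max over sentences, each scored independently of the others:
        # no information is ever combined across sentences.
        return max(overlap(question + " " + candidate, s) for s in sentences)
    return max(candidates, key=score)

sentences = ["The Thames flows through London.", "Paris is on the Seine."]
print(sentence_factored_answer("Which river flows through London?",
                               sentences, ["Thames", "Seine"]))  # "Thames"
```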