Think you have Solved Direct-Answer Question Answering? Try ARC-DA, the Direct-Answer AI2 Reasoning Challenge
Bhakthavatsalam, Sumithra, Khashabi, Daniel, Khot, Tushar, Mishra, Bhavana Dalvi, Richardson, Kyle, Sabharwal, Ashish, Schoenick, Carissa, Tafjord, Oyvind, Clark, Peter
We present the ARC-DA dataset, a direct-answer ("open response", "freeform") version of the ARC (AI2 Reasoning Challenge) multiple-choice dataset. While ARC has been influential in the community, its multiple-choice format is unrepresentative of real-world questions, and multiple-choice formats can be particularly susceptible to artifacts. The ARC-DA dataset addresses these concerns by converting questions to direct-answer format using a combination of crowdsourcing and expert review. The resulting dataset contains 2985 questions with a total of 8436 valid answers (questions typically have more than one valid answer). ARC-DA is one of the first DA datasets of natural questions that often require reasoning, and where appropriate question decompositions are not evident from the questions themselves. We describe the conversion approach taken, appropriate evaluation metrics, and several strong models. While high, the best scores (81% GENIE, 61.4% F1, 63.2% ROUGE-L) still leave considerable room for improvement. In addition, the dataset provides a natural setting for new research on explanation, as many questions require reasoning to construct answers. We hope the dataset spurs further advances in complex question-answering by the community. ARC-DA is available at https://allenai.org/data/arc-da.
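The abstract reports an F1 score over questions that each have several valid answers. A common way to score such multi-reference direct-answer QA (this is a minimal sketch of the standard SQuAD-style token-overlap F1, taken as the maximum over the valid answers; the exact ARC-DA evaluation script may differ in normalization details):

```python
from collections import Counter

def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1 between a prediction and one gold answer."""
    p, g = pred.lower().split(), gold.lower().split()
    common = Counter(p) & Counter(g)          # multiset intersection of tokens
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)

def best_f1(pred: str, gold_answers: list[str]) -> float:
    """Score against the best-matching valid answer,
    since ARC-DA questions typically have more than one."""
    return max(token_f1(pred, g) for g in gold_answers)
```

For example, a prediction matching any one of a question's valid answers exactly scores 1.0, while a partial token overlap is credited proportionally.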
GenericsKB: A Knowledge Base of Generic Statements
Bhakthavatsalam, Sumithra, Anastasiades, Chloe, Clark, Peter
We present a new resource for the NLP community, namely a large (3.5M+ sentence) knowledge base of *generic statements*, e.g., "Trees remove carbon dioxide from the atmosphere", collected from multiple corpora. This is the first large resource to contain *naturally occurring* generic sentences, as opposed to extracted or crowdsourced triples, and thus is rich in high-quality, general, semantically complete statements. All GenericsKB sentences are annotated with their topical term, surrounding context (sentences), and a (learned) confidence. We also release GenericsKB-Best (1M+ sentences), containing the best-quality generics in GenericsKB augmented with selected, synthesized generics from WordNet and ConceptNet. In tests on two existing datasets requiring multihop reasoning (OBQA and QASC), we find using GenericsKB can result in higher scores and better explanations than using a much larger corpus. This demonstrates that GenericsKB can be a useful resource for NLP applications, as well as providing data for linguistic studies of generics and their semantics. GenericsKB is available at https://allenai.org/data/genericskb.
Artificial intelligence plays on your team in Iconary, a picture puzzle game from AI2
This special series explores the evolving relationship between humans and machines, examining the ways that robots, artificial intelligence and automation are impacting our work and lives. For decades, the games that put artificial intelligence to the test have been played human vs. machine – whether it's checkers, chess, Go, poker, StarCraft or "Jeopardy." Why isn't there a game where the AI and the human are on the same side? Now there is, and you can play, too. Researchers at Seattle's Allen Institute for Artificial Intelligence are taking the wraps off Iconary, a Pictionary-type puzzle game in which an AI and human players take turns putting together pictures and guessing what phrases the pictures signify.
Pictionary-Playing AI Sketches the Future of Human-Machine Collaborations
What do the games of chess, Jeopardy!, Go, Texas Hold'em, and StarCraft have in common? In each of these competitive arenas, an AI has resoundingly beaten the best human players in the world. These victories are astounding feats of artificial intelligence--yet they've become almost humdrum. At the Allen Institute for Artificial Intelligence (AI2), in Seattle, researchers set out to do something different. Their AI, AllenAI, collaborates with a human player in a Pictionary-style drawing and guessing game, which is won through human-AI cooperation.
Artificial intelligence learns 'deep thoughts' by playing Pictionary
Scientists are using the popular drawing game Pictionary to teach artificial intelligence common sense. AI researchers at the Allen Institute for Artificial Intelligence (AI2), a non-profit lab in Seattle, developed a version of the game called Iconary in order to teach its AllenAI artificial intelligence abstract concepts from pictures alone. Iconary was made public on 5 February in order to encourage people to play the game with AllenAI. By learning from humans, the researchers hope AllenAI will continue to develop common sense reasoning. "Iconary is one of the first times an AI system is paired in a collaborative game with a human player instead of antagonistically working against them," the Iconary website states.
Your Next Game Night Partner? A Computer
When the arrow appeared next to the birdcage, I finally understood what my partner was trying to say. The game was a clone of Pictionary--I had to guess the phrase based on a drawing. My partner had initially depicted a duck next to a cage, plus a hand, and a pond. Only after I asked for another drawing and the arrow was added did I realize the hand was "releasing" the duck, not feeding it. "You win!!!" I was told, after typing in the full answer.
Moving Beyond the Turing Test with the Allen AI Science Challenge
The field of artificial intelligence has made great strides recently, as in AlphaGo's victories in the game of Go over world champion South Korean Lee Sedol in March 2016 and top-ranked Chinese Go player Ke Jie in May 2017, leading to great optimism for the field. But are we really moving toward smarter machines, or are these successes restricted to certain classes of problems, leaving others untouched? In 2015, the Allen Institute for Artificial Intelligence (AI2) ran its first Allen AI Science Challenge, a competition to test machines on an ostensibly difficult task--answering eighth-grade science questions. Our motivations were to encourage the field to set its sights more broadly by exploring a problem that appears to require modeling, reasoning, language understanding, and commonsense knowledge in order to probe the state of the art while sowing the seeds for possible future breakthroughs. Challenge problems have historically played an important role in motivating and driving progress in research.
My Computer Is an Honor Student — but How Intelligent Is It? Standardized Tests as a Measure of AI
Clark, Peter (Allen Institute for Artificial Intelligence) | Etzioni, Oren (Allen Institute for Artificial Intelligence)
Given the well-known limitations of the Turing Test, there is a need for objective tests to both focus attention on, and measure progress towards, the goals of AI. In this paper we argue that machine performance on standardized tests should be a key component of any new measure of AI, because attaining a high level of performance requires solving significant AI problems involving language understanding and world modeling, critical skills for any machine that lays claim to intelligence. In addition, standardized tests have all the basic requirements of a practical test: they are accessible, easily comprehensible, clearly measurable, and offer a graduated progression from simple tasks to those requiring deep understanding of the world. Here we propose this task as a challenge problem for the community, summarize our state-of-the-art results on math and science tests, and provide supporting datasets.