Collaborating Authors

 Marcus, Gary


Fundamental Principles of Linguistic Structure are Not Represented by o3

arXiv.org Artificial Intelligence

Instead of scaling to unprecedented levels of compute via architectures that are fundamentally grounded in token prediction, a return to more traditional design features of the human mind (predicate-argument structure, variable binding, constituent structure, minimal compositional binding; Donatelli & Koller 2023) may be needed to orchestrate a more reliable expertise in human language (Ramchand 2024). This could be implemented by some form of neuro-symbolic approach. Still, it is also certainly true that mainstream theoretical linguistics (e.g., the minimalist enterprise) was in some ways ill-equipped to successfully predict which patterns of linguistic activity might be (un)approachable by LLMs. To illustrate, a potential weakness in this direction with respect to recent generative grammar theorizing has been the underestimation of the extent to which lexical information drives composition. This type of information may permit LLMs to abductively infer certain elements of grammatical rules, in whatever format this ultimately takes (Ramchand 2024). Future research should more carefully apply the tools of linguistics to isolate specific sub-components of syntax that might in principle be achievable by language models, given specific design features. For instance, with LLMs "complete recovery of syntax might be very difficult computationally" (Marcolli et al. 2025: 13), even if we assume that attention modules can in principle "satisfy the same algebraic structure" as what Marcolli et al. postulate as being necessary for syntax-semantics interface mappings.
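
To make these design features concrete, the sketch below (a minimal illustration in Python; the class names and the example rule are invented for this note, not taken from the paper) shows predicate-argument structure and variable binding as explicit symbolic objects. One rule with open argument slots applies uniformly to any filler, the kind of content-independent generalization that token prediction alone does not guarantee.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Var:
        """A variable that binds to any filler, independent of its content."""
        name: str

    @dataclass(frozen=True)
    class Predicate:
        """An explicit predicate with ordered argument slots."""
        head: str
        args: tuple

    def bind(rule, bindings):
        """Substitute variables with fillers; the rule itself never changes."""
        filled = tuple(bindings.get(a, a) if isinstance(a, Var) else a
                       for a in rule.args)
        return Predicate(rule.head, filled)

    # One rule, devour(agent, patient), generalizes to any binding,
    # including novel words never seen in any training data.
    rule = Predicate("devour", (Var("agent"), Var("patient")))
    print(bind(rule, {Var("agent"): "the cat", Var("patient"): "the mouse"}))
    print(bind(rule, {Var("agent"): "the wug", Var("patient"): "the blicket"}))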


Getting from Generative AI to Trustworthy AI: What LLMs might learn from Cyc

arXiv.org Artificial Intelligence

Generative AI, the most popular current approach to AI, consists of large language models (LLMs) that are trained to produce outputs that are plausible, but not necessarily correct. Although their abilities are often uncanny, they are lacking in aspects of reasoning, leading LLMs to be less than completely trustworthy. Furthermore, their results tend to be both unpredictable and uninterpretable. We lay out 16 desiderata for future AI, and discuss an alternative approach to AI which could theoretically address many of the limitations associated with current approaches: AI educated with curated pieces of explicit knowledge and rules of thumb, enabling an inference engine to automatically deduce the logical entailments of all that knowledge. Even long arguments produced this way can be both trustworthy and interpretable, since the full step-by-step line of reasoning is always available, and for each step the provenance of the knowledge used can be documented and audited. There is, however, a catch: if the logical language is expressive enough to fully represent the meaning of anything we can say in English, then the inference engine runs much too slowly. That's why symbolic AI systems typically settle for some fast but much less expressive logic, such as knowledge graphs. We describe how one AI system, Cyc, has developed ways to overcome that tradeoff and is able to reason in higher-order logic in real time. We suggest that any trustworthy general AI will need to hybridize the two approaches, the LLM approach and the more formal symbolic approach, and we lay out a path to realizing that dream.
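
As a toy illustration of the recipe the abstract describes, curated assertions plus rules of thumb driving an inference engine whose every step is auditable, the sketch below forward-chains over a three-fact knowledge base and records the provenance of each derived conclusion. It is an illustrative miniature, not Cyc's actual machinery, and nowhere near higher-order logic.

    facts = {
        ("isa", "Fido", "Dog"): "curated: pet registry",
        ("subclass", "Dog", "Mammal"): "curated: taxonomy",
        ("subclass", "Mammal", "Animal"): "curated: taxonomy",
    }

    def forward_chain(facts):
        """Apply one rule of thumb, isa(x, c) & subclass(c, d) => isa(x, d),
        until fixpoint, recording the premises behind every derived fact."""
        derived = dict(facts)
        changed = True
        while changed:
            changed = False
            isa = [(f, p) for f, p in derived.items() if f[0] == "isa"]
            subs = [(f, p) for f, p in derived.items() if f[0] == "subclass"]
            for (_, x, c), p1 in isa:
                for (_, c2, d), p2 in subs:
                    new = ("isa", x, d)
                    if c2 == c and new not in derived:
                        # full provenance: which premises licensed this step
                        derived[new] = f"derived from [{p1}] + [{p2}]"
                        changed = True
        return derived

    for fact, provenance in forward_chain(facts).items():
        print(fact, "<-", provenance)

Because every derived fact carries the chain of premises that licensed it, the whole line of reasoning can be inspected and audited step by step, which is exactly the interpretability property the abstract argues for.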


A Sentence is Worth a Thousand Pictures: Can Large Language Models Understand Human Language?

arXiv.org Artificial Intelligence

Artificial Intelligence applications show great potential for language-related tasks that rely on next-word prediction. The current generation of large language models has been linked to claims about human-like linguistic performance, and their applications are hailed both as a key step towards Artificial General Intelligence and as a major advance in understanding the cognitive, and even neural, basis of human language. We analyze the contribution of large language models as theoretically informative representations of a target system vs. atheoretical powerful mechanistic tools, and we identify the key abilities that are still missing from the current state of development and exploitation of these models.


Testing AI performance on less frequent aspects of language reveals insensitivity to underlying meaning

arXiv.org Artificial Intelligence

Advances in computational methods and big data availability have recently translated into breakthroughs in AI applications. With successes in bottom-up challenges partially overshadowing shortcomings, the 'human-like' performance of Large Language Models has raised the question of how linguistic performance is achieved by algorithms. Given systematic shortcomings in generalization across many AI systems, in this work we ask whether linguistic performance is indeed guided by language knowledge in Large Language Models. To this end, we prompt GPT-3 with a grammaticality judgement task and comprehension questions on less frequent constructions that are thus unlikely to form part of Large Language Models' training data. These included grammatical 'illusions', semantic anomalies, complex nested hierarchies and self-embeddings. GPT-3 failed for every prompt but one, often offering answers that show a critical lack of understanding even of high-frequency words used in these less frequent grammatical constructions. The present work sheds light on the boundaries of LLMs' alleged human-like linguistic competence and argues that, far from human-like, the next-word prediction abilities of LLMs may face issues of robustness when pushed beyond their training data.
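
The probe style is easy to sketch. In the illustration below, the two example sentences are textbook instances of the construction types the abstract names (a comparative 'illusion' and a multiply self-embedded clause), not the paper's actual stimuli, and query_model is a placeholder for whichever LLM is under test.

    PROBES = [
        # a comparative 'illusion': sounds fine but has no coherent meaning
        "More people have been to Russia than I have.",
        # a multiply self-embedded relative clause: grammatical but very rare
        "The rat the cat the dog chased bit escaped.",
    ]

    def make_prompt(sentence):
        return ("Is the following sentence grammatical, and what does it mean?\n"
                f"Sentence: {sentence}\nAnswer:")

    def query_model(prompt):
        """Placeholder for the LLM under test (e.g., an API call to GPT-3)."""
        raise NotImplementedError

    for sentence in PROBES:
        print(make_prompt(sentence))
        # response = query_model(make_prompt(sentence))
        # ...then score the response against expert judgements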


The Defeat of the Winograd Schema Challenge

arXiv.org Artificial Intelligence

The Winograd Schema Challenge - a set of twin sentences involving pronoun reference disambiguation that seem to require the use of commonsense knowledge - was proposed by Hector Levesque in 2011. By 2019, a number of AI systems, based on large pre-trained transformer-based language models and fine-tuned on these kinds of problems, achieved better than 90% accuracy. In this paper, we review the history of the Winograd Schema Challenge and discuss the lasting contributions of the flurry of research that has taken place on the WSC in the last decade. We discuss the significance of various datasets developed for WSC, and the research community's deeper understanding of the role of surrogate tasks in assessing the intelligence of an AI system.
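
The schema format itself is easy to state precisely: twin sentences differing in one special word that flips the pronoun's correct referent. A minimal evaluation harness, using Levesque's canonical example and a placeholder resolve_pronoun standing in for the system under test, might look like this:

    SCHEMAS = [
        # Levesque's canonical pair; the special word is 'feared'/'advocated'
        {"sentence": "The city councilmen refused the demonstrators a permit "
                     "because they feared violence.",
         "referent": "the city councilmen"},
        {"sentence": "The city councilmen refused the demonstrators a permit "
                     "because they advocated violence.",
         "referent": "the demonstrators"},
    ]

    def resolve_pronoun(sentence):
        """Placeholder for the system under test."""
        raise NotImplementedError

    def wsc_accuracy(schemas):
        correct = sum(resolve_pronoun(s["sentence"]) == s["referent"]
                      for s in schemas)
        return correct / len(schemas)

Because the twins are near-identical strings, surface statistics give little purchase; by design, only the meaning of the special word distinguishes the correct referents.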


Deep Learning: A Critical Appraisal

arXiv.org Machine Learning

Although deep learning has historical roots going back decades, neither the term "deep learning" nor the approach was popular just over five years ago, when the field was reignited by papers such as Krizhevsky, Sutskever and Hinton's now classic (2012) deep network model of ImageNet. What has the field discovered in the five subsequent years? Against a background of considerable progress in areas such as speech recognition, image recognition, and game playing, and considerable enthusiasm in the popular press, I present ten concerns for deep learning, and suggest that deep learning must be supplemented by other techniques if we are to reach artificial general intelligence.


Beyond the Turing Test

AI Magazine

The articles in this special issue of AI Magazine include those that propose specific tests, and those that look at the challenges inherent in building robust, valid, and reliable tests for advancing the state of the art in AI.


Beyond the Turing Test

AI Magazine

Within the field, the test is widely recognized as a pioneering landmark, but it is also now seen as a distraction: designed over half a century ago, and too crude to really measure intelligence. Intelligence is, after all, a multidimensional variable, and no single test could ever definitively measure it. Moreover, the original test, at least in its standard implementations, has turned out to be highly gameable, arguably an exercise in deception rather than a true measure of anything especially correlated with intelligence. The much-ballyhooed 2014 Turing test winner Eugene Goostman, for instance, pretends to be a thirteen-year-old foreigner and proceeds mainly by ducking questions and returning canned one-liners; it cannot see, it cannot think, and it is certainly a long way from genuine artificial general intelligence.


Toward a Comprehension Challenge, Using Crowdsourcing as a Tool

AI Magazine

Human readers comprehend vastly more, and in vastly different ways, than any existing comprehension test would suggest. An ideal comprehension test for a story should cover the full range of questions and answers that humans would expect other humans to reasonably learn or infer from a given story. As a step toward these goals we propose a novel test, the Crowdsourced Comprehension Challenge (C3), which is constructed by repeated runs of a three-person game, the Iterative Crowdsourced Comprehension Game (ICCG). ICCG uses structured crowdsourcing to comprehensively generate relevant questions and supported answers for arbitrary stories, whether fiction or nonfiction, presented across a variety of media such as videos, podcasts, and still images.
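
The paper itself defines the exact roles and rules of the game; the sketch below only conveys the general shape of such an iterative, three-role crowdsourcing loop. The role assignments and the fixed round count are assumptions made purely for illustration.

    def iccg_round(story, ask, answer, check):
        """One round of a three-person game: one worker proposes a question,
        a second answers it from the story, a third verifies the support."""
        question = ask(story)
        proposed = answer(story, question)
        supported = check(story, question, proposed)
        return question, proposed, supported

    def build_c3_test(story, workers, rounds=100):
        """Repeat rounds, keeping only question-answer pairs judged supported."""
        qa_pairs = []
        for _ in range(rounds):
            question, proposed, supported = iccg_round(story, *workers)
            if supported:
                qa_pairs.append((question, proposed))
        return qa_pairs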


Reports on the 2015 AAAI Workshop Program

AI Magazine

AAAI's 2015 Workshop Program was held Sunday and Monday, January 25–26, 2015, at the Hyatt Regency Austin Hotel in Austin, Texas, USA. The AAAI-15 workshop program included 16 workshops covering a wide range of topics in artificial intelligence. Most workshops were held on a single day. The titles of the workshops included: AI and Ethics; AI for Cities; AI for Transportation: Advice, Interactivity and Actor Modeling; Algorithm Configuration; Artificial Intelligence Applied to Assistive Technologies and Smart Environments; Beyond the Turing Test; Computational Sustainability; Computer Poker and Imperfect Information; Incentive and Trust in E-Communities; Multiagent Interaction without Prior Coordination; Planning, Search, and Optimization; Scholarly Big Data: AI Perspectives, Challenges, and Ideas; Trajectory-Based Behaviour Analytics; World Wide Web and Public Health Intelligence; Knowledge, Skill, and Behavior Transfer in Autonomous Robots; and Learning for General Competency in Video Games.