EHOP: A Dataset of Everyday NP-Hard Optimization Problems

Duchnowski, Alex, Pavlick, Ellie, Koller, Alexander

arXiv.org Artificial Intelligence

We introduce the dataset of Everyday Hard Optimization Problems (EHOP), a collection of NP-hard optimization problems expressed in natural language. EHOP includes problem formulations that could be found in computer science textbooks, versions that are dressed up as problems that could arise in real life, and variants of well-known problems with inverted rules. We find that state-of-the-art LLMs, across multiple prompting strategies, systematically solve textbook problems more accurately than their real-life and inverted counterparts. We argue that this constitutes evidence that LLMs adapt solutions seen during training, rather than leveraging reasoning abilities that would enable them to generalize to novel problems.


Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems

Davis, Ernest, Aaronson, Scott

arXiv.org Artificial Intelligence

Our test sets were too small and too haphazard to support statistically valid conclusions, but they were suggestive of a number of findings. We summarize these here and discuss them at greater length in section 7. Over the kinds of problems tested, GPT-4 with either plug-in is significantly stronger than GPT-4 by itself, or, almost certainly, than any AI that existed a year ago. However, it is still far from reliable; it often outputs a wrong answer or fails to output any answer. In terms of overall score, we would judge that these systems perform on the level of a middling undergraduate student. However, their capacities and weaknesses do not align with those of a human student; the systems solve some problems that even capable students would find challenging, whereas they fail on some problems that even middling high school students would find easy.