AITopics | correct hypothesis

Collaborating Authors

correct hypothesis

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Hypothesis Search: Inductive Reasoning with Language Models

Wang, Ruocheng, Zelikman, Eric, Poesia, Gabriel, Pu, Yewen, Haber, Nick, Goodman, Noah D.

arXiv.org Artificial IntelligenceSep-11-2023

Inductive reasoning is a core problem-solving capacity: humans can identify underlying principles from a few examples, which can then be robustly generalized to novel scenarios. Recent work has evaluated large language models (LLMs) on inductive reasoning tasks by directly prompting them yielding "in context learning." This can work well for straightforward inductive tasks, but performs very poorly on more complex tasks such as the Abstraction and Reasoning Corpus (ARC). In this work, we propose to improve the inductive reasoning ability of LLMs by generating explicit hypotheses at multiple levels of abstraction: we prompt the LLM to propose multiple abstract hypotheses about the problem, in natural language, then implement the natural language hypotheses as concrete Python programs. These programs can be directly verified by running on the observed examples and generalized to novel inputs. Because of the prohibitive cost of generation with state-of-the-art LLMs, we consider a middle step to filter the set of hypotheses that will be implemented into programs: we either ask the LLM to summarize into a smaller set of hypotheses, or ask human annotators to select a subset of the hypotheses. We verify our pipeline's effectiveness on the ARC visual inductive reasoning benchmark, its variant 1D-ARC, and string transformation dataset SyGuS. On a random 40-problem subset of ARC, our automated pipeline using LLM summaries achieves 27.5% accuracy, significantly outperforming the direct prompting baseline (accuracy of 12.5%). With the minimal human input of selecting from LLM-generated candidates, the performance is boosted to 37.5%. (And we argue this is a lower bound on the performance of our approach without filtering.) Our ablation studies show that abstract hypothesis generation and concrete program representations are both beneficial for LLMs to perform inductive reasoning tasks.

grid, hypothesis, language model, (15 more...)

arXiv.org Artificial Intelligence

2309.0566

Country: North America > Montserrat (0.04)

Genre: Research Report (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Where Does My Model Underperform? A Human Evaluation of Slice Discovery Algorithms

Johnson, Nari, Cabrera, Ángel Alexander, Plumb, Gregory, Talwalkar, Ameet

arXiv.org Artificial IntelligenceJun-13-2023

Machine learning (ML) models that achieve high average accuracy can still underperform on semantically coherent subsets (i.e. "slices") of data. This behavior can have significant societal consequences for the safety or bias of the model in deployment, but identifying these underperforming slices can be difficult in practice, especially in domains where practitioners lack access to group annotations to define coherent subsets of their data. Motivated by these challenges, ML researchers have developed new slice discovery algorithms that aim to group together coherent and high-error subsets of data. However, there has been little evaluation focused on whether these tools help humans form correct hypotheses about where (for which groups) their model underperforms. We conduct a controlled user study (N = 15) where we show 40 slices output by two state-of-the-art slice discovery algorithms to users, and ask them to form hypotheses about where an object detection model underperforms. Our results provide positive evidence that these tools provide some benefit over a naive baseline, and also shed light on challenges faced by users during the hypothesis formation step. We conclude by discussing design opportunities for ML and HCI researchers. Our findings point to the importance of centering users when designing and evaluating new tools for slice discovery.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2306.08167

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Sports > Tennis (0.48)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Add feedback

Hybrid Belief Pruning with Guarantees for Viewpoint-Dependent Semantic SLAM

Lemberg, Tuvy, Indelman, Vadim

arXiv.org Artificial IntelligenceJul-19-2022

Semantic simultaneous localization and mapping is a subject of increasing interest in robotics and AI that directly influences the autonomous vehicles industry, the army industries, and more. One of the challenges in this field is to obtain object classification jointly with robot trajectory estimation. Considering view-dependent semantic measurements, there is a coupling between different classes, resulting in a combinatorial number of hypotheses. A common solution is to prune hypotheses that have a sufficiently low probability and to retain only a limited number of hypotheses. However, after pruning and renormalization, the updated probability is overconfident with respect to the original probability. This is especially problematic for systems that require high accuracy. If the prior probability of the classes is independent, the original normalization factor can be computed efficiently without pruning hypotheses. To the best of our knowledge, this is the first work to present these results. If the prior probability of the classes is dependent, we propose a lower bound on the normalization factor that ensures cautious results. The bound is calculated incrementally and with similar efficiency as in the independent case. After pruning and updating based on the bound, this belief is shown empirically to be close to the original belief.

hypothesis, normalization factor, probability, (15 more...)

arXiv.org Artificial Intelligence

2207.09103

Country: Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

Interactive Model with Structural Loss for Language-based Abductive Reasoning

Li, Linhao, Xu, Ming, Dong, Yongfeng, Li, Xin, Wang, Ao, Hu, Qinghua

arXiv.org Artificial IntelligenceDec-1-2021

The abductive natural language inference task ($\alpha$NLI) is proposed to infer the most plausible explanation between the cause and the event. In the $\alpha$NLI task, two observations are given, and the most plausible hypothesis is asked to pick out from the candidates. Existing methods model the relation between each candidate hypothesis separately and penalize the inference network uniformly. In this paper, we argue that it is unnecessary to distinguish the reasoning abilities among correct hypotheses; and similarly, all wrong hypotheses contribute the same when explaining the reasons of the observations. Therefore, we propose to group instead of ranking the hypotheses and design a structural loss called ``joint softmax focal loss'' in this paper. Based on the observation that the hypotheses are generally semantically related, we have designed a novel interactive language model aiming at exploiting the rich interaction among competing hypotheses. We name this new model for $\alpha$NLI: Interactive Model with Structural Loss (IMSL). The experimental results show that our IMSL has achieved the highest performance on the RoBERTa-large pretrained model, with ACC and AUC results increased by about 1\% and 5\% respectively.

computational linguistic, hypothesis, information interaction layer, (13 more...)

arXiv.org Artificial Intelligence

2112.00284

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(15 more...)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

Top Program Construction and Reduction for polynomial time Meta-Interpretive Learning

Patsantzis, Stassa, Muggleton, Stephen H.

arXiv.org Artificial IntelligenceJan-13-2021

Meta-Interpretive Learners, like most ILP systems, learn by searching for a correct hypothesis in the hypothesis space, the powerset of all constructible clauses. We show how this exponentially-growing search can be replaced by the construction of a Top program: the set of clauses in all correct hypotheses that is itself a correct hypothesis. We give an algorithm for Top program construction and show that it constructs a correct Top program in polynomial time and from a finite number of examples. We implement our algorithm in Prolog as the basis of a new MIL system, Louise, that constructs a Top program and then reduces it by removing redundant clauses. We compare Louise to the state-of-the-art search-based MIL system Metagol in experiments on grid world navigation, graph connectedness and grammar learning datasets and find that Louise improves on Metagol's predictive accuracy when the hypothesis space and the target theory are both large, or when the hypothesis space does not include a correct hypothesis because of "classification noise" in the form of mislabelled examples. When the hypothesis space or the target theory are small, Louise and Metagol perform equally well.

correct hypothesis, hypothesis, metagol, (11 more...)

arXiv.org Artificial Intelligence

2101.0505

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

Add feedback

Delegative Reinforcement Learning: learning to avoid traps with a little help

Kosoy, Vanessa

arXiv.org Machine LearningJul-19-2019

Most known regret bounds for reinforcement learning are either episodic or assume an environment without traps. We derive a regret bound without making either assumption, by allowing the algorithm to occasionally delegate an action to an external advisor. We thus arrive at a setting of active one-shot model-based reinforcement learning that we call DRL (delegative reinforcement learning.) The algorithm we construct in order to demonstrate the regret bound is a variant of Posterior Sampling Reinforcement Learning supplemented by a subroutine that decides which actions should be delegated. The algorithm is not anytime, since the parameters must be adjusted according to the target time discount. Currently, our analysis is limited to Markov decision processes with finite numbers of hypotheses, states and actions.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

1907.08461

Country: North America > United States (0.93)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback