AITopics | Problem Solving

Collaborating Authors

Problem Solving

News Overviews Instructional Materials AI-Alerts Classics

FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems

He, Yiming, Zou, Jia, Zhang, Xiaokai, Zhu, Na, Leng, Tuo

arXiv.org Artificial IntelligenceFeb-14-2024

The application of contemporary artificial intelligence techniques to address geometric problems and automated deductive proof has always been a grand challenge to the interdiscipline field of mathematics and artificial Intelligence. This is the fourth article in a series of our works, in our previous work, we established of a geometric formalized system known as FormalGeo. Moreover we annotated approximately 7000 geometric problems, forming the FormalGeo7k dataset. Despite the FGPS (Formal Geometry Problem Solver) can achieve interpretable algebraic equation solving and human-like deductive reasoning, it often experiences timeouts due to the complexity of the search strategy. In this paper, we introduced FGeo-TP (Theorem Predictor), which utilizes the language model to predict theorem sequences for solving geometry problems. We compared the effectiveness of various Transformer architectures, such as BART or T5, in theorem prediction, implementing pruning in the search process of FGPS, thereby improving its performance in solving geometry problems. Our results demonstrate a significant increase in the problem-solving rate of the language model-enhanced FGeo-TP on the FormalGeo7k dataset, rising from 39.7% to 80.86%. Furthermore, FGeo-TP exhibits notable reductions in solving time and search steps across problems of varying difficulty levels.

formalgeo7k dataset, theorem, theorem sequence, (10 more...)

arXiv.org Artificial Intelligence

2402.09047

Country: Asia > China > Shanghai > Shanghai (0.05)

Genre: Research Report > New Finding (0.68)

Industry: Education > Educational Setting > K-12 Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model

Khona, Mikail, Okawa, Maya, Hula, Jan, Ramesh, Rahul, Nishi, Kento, Dick, Robert, Lubana, Ekdeep Singh, Tanaka, Hidenori

arXiv.org Artificial IntelligenceFeb-12-2024

Stepwise inference protocols, such as scratchpads and chain-of-thought, help language models solve complex problems by decomposing them into a sequence of simpler subproblems. Despite the significant gain in performance achieved via these protocols, the underlying mechanisms of stepwise inference have remained elusive. To address this, we propose to study autoregressive Transformer models on a synthetic task that embodies the multi-step nature of problems where stepwise inference is generally most useful. Specifically, we define a graph navigation problem wherein a model is tasked with traversing a path from a start to a goal node on the graph. Despite is simplicity, we find we can empirically reproduce and analyze several phenomena observed at scale: (i) the stepwise inference reasoning gap, the cause of which we find in the structure of the training data; (ii) a diversity-accuracy tradeoff in model generations as sampling temperature varies; (iii) a simplicity bias in the model's output; and (iv) compositional generalization and a primacy bias with in-context exemplars. Overall, our work introduces a grounded, synthetic framework for studying stepwise inference and offers mechanistic hypotheses that can lay the foundation for a deeper understanding of this phenomenon.

graph, node, stepwise inference, (12 more...)

arXiv.org Artificial Intelligence

2402.07757

Country:

Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

Ye, Jiacheng, Gong, Shansan, Chen, Liheng, Zheng, Lin, Gao, Jiahui, Shi, Han, Wu, Chuan, Li, Zhenguo, Bi, Wei, Kong, Lingpeng

arXiv.org Artificial IntelligenceFeb-12-2024

Diffusion models have gained attention in text processing, offering many potential advantages over traditional autoregressive models. This work explores the integration of diffusion models and Chain-of-Thought (CoT), a well-established technique to improve the reasoning ability in autoregressive language models. We propose Diffusion-of-Thought (DoT), allowing reasoning steps to diffuse over time through the diffusion process. In contrast to traditional autoregressive language models that make decisions in a left-to-right, token-by-token manner, DoT offers more flexibility in the trade-off between computation and reasoning performance. Our experimental results demonstrate the effectiveness of DoT in multi-digit multiplication and grade school math problems. Additionally, DoT showcases promising self-correction abilities and benefits from existing reasoning-enhancing techniques like self-consistency decoding. Our findings contribute to the understanding and development of reasoning capabilities in diffusion language models.

arxiv preprint, diffusion model, language model, (10 more...)

arXiv.org Artificial Intelligence

2402.07754

Country:

Asia > Indonesia > Bali (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
(2 more...)

Add feedback

Bootstrapping Developmental AIs: From Simple Competences to Intelligent Human-Compatible AIs

Stefik, Mark, Price, Robert

arXiv.org Artificial IntelligenceFeb-12-2024

Developmental AI is a bootstrapping approach where embodied AIs start with innate competences and learn by interacting with the world. They develop abilities in small steps along a bio-inspired trajectory. However, developmental AIs have not yet reached the abilities of young children. In contrast, mainstream approaches for creating AIs have led to valuable AI systems and impressive feats. These approaches include deep learning and generative approaches (e.g., large language models) and manually constructed symbolic approaches. Manually constructed AIs are brittle even in circumscribed domains. Generative AIs are helpful on average, but they can make strange mistakes and not notice them. They sometimes lack common sense and social alignment. This position paper lays out prospects, gaps, and challenges for augmenting AI mainstream approaches with developmental AI. The ambition is to create data-rich experientially based foundation models and human-compatible, resilient, and trustworthy AIs. This research aims to produce AIs that learn to communicate, establish common ground, read critically, consider the provenance of information, test hypotheses, and collaborate. A virtuous multidisciplinary research cycle has led to developmental AIs with capabilities for multimodal perception, object recognition, and manipulation. Computational models for hierarchical planning, abstraction discovery, curiosity, and language acquisition exist but need to be adapted to an embodied learning approach. They need to bridge competence gaps involving nonverbal communication, speech, reading, and writing. Aspirationally, developmental AIs would learn, share what they learn, and collaborate to achieve high standards. The approach would make the creation of AIs more democratic, enabling more people to train, test, build on, and replicate AIs.

bootstrapping developmental ais, incorporate high-level multi-dimensional data, lawrence erlbaum associate, (16 more...)

arXiv.org Artificial Intelligence

2308.04586

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(22 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Transportation > Ground > Road (1.00)
Media (1.00)
Leisure & Entertainment > Sports (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(8 more...)

Add feedback

BDIQA: A New Dataset for Video Question Answering to Explore Cognitive Reasoning through Theory of Mind

Mao, Yuanyuan, Lin, Xin, Ni, Qin, He, Liang

arXiv.org Artificial IntelligenceFeb-11-2024

As a foundational component of cognitive intelligence, theory of mind (ToM) can make AI more closely resemble human thought processes, thereby enhancing their interaction and collaboration with human. In particular, it can significantly improve a model's comprehension of videos in complex scenes. However, current video question answer (VideoQA) datasets focus on studying causal reasoning within events few of them genuinely incorporating human ToM. Consequently, there is a lack of development in ToM reasoning tasks within the area of VideoQA. This paper presents BDIQA, the first benchmark to explore the cognitive reasoning capabilities of VideoQA models in the context of ToM. BDIQA is inspired by the cognitive development of children's ToM and addresses the current deficiencies in machine ToM within datasets and tasks. Specifically, it offers tasks at two difficulty levels, assessing Belief, Desire and Intention (BDI) reasoning in both simple and complex scenarios. We conduct evaluations on several mainstream methods of VideoQA and diagnose their capabilities with zero shot, few shot and supervised learning. We find that the performance of pre-trained models on cognitive reasoning tasks remains unsatisfactory. To counter this challenge, we undertake thorough analysis and experimentation, ultimately presenting two guidelines to enhance cognitive reasoning derived from ablation analysis.

bdiqa, proc, reasoning, (15 more...)

arXiv.org Artificial Intelligence

2402.07402

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.66)
(2 more...)

Add feedback

Do Transformer World Models Give Better Policy Gradients?

Ma, Michel, Ni, Tianwei, Gehring, Clement, D'Oro, Pierluca, Bacon, Pierre-Luc

arXiv.org Artificial IntelligenceFeb-10-2024

A natural approach for reinforcement learning is to predict future rewards by unrolling a neural network world model, and to backpropagate through the resulting computational graph to learn a policy. However, this method often becomes impractical for long horizons since typical world models induce hard-to-optimize loss landscapes. Transformers are known to efficiently propagate gradients over long horizons: could they be the solution to this problem? Surprisingly, we show that commonly-used transformer world models produce circuitous gradient paths, which can be detrimental to long-range policy gradients. To tackle this challenge, we propose a class of world models called Actions World Models (AWMs), designed to provide more direct routes for gradient propagation. We integrate such AWMs into a policy gradient framework that underscores the relationship between network architectures and the policy gradient updates they inherently represent. We demonstrate that AWMs can generate optimization landscapes that are easier to navigate even when compared to those from the simulator itself. This property allows transformer AWMs to produce better policies than competitive baselines in realistic long-horizon tasks.

gradient, policy gradient, world model, (15 more...)

arXiv.org Artificial Intelligence

2402.0529

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(5 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.47)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Chen, Zhuo, Zhang, Yichi, Fang, Yin, Geng, Yuxia, Guo, Lingbing, Chen, Xiang, Li, Qian, Zhang, Wen, Chen, Jiaoyan, Zhu, Yushan, Li, Jiaqi, Liu, Xiaoze, Pan, Jeff Z., Zhang, Ningyu, Chen, Huajun

arXiv.org Artificial IntelligenceFeb-9-2024

Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the semantic web community's exploration into multi-modal dimensions unlocking new avenues for innovation. In this survey, we carefully review over 300 articles, focusing on KG-aware research in two principal aspects: KG-driven Multi-Modal (KG4MM) learning, where KGs support multi-modal tasks, and Multi-Modal Knowledge Graph (MM4KG), which extends KG studies into the MMKG realm. We begin by defining KGs and MMKGs, then explore their construction progress. Our review includes two primary task categories: KG-aware multi-modal learning tasks, such as Image Classification and Visual Question Answering, and intrinsic MMKG tasks like Multi-modal Knowledge Graph Completion and Entity Alignment, highlighting specific research trajectories. For most of these tasks, we provide definitions, evaluation benchmarks, and additionally outline essential insights for conducting relevant research. Finally, we discuss current challenges and identify emerging trends, such as progress in Large Language Modeling and Multi-modal Pre-training strategies. This survey aims to serve as a comprehensive reference for researchers already involved in or considering delving into KG and multi-modal learning research, offering insights into the evolving landscape of MMKG research and supporting future work.

multi-modal data, multi-modal learning, relation extraction, (16 more...)

arXiv.org Artificial Intelligence

2402.05391

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.13)
Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
(8 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment > Sports (1.00)
Information Technology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(2 more...)

Technology:

Information Technology > Communications > Web > Semantic Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
(10 more...)

Add feedback

Improving Token-Based World Models with Parallel Observation Prediction

Cohen, Lior, Wang, Kaixin, Kang, Bingyi, Mannor, Shie

arXiv.org Artificial IntelligenceFeb-8-2024

Motivated by the success of Transformers when applied to sequences of discrete symbols, token-based world models (TBWMs) were recently proposed as sample-efficient methods. In TBWMs, the world model consumes agent experience as a language-like sequence of tokens, where each observation constitutes a sub-sequence. However, during imagination, the sequential token-by-token generation of next observations results in a severe bottleneck, leading to long training times, poor GPU utilization, and limited representations. To resolve this bottleneck, we devise a novel Parallel Observation Prediction (POP) mechanism. POP augments a Retentive Network (RetNet) with a novel forward mode tailored to our reinforcement learning setting. We incorporate POP in a novel TBWM agent named REM (Retentive Environment Model), showcasing a 15.4x faster imagination compared to prior TBWMs. REM attains superhuman performance on 12 out of 26 games of the Atari 100K benchmark, while training in less than 12 hours. Our code is available at \url{https://github.com/leor-c/REM}.

large language model, machine learning, reinforcement learning, (22 more...)

arXiv.org Artificial Intelligence

2402.05643

Country:

North America > United States (0.28)
Africa > Rwanda (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (0.46)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.46)
Energy > Oil & Gas > Midstream (0.46)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
(3 more...)

Add feedback

Guiding Large Language Models with Divide-and-Conquer Program for Discerning Problem Solving

Zhang, Yizhou, Du, Lun, Cao, Defu, Fu, Qiang, Liu, Yan

arXiv.org Artificial IntelligenceFeb-7-2024

Foundation models, such as Large language Models (LLMs), have attracted significant amount of interest due to their large number of applications. Existing works show that appropriate prompt design, such as Chain-of-Thoughts, can unlock LLM's powerful capacity in diverse areas. However, when handling tasks involving repetitive sub-tasks and/or deceptive contents, such as arithmetic calculation and article-level fake news detection, existing prompting strategies either suffers from insufficient expressive power or intermediate errors triggered by hallucination. To make LLM more discerning to such intermediate errors, we propose to guide LLM with a Divide-and-Conquer program that simultaneously ensures superior expressive power and disentangles task decomposition, sub-task resolution, and resolution assembly process. Theoretic analysis reveals that our strategy can guide LLM to extend the expressive power of fixed-depth Transformer. Experiments indicate that our proposed method can achieve better performance than typical prompting strategies in tasks bothered by intermediate errors and deceptive contents, such as large integer multiplication, hallucination detection and misinformation detection.

discerning problem, divide-and-conquer program, language model, (13 more...)

arXiv.org Artificial Intelligence

2402.05359

Country:

North America > United States > California (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
Asia > China (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

Three Pathways to Neurosymbolic Reinforcement Learning with Interpretable Model and Policy Networks

Graf, Peter, Emami, Patrick

arXiv.org Artificial IntelligenceFeb-7-2024

Neurosymbolic AI combines the interpretability, parsimony, and explicit reasoning of classical symbolic approaches with the statistical learning of data-driven neural approaches. Models and policies that are simultaneously differentiable and interpretable may be key enablers of this marriage. This paper demonstrates three pathways to implementing such models and policies in a real-world reinforcement learning setting. Specifically, we study a broad class of neural networks that build interpretable semantics directly into their architecture. We reveal and highlight both the potential and the essential difficulties of combining logic, simulation, and learning. One lesson is that learning benefits from continuity and differentiability, but classical logic is discrete and non-differentiable. The relaxation to real-valued, differentiable representations presents a trade-off; the more learnable, the less interpretable. Another lesson is that using logic in the context of a numerical simulation involves a non-trivial mapping from raw (e.g., real-valued time series) simulation data to logical predicates. Some open questions this note exposes include: What are the limits of rule-based controllers, and how learnable are they? Do the differentiable interpretable approaches discussed here scale to large, complex, uncertain systems? Can we truly achieve interpretability? We highlight these and other themes across the three approaches.

arxiv preprint arxiv, ddt, emami differentiable, (12 more...)

arXiv.org Artificial Intelligence

2402.05307

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Energy > Renewable (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(4 more...)

Add feedback