AITopics | Problem Solving

Collaborating Authors

Problem Solving

News Overviews Instructional Materials AI-Alerts Classics

Tracing and Manipulating Intermediate Values in Neural Math Problem Solvers

Matsumoto, Yuta, Heinzerling, Benjamin, Yoshikawa, Masashi, Inui, Kentaro

arXiv.org Artificial IntelligenceJan-17-2023

How language models process complex input that requires multiple steps of inference is not well understood. Previous research has shown that information about intermediate values of these inputs can be extracted from the activations of the models, but it is unclear where that information is encoded and whether that information is indeed used during inference. We introduce a method for analyzing how a Transformer model processes these inputs by focusing on simple arithmetic problems and their intermediate values. To trace where information about intermediate values is encoded, we measure the correlation between intermediate values and the activations of the model using principal component analysis (PCA). Then, we perform a causal intervention by manipulating model weights. This intervention shows that the weights identified via tracing are not merely correlated with intermediate values, but causally related to model predictions. Our findings show that the model has a locality to certain intermediate values, and this is useful for enhancing the interpretability of the models.

artificial intelligence, intermediate value, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2301.06758

Country:

Asia > Japan > Honshū > Tōhoku (0.06)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Add feedback

Neuro-Symbolic World Models for Adapting to Open World Novelty

Balloch, Jonathan, Lin, Zhiyu, Wright, Robert, Peng, Xiangyu, Hussain, Mustafa, Srinivas, Aarun, Kim, Julia, Riedl, Mark O.

arXiv.org Artificial IntelligenceJan-16-2023

Open-world novelty--a sudden change in the mechanics or properties of an environment--is a common occurrence in the real world. Novelty adaptation is an agent's ability to improve its policy performance post-novelty. Most reinforcement learning (RL) methods assume that the world is a closed, fixed process. Consequentially, RL policies adapt inefficiently to novelties. To address this, we introduce WorldCloner, an end-to-end trainable neuro-symbolic world model for rapid novelty adaptation. WorldCloner learns an efficient symbolic representation of the pre-novelty environment transitions, and uses this transition model to detect novelty and efficiently adapt to novelty in a single-shot fashion. Additionally, WorldCloner augments the policy learning process using imagination-based adaptation, where the world model simulates transitions of the post-novelty environment to help the policy adapt. By blending ''imagined'' transitions with interactions in the post-novelty environment, performance can be recovered with fewer total environment interactions. Using environments designed for studying novelty in sequential decision-making problems, we show that the symbolic world model helps its neural policy adapt more efficiently than model-based and model-based neural-only reinforcement learning methods.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2301.06294

Country: North America > United States (0.05)

Genre: Research Report > New Finding (0.46)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Top-Down Synthesis for Library Learning

Bowers, Matthew, Olausson, Theo X., Wong, Lionel, Grand, Gabriel, Tenenbaum, Joshua B., Ellis, Kevin, Solar-Lezama, Armando

arXiv.org Artificial IntelligenceJan-15-2023

This paper introduces corpus-guided top-down synthesis as a mechanism for synthesizing library functions that capture common functionality from a corpus of programs in a domain specific language (DSL). The algorithm builds abstractions directly from initial DSL primitives, using syntactic pattern matching of intermediate abstractions to intelligently prune the search space and guide the algorithm towards abstractions that maximally capture shared structures in the corpus. We present an implementation of the approach in a tool called Stitch and evaluate it against the state-of-the-art deductive library learning algorithm from DreamCoder. Our evaluation shows that Stitch is 3-4 orders of magnitude faster and uses 2 orders of magnitude less memory while maintaining comparable or better library quality (as measured by compressivity). We also demonstrate Stitch's scalability on corpora containing hundreds of complex programs that are intractable with prior deductive approaches and show empirically that it is robust to terminating the search procedure early -- further allowing it to scale to challenging datasets by means of early stopping.

artificial intelligence, machine learning, top-down synthesis, (1 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3571234

2211.16605

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Continuing Education (0.60)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)
Information Technology > Artificial Intelligence > Machine Learning (0.53)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.53)

Add feedback

NASA unveils plan for next-gen telescope to search space for signs of life: reports

FOX NewsJan-14-2023, 18:46:46 GMT

Veteran NASA astronaut Tom Jones recaps the historic Artemis I mission after the Orion capsule made a successful return to earth and outlines what this means for the lunar return program. The Habitable Worlds Observatory was announced Monday at the latest American Astronomical Society meeting, and its goal is searching for signs of life on habitable exoplanets. Space.com said on Friday that the observatory will need a powerful coronograph, which is an instrument that allows scientists to study faint objects. Mark Clampin, the director of NASA's astrophysics division, reportedly said that the agency would approach the project as if it faced a strict launch window, building on previous technology used for the Nancy Grace Roman Space Telescope as well as Webb. FILE - In this April 13, 2017, photo provided by NASA, technicians lift the mirror of the James Webb Space Telescope using a crane at the Goddard Space Flight Center in Greenbelt, Maryland.

artificial intelligence, nasa, space telescope, (7 more...)

FOX News

Country: North America > United States > Maryland > Prince George's County > Greenbelt (0.29)

Industry:

Government > Space Agency (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.40)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.40)

Add feedback

World Models and Predictive Coding for Cognitive and Developmental Robotics: Frontiers and Challenges

Taniguchi, Tadahiro, Murata, Shingo, Suzuki, Masahiro, Ognibene, Dimitri, Lanillos, Pablo, Ugur, Emre, Jamone, Lorenzo, Nakamura, Tomoaki, Ciria, Alejandra, Lara, Bruno, Pezzulo, Giovanni

arXiv.org Artificial IntelligenceJan-14-2023

Creating autonomous robots that can actively explore the environment, acquire knowledge and learn skills continuously is the ultimate achievement envisioned in cognitive and developmental robotics. Their learning processes should be based on interactions with their physical and social world in the manner of human learning and cognitive development. Based on this context, in this paper, we focus on the two concepts of world models and predictive coding. Recently, world models have attracted renewed attention as a topic of considerable interest in artificial intelligence. Cognitive systems learn world models to better predict future sensory observations and optimize their policies, i.e., controllers. Alternatively, in neuroscience, predictive coding proposes that the brain continuously predicts its inputs and adapts to model its own dynamics and control behavior in its environment. Both ideas may be considered as underpinning the cognitive development of robots and humans capable of continual or lifelong learning. Although many studies have been conducted on predictive coding in cognitive robotics and neurorobotics, the relationship between world model-based approaches in AI and predictive coding in robotics has rarely been discussed. Therefore, in this paper, we clarify the definitions, relationships, and status of current research on these topics, as well as missing pieces of world models and predictive coding in conjunction with crucially related concepts such as the free-energy principle and active inference in the context of cognitive and developmental robotics. Furthermore, we outline the frontiers and challenges involved in world models and predictive coding toward the further integration of AI and robotics, as well as the creation of robots with real cognitive and developmental capabilities in the future.

artificial intelligence, machine learning, world model, (16 more...)

arXiv.org Artificial Intelligence

2301.05832

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Italy (0.04)
(9 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Leisure & Entertainment > Games (1.00)
Law > Litigation (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
(3 more...)

Add feedback

Infusing Commonsense World Models with Graph Knowledge

Gurung, Alexander, Komeili, Mojtaba, Szlam, Arthur, Weston, Jason, Urbanek, Jack

arXiv.org Artificial IntelligenceJan-13-2023

While language models have become more capable of producing compelling language, we find there are still gaps in maintaining consistency, especially when describing events in a dynamically changing world. We study the setting of generating narratives in an open world text adventure game, where a graph representation of the underlying game state can be used to train models that consume and output both grounded graph representations and natural language descriptions and actions. We build a large set of tasks by combining crowdsourced and simulated gameplays with a novel dataset of complex actions in order to to construct such models. We find it is possible to improve the consistency of action narration models by training on graph contexts and targets, even if graphs are not present at test time. This is shown both in automatic metrics and human evaluations. We plan to release our code, the new set of tasks, and best performing models.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2301.05746

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

Imitation Learning-based Implicit Semantic-aware Communication Networks: Multi-layer Representation and Collaborative Reasoning

Xiao, Yong, Sun, Zijian, Shi, Guangming, Niyato, Dusit

arXiv.org Artificial IntelligenceJan-13-2023

Semantic communication has recently attracted significant interest from both industry and academia due to its potential to transform the existing data-focused communication architecture towards a more generally intelligent and goal-oriented semantic-aware networking system. Despite its promising potential, semantic communications and semantic-aware networking are still at their infancy. Most existing works focus on transporting and delivering the explicit semantic information, e.g., labels or features of objects, that can be directly identified from the source signal. The original definition of semantics as well as recent results in cognitive neuroscience suggest that it is the implicit semantic information, in particular the hidden relations connecting different concepts and feature items that plays the fundamental role in recognizing, communicating, and delivering the real semantic meanings of messages. Motivated by this observation, we propose a novel reasoning-based implicit semantic-aware communication network architecture that allows multiple tiers of CDC and edge servers to collaborate and support efficient semantic encoding, decoding, and interpretation for end-users. We introduce a new multi-layer representation of semantic information taking into consideration both the hierarchical structure of implicit semantics as well as the personalized inference preference of individual users. We model the semantic reasoning process as a reinforcement learning process and then propose an imitation-based semantic reasoning mechanism learning (iRML) solution for the edge servers to leaning a reasoning policy that imitates the inference behavior of the source user. A federated GCN-based collaborative reasoning solution is proposed to allow multiple edge servers to jointly construct a shared semantic interpretation model based on decentralized knowledge datasets.

artificial intelligence, learning-based implicit semantic-aware communication network, machine learning, (1 more...)

arXiv.org Artificial Intelligence

2210.16118

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.53)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks (0.89)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.53)

Add feedback

Neuro-Symbolic Spatio-Temporal Reasoning

Lee, Jae Hee, Sioutis, Michael, Ahrens, Kyra, Alirezaie, Marjan, Kerzel, Matthias, Wermter, Stefan

arXiv.org Artificial IntelligenceJan-13-2023

Knowledge about space and time is necessary to solve problems in the physical world: An AI agent situated in the physical world and interacting with objects often needs to reason about positions of and relations between objects; and as soon as the agent plans its actions to solve a task, it needs to consider the temporal aspect (e.g., what actions to perform over time). Spatio-temporal knowledge, however, is required beyond interacting with the physical world, and is also often transferred to the abstract world of concepts through analogies and metaphors (e.g., "a threat that is hanging over our heads"). As spatial and temporal reasoning is ubiquitous, different attempts have been made to integrate this into AI systems. In the area of knowledge representation, spatial and temporal reasoning has been largely limited to modeling objects and relations and developing reasoning methods to verify statements about objects and relations. On the other hand, neural network researchers have tried to teach models to learn spatial relations from data with limited reasoning capabilities. Bridging the gap between these two approaches in a mutually beneficial way could allow us to tackle many complex real-world problems, such as natural language processing, visual question answering, and semantic image segmentation. In this chapter, we view this integration problem from the perspective of Neuro-Symbolic AI. Specifically, we propose a synergy between logical reasoning and machine learning that will be grounded on spatial and temporal knowledge. Describing some successful applications, remaining challenges, and evaluation datasets pertaining to this direction is the main topic of this contribution.

artificial intelligence, machine learning, temporal reasoning, (16 more...)

arXiv.org Artificial Intelligence

2211.15566

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Kansas (0.04)
(6 more...)

Genre:

Research Report (0.50)
Overview (0.46)

Industry:

Transportation > Ground > Road (0.93)
Automobiles & Trucks (0.68)
Transportation > Passenger (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Temporal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning

Chen, Zhenfang, Zhou, Qinhong, Shen, Yikang, Hong, Yining, Zhang, Hao, Gan, Chuang

arXiv.org Artificial IntelligenceJan-12-2023

Large pre-trained vision and language models have demonstrated remarkable capacities for various tasks. However, solving the knowledge-based visual reasoning tasks remains challenging, which requires a model to comprehensively understand image content, connect the external world knowledge, and perform step-by-step reasoning to answer the questions correctly. To this end, we propose a novel framework named Interactive Prompting Visual Reasoner (IPVR) for few-shot knowledge-based visual reasoning. IPVR contains three stages, see, think and confirm. The see stage scans the image and grounds the visual concept candidates with a visual perception model. The think stage adopts a pre-trained large language model (LLM) to attend to the key concepts from candidates adaptively. It then transforms them into text context for prompting with a visual captioning model and adopts the LLM to generate the answer. The confirm stage further uses the LLM to generate the supporting rationale to the answer, verify the generated rationale with a cross-modality classifier and ensure that the rationale can infer the predicted output consistently. We conduct experiments on a range of knowledge-based visual reasoning datasets. We found our IPVR enjoys several benefits, 1). it achieves better performance than the previous few-shot learning baselines; 2). it enjoys the total transparency and trustworthiness of the whole reasoning process by providing rationales for each reasoning step; 3). it is computation-efficient compared with other fine-tuning baselines.

artificial intelligence, large language model, natural language, (20 more...)

arXiv.org Artificial Intelligence

2301.05226

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (1.00)

Industry:

Consumer Products & Services > Restaurants (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.86)

Add feedback

Estimation of User's World Model Using Graph2vec

Sakai, Tatsuya, Nagai, Takayuki

arXiv.org Artificial IntelligenceJan-10-2023

To obtain advanced interaction between autonomous robots and users, robots should be able to distinguish their state space representations (i.e., world models). Herein, a novel method was proposed for estimating the user's world model based on queries. In this method, the agent learns the distributed representation of world models using graph2vec and generates concept activation vectors that represent the meaning of queries in the latent space. Experimental results revealed that the proposed method can estimate the user's world model more efficiently than the simple method of using the ``AND'' search of queries.

artificial intelligence, query, world model, (17 more...)

arXiv.org Artificial Intelligence

2301.03793

Country: Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback