AITopics

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.57)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.49)

Neural Information Processing SystemsOct-9-2024, 12:48:57 GMT

Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders

The Variational Auto-Encoder (VAE) is a popular method for learning a generative model and embeddings of the data. Many real datasets are hierarchically structured. We therefore endow VAEs with a Poincaré ball model of hyperbolic geometry as a latent space and rigorously derive the necessary methods to work with two main Gaussian generalisations on that space. We empirically show better generalisation to unseen data than the Euclidean counterpart, and can qualitatively and quantitatively better recover hierarchical structures.

continuous hierarchical representation, generalisation, variational auto-encoder, (1 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neural Information Processing SystemsOct-9-2024, 04:12:02 GMT

Reviews: Learning Chordal Markov Networks via Branch and Bound

The authors present a branch and bound algorithm for learning Chordal Markov networks. The prior state of the art algorithm is a dynamic programming approach based on a recursive characterization of clique tress and storing in memory the scores of already-solved subproblems. The proposed algorithm uses a branch and bound algorithm to search for an optimal chordal Markov network. The algorithm first uses a dynamic programming algorithm to enumerate Bayesian network structures, which are later used as pruning bounds. A symmetry breaking technique is introduced to prune the search space.

algorithm, branch and bound, learning chordal markov network, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.92)
(2 more...)

Banihashemi, Bita, De Giacomo, Giuseppe, Lespérance, Yves

Abstracting Situation Calculus Action Theories

arXiv.org Artificial IntelligenceOct-9-2024

We develop a general framework for agent abstraction based on the situation calculus and the ConGolog agent programming language. We assume that we have a high-level specification and a low-level specification of the agent, both represented as basic action theories. A refinement mapping specifies how each high-level action is implemented by a low-level ConGolog program and how each high-level fluent can be translated into a low-level formula. We define a notion of sound abstraction between such action theories in terms of the existence of a suitable bisimulation between their respective models. Sound abstractions have many useful properties that ensure that we can reason about the agent's actions (e.g., executability, projection, and planning) at the abstract level, and refine and concretely execute them at the low level. We also characterize the notion of complete abstraction where all actions (including exogenous ones) that the high level thinks can happen can in fact occur at the low level. To facilitate verifying that one has a sound/complete abstraction relative to a mapping, we provide a set of necessary and sufficient conditions. Finally, we identify a set of basic action theory constraints that ensure that for any low-level action sequence, there is a unique high-level action sequence that it refines. This allows us to track/monitor what the low-level agent is doing and describe it in abstract terms (i.e., provide high-level explanations, for instance, to a client or manager).

abstraction, logic & formal reasoning, natural language, (19 more...)

2410.14712

Country:

Oceania > Australia > Victoria > Melbourne (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(29 more...)

Genre:

Research Report (0.81)
Overview (0.67)
Workflow (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

arXiv.org Artificial IntelligenceOct-9-2024

Can Transformers Reason Logically? A Study in SAT Solving

Pan, Leyan, Ganesh, Vijay, Abernethy, Jacob, Esposo, Chris, Lee, Wenke

A PARAT "program" is basically a sequence of array operations over SOps. Throughout this section, we refer to the indices along the first dimension of an SOp as "position" and refer to indices along the second dimension as "dimension". The "inputs" to a program are arbitrary positional encoding and token embedding SOps, represented by the base class names PosEncSOp and TokEmbSOp respectively. For example, the OneHotTokEmb class represents the one-hot embedding of tokens and Indices represents the numerical value of the index of each position. The rest of the program performs various operations that compute new SOps based on existing ones. We provide implementations of basic building block operations including (but not limited to) the following: Mean(q, k, v) Represents the "Averaging Hard Attention" operation.

assignment, formula, opération, (17 more...)

2410.07432

Country:

North America > United States > Ohio > Cuyahoga County > Shaker Heights (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsOct-8-2024, 08:47:34 GMT

Reviews: Eigen-Distortions of Hierarchical Representations

The submission presents a method to generate image distortions that are maximally/minimally discriminable in a certain image representation. The maximally/minimally distortion directions are defined as the eigenvectors of the Fisher Information Matrix with largest/smallest eigenvalue. Distortions are generated for image representations in the VGG-16 as well as for representations in models that were trained to predict human sensitivity to image distortions. Human discrimination thresholds for those distortions are measured. It is found that the difference in human discrimination threshold between max and min distortions of the model is largest for a biologically inspired'early vision' model that was trained to predict human sensitivity, compared to a CNN trained to predict human sensitivity or the VGG-16 representations. For the VGG representations it is found that the difference in detection threshold for humans is larger for min/max distortions of earlier layers than for later layers.

distortion, predict human sensitivity, representation, (10 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.76)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.76)

O1 Replication Journey: A Strategic Progress Report -- Part 1

Qin, Yiwei, Li, Xuefeng, Zou, Haoyang, Liu, Yixiu, Xia, Shijie, Huang, Zhen, Ye, Yixin, Yuan, Weizhe, Liu, Hector, Li, Yuanzhi, Liu, Pengfei

This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey. In response to the announcement of OpenAI's groundbreaking O1 model, we embark on a transparent, real-time exploration to replicate its capabilities while reimagining the process of conducting and communicating AI research. Our methodology addresses critical challenges in modern AI research, including the insularity of prolonged team-based projects, delayed information sharing, and the lack of recognition for diverse contributions. By providing comprehensive, real-time documentation of our replication efforts, including both successes and failures, we aim to foster open science, accelerate collective advancement, and lay the groundwork for AI-driven scientific discovery. Our research progress report diverges significantly from traditional research papers, offering continuous updates, full process transparency, and active community engagement throughout the research journey. Technologically, we proposed the journey learning paradigm, which encourages models to learn not just shortcuts, but the complete exploration process, including trial and error, reflection, and backtracking. With only 327 training samples and without any additional tricks, journey learning outperformed conventional supervised learning by over 8\% on the MATH dataset, demonstrating its extremely powerful potential. We believe this to be the most crucial component of O1 technology that we have successfully decoded. We share valuable resources including technical hypotheses and insights, cognitive exploration maps, custom-developed tools, etc at https://github.com/GAIR-NLP/O1-Journey.

large language model, machine learning, natural language, (17 more...)

2410.18982

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > New York (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Parameter Choice and Neuro-Symbolic Approaches for Deep Domain-Invariant Learning

Dinu, Marius-Constantin

activation dropout fully-connected layer 128, generalist foundation model outcompete special-purpose, neural network meet neural-symbolic computing, (15 more...)

As artificial intelligence (AI) systems advance, we move towards broad AI: systems capable of performing well on diverse tasks, understanding context, and adapting rapidly to new scenarios. A central challenge for broad AI systems is to generalize over tasks in related domains and being robust to distribution shifts. Neuro-symbolic (NeSy) AI bridges the gap between symbolic and sub-symbolic paradigms to address these challenges, enabling adaptable, generalizable, and more interpretable systems. The development of broad AI requires advancements in domain adaptation (DA), enabling models trained on source domains to effectively generalize to unseen target domains. Traditional approaches often rely on parameter optimization and fine-tuning, which can be impractical due to high costs and risks of catastrophic forgetting. NeSy AI systems use multiple models and methods to generalize to unseen domains and maintain performance across varying conditions. We analyze common DA and NeSy approaches with a focus on deep domain-invariant learning, extending to real-world challenges such as adapting to continuously changing domains and handling large domain gaps. We showcase state-of-the-art model-selection methods for scenarios with limited samples and introduce domain-specific adaptations without gradient-based updates for cases where model tuning is infeasible. This work establishes a framework for scalable and generalizable broad AI systems applicable across various problem settings, demonstrating how symbolic reasoning and large language models can build universal computational graphs that generalize across domains and problems, contributing to more adaptable AI approaches for real-world applications.

2410.06235

Country:

Europe > Austria > Vienna (0.13)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(9 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Automobiles & Trucks (0.92)
Health & Medicine > Therapeutic Area > Neurology (0.92)
(3 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
(12 more...)

Optimizing AI Reasoning: A Hamiltonian Dynamics Approach to Multi-Hop Question Answering

Marin, Javier

This paper introduces an innovative approach to analyzing and improving multi-hop reasoning in AI systems by drawing inspiration from Hamiltonian mechanics. We propose a novel framework that maps reasoning chains in embedding spaces to Hamiltonian systems, allowing us to leverage powerful analytical tools from classical physics. Our method defines a Hamiltonian function that balances the progression of reasoning (kinetic energy) against the relevance to the question at hand (potential energy). Using this framework, we analyze a large dataset of reasoning chains from a multi-hop question-answering task, revealing intriguing patterns that distinguish valid from invalid reasoning. We show that valid reasoning chains have lower Hamiltonian energy and move in ways that make the best trade-off between getting more information and answering the right question. Furthermore, we demonstrate the application of this framework to steer the creation of more efficient reasoning algorithms within AI systems. Our results not only provide new insights into the nature of valid reasoning but also open up exciting possibilities for physics-inspired approaches to understanding and improving artificial intelligence.

reasoning, reasoning process, trajectory, (13 more...)

2410.04415

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Energy (0.68)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(2 more...)

Beyond Forecasting: Compositional Time Series Reasoning for End-to-End Task Execution

Ye, Wen, Zhang, Yizhou, Yang, Wei, Tang, Lumingyuan, Cao, Defu, Cai, Jie, Liu, Yan

In recent decades, there has been substantial advances in time series models and benchmarks across various individual tasks, such as time series forecasting, classification, and anomaly detection. Meanwhile, compositional reasoning in time series is prevalent in real-world applications (e.g., decision-making and compositional question answering) and is in great demand. Unlike simple tasks that primarily focus on predictive accuracy, compositional reasoning emphasizes the synthesis of diverse information from both time series data and various domain knowledge, making it distinct and extremely more challenging. In this paper, we introduce Compositional Time Series Reasoning, a new task of handling intricate multistep reasoning tasks from time series data. Specifically, this new task focuses on various question instances requiring structural and compositional reasoning abilities on time series data, such as decision-making and compositional question answering. As an initial attempt to tackle this novel task, we developed TS-Reasoner, a program-aided approach that utilizes large language model (LLM) to decompose a complex task into steps of programs that leverage existing time series models and numerical subroutines. Unlike existing reasoning work which only calls off-the-shelf modules, TS-Reasoner allows for the creation of custom modules and provides greater flexibility to incorporate domain knowledge as well as user-specified constraints. We demonstrate the effectiveness of our method through a comprehensive set of experiments. These promising results indicate potential opportunities in the new task of time series reasoning and highlight the need for further research.

prediction, reasoning, time sery, (11 more...)

2410.04047

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine (1.00)
Energy (1.00)
Banking & Finance > Trading (0.68)
Banking & Finance > Economy (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)