AITopics

2504.01222

Country: Europe > Netherlands (0.14)

Genre: Research Report > Experimental Study (0.68)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search

Wang, Yikun, Wang, Siyin, Cheng, Qinyuan, Fei, Zhaoye, Ding, Liang, Guo, Qipeng, Tao, Dacheng, Qiu, Xipeng

Recent advancements in Large Vision-Language Models have showcased remarkable capabilities. However, they often falter when confronted with complex reasoning tasks that humans typically address through visual aids and deliberate, step-by-step thinking. While existing methods have explored text-based slow thinking or rudimentary visual assistance, they fall short of capturing the intricate, interleaved nature of human visual-verbal reasoning processes. To overcome these limitations and inspired by the mechanisms of slow thinking in human cognition, we introduce VisuoThink, a novel framework that seamlessly integrates visuospatial and linguistic domains. VisuoThink facilitates multimodal slow thinking by enabling progressive visual-textual reasoning and incorporates test-time scaling through look-ahead tree search. Extensive experiments demonstrate that VisuoThink significantly enhances reasoning capabilities via inference-time scaling, even without fine-tuning, achieving state-of-the-art performance in tasks involving geometry and spatial reasoning.

large language model, machine learning, natural language, (20 more...)

2504.0913

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Rachmut, Ben, Zivan, Roie, Yeoh, William

Latency-Aware 2-Opt Monotonic Local Search for Distributed Constraint Optimization

Researchers recently extended Distributed Constraint Optimization Problems (DCOPs) to Communication-Aware DCOPs (CA-DCOPs) so that they are applicable in scenarios in which messages can be arbitrarily delayed. Distributed asynchronous local search and inference algorithms designed for CA-DCOPs are less vulnerable to message latency than their counterparts for regular DCOPs. However, unlike local search algorithms for regular DCOPs, which converge to k-opt solutions with k > 1 (that is, solutions that cannot be improved by a group of k agents), local search CA-DCOP algorithms are limited to 1-opt solutions only. In this paper, we introduce Latency-Aware Monotonic Distributed Local Search-2 (LAMDLS-2), where agents form pairs and coordinate bilateral assignment replacements. LAMDLS-2 is monotonic, converges to a 2-opt solution, and is also robust to message latency, making it suitable for CA-DCOPs. Our results indicate that LAMDLS-2 converges faster than MGM-2, a benchmark algorithm, to a similar 2-opt solution, in various message latency scenarios.
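To make the 1-opt versus 2-opt distinction concrete, here is a minimal centralized sketch that brute-forces whether an assignment can be improved by any group of at most k agents. It is not LAMDLS-2 or MGM-2, which are distributed message-passing algorithms; the function names, the toy cost table, and the "at most k" reading of k-opt are assumptions made for illustration.

```python
from itertools import combinations, product


def total_cost(assignment, constraints):
    """Sum the binary constraint costs of a complete assignment."""
    return sum(table[(assignment[i], assignment[j])]
               for (i, j), table in constraints.items())


def is_k_opt(assignment, domains, constraints, k):
    """Return True if no group of at most k agents can lower the total cost
    by jointly changing their values (brute force, tiny problems only)."""
    base = total_cost(assignment, constraints)
    agents = list(assignment)
    for size in range(1, k + 1):
        for group in combinations(agents, size):
            for values in product(*(domains[a] for a in group)):
                candidate = dict(assignment)
                candidate.update(zip(group, values))
                if total_cost(candidate, constraints) < base:
                    return False
    return True


# Hypothetical toy problem: two agents that must match, where (1, 1) is
# cheaper than (0, 0) but any unilateral move is expensive.
domains = {"a": [0, 1], "b": [0, 1]}
constraints = {("a", "b"): {(0, 0): 1, (0, 1): 5, (1, 0): 5, (1, 1): 0}}
stuck = {"a": 0, "b": 0}

print(is_k_opt(stuck, domains, constraints, 1))  # no single agent can improve
print(is_k_opt(stuck, domains, constraints, 2))  # but a coordinated pair can
```

In this toy case the assignment (0, 0) is 1-opt yet not 2-opt, which is the kind of local optimum that a pairwise-coordinating algorithm can escape and a purely unilateral one cannot.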

agent, artificial intelligence, constraint-based reasoning, (14 more...)

doi: 10.4230/LIPIcs.CP.2024.24

2504.08737

Genre:

Research Report > New Finding (0.87)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Ekanayake, E. M. H. E. B., Shukla, N.

Different Paths, Same Destination: Designing New Physics-Inspired Dynamical Systems with Engineered Stability to Minimize the Ising Hamiltonian

Oscillator Ising machines (OIMs) represent an exemplar case of using physics-inspired non-linear dynamical systems to solve computationally challenging combinatorial optimization problems (COPs). The computational performance of such systems is highly sensitive to the underlying dynamical properties, the topology of the input graph, and their relative compatibility. In this work, we explore the concept of designing different dynamical systems that minimize the same objective function but exhibit drastically different dynamical properties. Our goal is to leverage this diversification in dynamics to reduce the sensitivity of the computational performance to the underlying graph, and subsequently, enhance the overall effectiveness of such physics-based computational methods. To this end, we introduce a novel dynamical system, the Dynamical Ising Machine (DIM), which, like the OIM, minimizes the Ising Hamiltonian but offers significantly different dynamical properties. We analyze the characteristic properties of the DIM and compare them with those of the OIM. We also show that the relative performance of each model is dependent on the input graph. Our work illustrates that using multiple dynamical systems with varying properties to solve the same COP enables an effective method that is less sensitive to the input graph, while producing robust solutions.
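For readers unfamiliar with the objective that both the OIM and the DIM minimize, the following sketch evaluates an Ising Hamiltonian and finds its minimum by exhaustive enumeration. It assumes the common convention H(s) = -Σ J_ij s_i s_j with spins in {-1, +1} and no external field; sign conventions vary between papers, and the listing does not state which one the authors use. The helper names and the triangle instance are hypothetical, and the code does not model either machine's dynamics.

```python
from itertools import product


def ising_energy(spins, couplings):
    """H(s) = -sum over edges (a, b) of J_ab * s_a * s_b, with s in {-1, +1}."""
    return -sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())


def brute_force_ground_state(n, couplings):
    """Enumerate all 2^n spin configurations; feasible only for tiny n."""
    return min(product((-1, 1), repeat=n),
               key=lambda s: ising_energy(s, couplings))


# Hypothetical instance: a triangle with antiferromagnetic couplings J = -1.
# Under this convention, minimizing H is equivalent to maximizing the number
# of edges whose endpoints have opposite spins (MaxCut on the triangle).
triangle = {(0, 1): -1, (1, 2): -1, (0, 2): -1}
best = brute_force_ground_state(3, triangle)
print(best, ising_energy(best, triangle))
```

The triangle is frustrated: no configuration can make all three edges disagree, so the minimum energy leaves one edge unsatisfied.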

artificial intelligence, optimization problem, scientific computing, (19 more...)

2504.0628

Country: North America > United States > Virginia (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Scientific Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.35)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.35)

Structuring Scientific Innovation: A Framework for Modeling and Discovering Impactful Knowledge Combinations

Chen, Junlan, Zhang, Kexin, Li, Daifeng, Feng, Yangyang, Zhang, Yuxuan, Deng, Bowen

The emergence of large language models (LLMs) offers new possibilities for structured exploration of scientific knowledge. Rather than viewing scientific discovery as isolated ideas or content, we propose a structured approach that emphasizes the role of method combinations in shaping disruptive insights. Specifically, we investigate how knowledge units--especially those tied to methodological design--can be modeled and recombined to yield research breakthroughs. Our proposed framework addresses two key challenges. First, we introduce a contrastive learning-based mechanism to identify distinguishing features of historically disruptive method combinations within problem-driven contexts. Second, we propose a reasoning-guided Monte Carlo search algorithm that leverages the chain-of-thought capability of LLMs to identify promising knowledge recombinations for new problem statements. Empirical studies across multiple domains show that the framework is capable of modeling the structural dynamics of innovation and successfully highlights combinations with high disruptive potential. This research provides a new path for computationally guided scientific ideation grounded in structured reasoning and historical data modeling.

large language model, machine learning, natural language, (18 more...)

2503.18865

Genre:

Research Report > New Finding (0.69)
Research Report > Experimental Study (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence, Apr-14-2025

II-NVM: Enhancing Map Accuracy and Consistency with Normal Vector-Assisted Mapping

Zhao, Chengwei, Li, Yixuan, Jian, Yina, Xu, Jie, Wang, Linji, Ma, Yongxin, Jin, Xinglai

SLAM technology plays a crucial role in indoor mapping and localization. A common challenge in indoor environments is the "double-sided mapping issue", where closely positioned walls, doors, and other surfaces are mistakenly identified as a single plane, significantly hindering map accuracy and consistency. To address this issue, this paper introduces a SLAM approach that ensures accurate mapping using normal vector consistency. We enhance the voxel map structure to store both point cloud data and normal vector information, enabling the system to evaluate consistency during nearest neighbor searches and map updates. This process distinguishes between the front and back sides of surfaces, preventing incorrect point-to-plane constraints. Moreover, we implement an adaptive radius KD-tree search method that dynamically adjusts the search radius based on the local density of the point cloud, thereby enhancing the accuracy of normal vector calculations. To further improve real-time performance and storage efficiency, we incorporate a Least Recently Used (LRU) cache strategy, which facilitates efficient incremental updates of the voxel map. The code is released as open-source and validated in both simulated environments and real indoor scenarios. Experimental results demonstrate that this approach effectively resolves the "double-sided mapping issue" and significantly improves mapping precision. Additionally, we have developed and open-sourced the first simulation and real-world dataset specifically tailored for the "double-sided mapping issue".
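The combination of a normal-aware voxel map and LRU eviction described above can be sketched roughly as follows. This is an illustrative toy, not the authors' released code: the class name, the dot-product threshold `min_dot`, and the per-voxel list layout are all assumptions, and the adaptive-radius KD-tree search is not modeled.

```python
from collections import OrderedDict


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


class LRUVoxelMap:
    """Voxel hash map that stores (point, normal) pairs and evicts the
    least recently used voxel once `capacity` voxels are held."""

    def __init__(self, capacity, voxel_size):
        self.capacity = capacity
        self.voxel_size = voxel_size
        self.voxels = OrderedDict()  # voxel key -> list of (point, normal)

    def _key(self, point):
        # Integer grid cell containing the point (floor division per axis).
        return tuple(int(c // self.voxel_size) for c in point)

    def insert(self, point, normal):
        key = self._key(point)
        if key in self.voxels:
            self.voxels.move_to_end(key)  # mark as most recently used
        else:
            if len(self.voxels) >= self.capacity:
                self.voxels.popitem(last=False)  # evict least recently used
            self.voxels[key] = []
        self.voxels[key].append((point, normal))

    def consistent_neighbors(self, point, normal, min_dot=0.0):
        """Return stored points in the query's voxel whose normals face the
        same side as `normal`, so opposite faces of a thin wall are skipped."""
        key = self._key(point)
        if key not in self.voxels:
            return []
        self.voxels.move_to_end(key)
        return [p for p, n in self.voxels[key] if dot(n, normal) > min_dot]


# Hypothetical thin wall: two nearby points with opposite-facing normals.
vm = LRUVoxelMap(capacity=2, voxel_size=1.0)
vm.insert((0.1, 0.1, 0.1), (1.0, 0.0, 0.0))   # front face
vm.insert((0.2, 0.1, 0.1), (-1.0, 0.0, 0.0))  # back face
vm.insert((5.5, 0.0, 0.0), (0.0, 1.0, 0.0))   # a second voxel
front = vm.consistent_neighbors((0.15, 0.1, 0.1), (1.0, 0.0, 0.0))
vm.insert((9.5, 0.0, 0.0), (0.0, 0.0, 1.0))   # third voxel forces an eviction
```

Because the query touched the wall voxel just before the third voxel arrived, the untouched second voxel is the one evicted, and the query itself returns only the front-face point.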

artificial intelligence, information retrieval, natural language, (14 more...)

2504.08204

Country:

Asia > China (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.55)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.54)

arXiv.org Artificial Intelligence, Apr-14-2025

The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search

Yamada, Yutaro, Lange, Robert Tjarko, Lu, Cong, Hu, Shengran, Lu, Chris, Foerster, Jakob, Clune, Jeff, Ha, David

AI is increasingly playing a pivotal role in transforming how scientific discoveries are made. We introduce The AI Scientist-v2, an end-to-end agentic system capable of producing the first entirely AI-generated peer-review-accepted workshop paper. This system iteratively formulates scientific hypotheses, designs and executes experiments, analyzes and visualizes data, and autonomously authors scientific manuscripts. Compared to its predecessor (v1, Lu et al., 2024 arXiv:2408.06292), The AI Scientist-v2 eliminates the reliance on human-authored code templates, generalizes effectively across diverse machine learning domains, and leverages a novel progressive agentic tree-search methodology managed by a dedicated experiment manager agent. Additionally, we enhance the AI reviewer component by integrating a Vision-Language Model (VLM) feedback loop for iterative refinement of content and aesthetics of the figures. We evaluated The AI Scientist-v2 by submitting three fully autonomous manuscripts to a peer-reviewed ICLR workshop. Notably, one manuscript achieved high enough scores to exceed the average human acceptance threshold, marking the first instance of a fully AI-generated paper successfully navigating peer review. This accomplishment highlights the growing capability of AI in conducting all aspects of scientific research. We anticipate that further advancements in autonomous scientific discovery technologies will profoundly impact human knowledge generation, enabling unprecedented scalability in research productivity and significantly accelerating scientific breakthroughs, greatly benefiting society at large. We have open-sourced the code at https://github.com/SakanaAI/AI-Scientist-v2 to foster the future development of this transformative technology. We also discuss the role of AI in science, including AI safety.

ai scientist-v2, artificial intelligence, machine learning, (14 more...)

2504.08066

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Gandotra, Nayesha, Veerapaneni, Rishi, Saleem, Muhammad Suhail, Harabor, Daniel, Li, Jiaoyang, Likhachev, Maxim

Anytime Single-Step MAPF Planning with Anytime PIBT

arXiv.org Artificial Intelligence, Apr-11-2025

PIBT is a popular Multi-Agent Path Finding (MAPF) method at the core of many state-of-the-art MAPF methods including LaCAM, CS-PIBT, and WPPL. The main utility of PIBT is that it is a very fast and effective single-step MAPF solver and can return a collision-free single-step solution for hundreds of agents in less than a millisecond. However, the main drawback of PIBT is that it is extremely greedy with respect to its priorities and thus leads to poor solution quality. Additionally, PIBT cannot use all the planning time that might be available to it and returns the first solution it finds. We thus develop Anytime PIBT, which quickly finds a one-step solution identically to PIBT but then continuously improves the solution in an anytime manner. We prove that Anytime PIBT converges to the optimal solution given sufficient time. We experimentally validate that Anytime PIBT can rapidly improve single-step solution quality within milliseconds and even find the optimal single-step action. Interestingly, however, we find that improving the single-step solution quality does not have a significant effect on full-horizon solution costs.

agent, anytime pibt, artificial intelligence, (13 more...)

2504.07841

Country:

Europe (0.68)
North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.31)

Akhlaghi, Vahid Eghbal, Zandehshahvar, Reza, Van Hentenryck, Pascal

PROPEL: Supervised and Reinforcement Learning for Large-Scale Supply Chain Planning

arXiv.org Artificial Intelligence, Apr-11-2025

This paper considers how to fuse Machine Learning (ML) and optimization to solve large-scale Supply Chain Planning (SCP) optimization problems. These problems can be formulated as MIP models which feature both integer (non-binary) and continuous variables, as well as flow balance and capacity constraints. This raises fundamental challenges for existing integrations of ML and optimization that have focused on binary MIPs and graph problems. To address these, the paper proposes PROPEL, a new framework that combines optimization with both supervised and Deep Reinforcement Learning (DRL) to reduce the size of the search space significantly. PROPEL uses supervised learning, not to predict the values of all integer variables, but to identify the variables that are fixed to zero in the optimal solution, leveraging the structure of SCP applications. PROPEL includes a DRL component that selects which fixed-at-zero variables must be relaxed to improve solution quality when the supervised learning step does not produce a solution with the desired optimality tolerance. PROPEL has been applied to industrial supply chain planning optimizations with millions of variables. The computational results show dramatic improvements in solution times and quality, including a 60% reduction in primal integral and an 88% primal gap reduction, and improvement factors of up to 13.57 and 15.92, respectively.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2504.07383

Country:

Europe > Switzerland (0.28)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence, Apr-11-2025

CLEAR: Contrasting Textual Feedback with Experts and Amateurs for Reasoning

Rufail, Andrew, Kim, Daniel, O'Brien, Sean, Zhu, Kevin

We introduce CLEAR (Contrasting Textual Feedback with Experts and Amateurs for Reasoning), a novel approach to language model reasoning that leverages the strengths of a larger (expert) model and a smaller (amateur) model. The expert and amateur models each provide feedback on a model's initial output, and the two sets of feedback are contrasted with each other to produce refined feedback. This feedback is subsequently applied to iteratively improve CLEAR's responses. Our experiments demonstrate that CLEAR outperforms state-of-the-art methods in several challenging reasoning tasks, including story outline improvement (up to 19.6% relative increase in interestingness), constrained generation (up to 18.5% increase in coverage), mathematical reasoning (up to 6.7% improvement in accuracy) and mitigation of toxicity (decrease of up to 22% in toxicity).

large language model, machine learning, natural language, (15 more...)

2504.07116

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)