AITopics | Search

Search

"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives."
– from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).
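As a minimal sketch of the idea in the quotation, the following breadth-first search explores a space of problem states until it reaches a goal. The two-jug water puzzle (jugs of capacity 4 and 3, goal of 2 units in the first jug) and the names `breadth_first_search` and `jug_successors` are illustrative assumptions, not taken from Luger's text.

```python
from collections import deque

def breadth_first_search(start, is_goal, successors):
    """Return the shortest list of states from start to a goal, or None."""
    frontier = deque([start])
    parent = {start: None}  # doubles as the visited set
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            # Walk parent links back to the start to rebuild the path.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                frontier.append(nxt)
    return None

def jug_successors(state, cap_a=4, cap_b=3):
    """Alternative next states for the (assumed) two-jug puzzle."""
    a, b = state
    pour_ab = min(a, cap_b - b)
    pour_ba = min(b, cap_a - a)
    return [
        (cap_a, b), (a, cap_b),      # fill either jug
        (0, b), (a, 0),              # empty either jug
        (a - pour_ab, b + pour_ab),  # pour A into B
        (a + pour_ba, b - pour_ba),  # pour B into A
    ]

path = breadth_first_search((0, 0), lambda s: s[0] == 2, jug_successors)
```

Each state is a pair of jug contents, and `successors` enumerates the alternatives reachable in one step; breadth-first order means the first goal state found lies on a shortest path.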

MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation

Liang, Zhenwen, Song, Linfeng, Li, Yang, Yang, Tao, Zhang, Feng, Mi, Haitao, Yu, Dong

arXiv.org Artificial Intelligence, May-19-2025

Automated Theorem Proving (ATP) in formal languages remains a formidable challenge in AI, demanding rigorous logical deduction and navigating vast search spaces. While large language models (LLMs) have shown promising performance, existing stepwise provers often suffer from biased search guidance, leading to inefficiencies and suboptimal proof strategies. This paper introduces the Multi-Perspective Search Prover (MPS-Prover), a novel stepwise ATP system designed to overcome these limitations. MPS-Prover incorporates two key innovations: a highly effective post-training data curation strategy that prunes approximately 40% of redundant training data without sacrificing performance, and a multi-perspective tree search mechanism. This search integrates a learned critic model with strategically designed heuristic rules to diversify tactic selection, prevent getting trapped in unproductive states, and enhance search robustness. Extensive evaluations demonstrate that MPS-Prover achieves state-of-the-art performance on multiple challenging benchmarks, including miniF2F and ProofNet, outperforming prior 7B parameter models. Furthermore, our analyses reveal that MPS-Prover generates significantly shorter and more diverse proofs compared to existing stepwise and whole-proof methods, highlighting its efficiency and efficacy. Our work advances the capabilities of LLM-based formal reasoning and offers a robust framework and a comprehensive analysis for developing more powerful theorem provers.

large language model, logic & formal reasoning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.10962

Country:

Europe > Italy (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Deep Symbolic Optimization: Reinforcement Learning for Symbolic Mathematics

Hayes, Conor F., Da Silva, Felipe Leno, Yang, Jiachen, Mundhenk, T. Nathan, Lee, Chak Shing, Pettit, Jacob F., Santiago, Claudio, Kim, Sookyung, Kim, Joanne T., Solis, Ignacio Aravena, Glatt, Ruben, Goncalves, Andre R., Ladd, Alexander, Solak, Ahmet Can, Desautels, Thomas, Faissol, Daniel, Petersen, Brenden K., Landajuela, Mikel

arXiv.org Artificial Intelligence, May-19-2025

Deep Symbolic Optimization (DSO) is a novel computational framework that enables symbolic optimization for scientific discovery, particularly in applications involving the search for intricate symbolic structures. One notable example is equation discovery, which aims to automatically derive mathematical models expressed in symbolic form. In DSO, the discovery process is formulated as a sequential decision-making task. A generative neural network learns a probabilistic model over a vast space of candidate symbolic expressions, while reinforcement learning strategies guide the search toward the most promising regions. This approach integrates gradient-based optimization with evolutionary and local search techniques, and it incorporates in-situ constraints, domain-specific priors, and advanced policy optimization methods. The result is a robust framework capable of efficiently exploring extensive search spaces to identify interpretable and physically meaningful models. Extensive evaluations on benchmark problems have demonstrated that DSO achieves state-of-the-art performance in both accuracy and interpretability. In this chapter, we provide a comprehensive overview of the DSO framework and illustrate its transformative potential for automating symbolic optimization in scientific discovery.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2505.10762

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Austria (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Minimax learning rates for estimating binary classifiers under margin conditions

García, Jonathan, Petersen, Philipp

arXiv.org Machine Learning, May-19-2025

We study classification problems using binary estimators where the decision boundary is described by horizon functions and where the data distribution satisfies a geometric margin condition. We establish upper and lower bounds for the minimax learning rate over broad function classes with bounded Kolmogorov entropy in Lebesgue norms. A key novelty of our work is the derivation of lower bounds on the worst-case learning rates under a geometric margin condition -- a setting that is almost universally satisfied in practice but remains theoretically challenging. Moreover, our results deal with the noiseless setting, where lower bounds are particularly hard to establish. We apply our general results to classification problems with decision boundaries belonging to several function classes: for Barron-regular functions, and for Hölder-continuous functions with strong margins, we identify optimal rates close to the fast learning rates of $\mathcal{O}(n^{-1})$ for $n \in \mathbb{N}$ samples. For merely convex decision boundaries, optimal rates near $\mathcal{O}(n^{-1/2})$ can also be achieved in the strong-margin case.

artificial intelligence, machine learning, margin condition, (17 more...)

arXiv.org Machine Learning

2505.10628

Country:

Europe > Austria > Vienna (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.61)

$XX^{t}$ Can Be Faster

Rybin, Dmitry, Zhang, Yushun, Luo, Zhi-Quan

arXiv.org Artificial Intelligence, May-19-2025

We present RXTX, a new algorithm for computing the product of a matrix by its transpose $XX^{t}$ for $X\in \mathbb{R}^{n\times m}$. RXTX uses $5\%$ fewer multiplications and $5\%$ fewer operations (additions and multiplications) than state-of-the-art algorithms. Note that the acceleration holds not only asymptotically for large matrices with $n \rightarrow \infty$, but also for small matrices, including $n = 4$. The algorithm was discovered by combining machine-learning-based search methods with combinatorial optimization.
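For context on why $XX^{t}$ admits savings over a general matrix product, here is a minimal sketch of the elementary symmetry trick: since $XX^{t}$ is symmetric, only the upper triangle needs computing. This is not the RXTX algorithm, which the abstract does not specify; the function name `gram_upper` and the multiplication counter are assumptions made for illustration.

```python
def gram_upper(X):
    """Compute G = X X^T for a list-of-lists matrix X using symmetry.

    Returns (G, mults), where mults counts scalar multiplications.
    Plain baseline for illustration only; NOT the RXTX algorithm.
    """
    n, m = len(X), len(X[0])
    G = [[0.0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(i, n):  # upper triangle, including the diagonal
            s = 0.0
            for k in range(m):
                s += X[i][k] * X[j][k]
                mults += 1
            G[i][j] = s
            G[j][i] = s  # mirror into the lower triangle
    return G, mults

G, mults = gram_upper([[1, 2], [3, 4], [5, 6]])
```

This baseline uses $n(n+1)m/2$ scalar multiplications instead of the $n^2 m$ of a naive full product.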

machine learning, multiplication, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2505.09814

Country: Asia > China (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Flexible Graph Similarity Computation With A Proactive Optimization Strategy

Liu, Zhouyang, Liu, Ning, Chen, Yixin, He, Jiezhong, Li, Dongsheng

arXiv.org Artificial Intelligence, May-16-2025

Graph Edit Distance (GED) offers a principled and flexible measure of graph similarity, as it quantifies the minimum cost needed to transform one graph into another with customizable edit operation costs. Despite recent learning-based efforts to approximate GED via vector space representations, existing methods struggle with adapting to varying operation costs. Furthermore, they suffer from inefficient, reactive mapping refinements due to reliance on isolated node-level distance as guidance. To address these issues, we propose GEN, a novel learning-based approach for flexible GED approximation. GEN addresses the varying costs adaptation by integrating operation costs prior to match establishment, enabling mappings to dynamically adapt to cost variations. Furthermore, GEN introduces a proactive guidance optimization strategy that captures graph-level dependencies between matches, allowing informed matching decisions in a single step without costly iterative refinements. Extensive evaluations on real-world and synthetic datasets demonstrate that GEN achieves up to 37.8% reduction in GED approximation error and 72.7% reduction in inference time compared with state-of-the-art methods, while consistently maintaining robustness under diverse cost settings and graph sizes.
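To make the definition of GED with customizable operation costs concrete, the sketch below computes it exactly by brute force for very small, unlabeled, undirected graphs. It is not GEN, which is a learned approximation; the function name `graph_edit_distance`, its cost parameters, and the restriction to unlabeled nodes are assumptions made for illustration.

```python
from itertools import permutations

def graph_edit_distance(n1, edges1, n2, edges2,
                        node_ins=1.0, node_del=1.0,
                        edge_ins=1.0, edge_del=1.0):
    """Exact GED from graph 1 to graph 2 by enumerating node mappings.

    Nodes are 0..n-1 and unlabeled; edges are undirected pairs.
    Exponential time: only suitable for a handful of nodes.
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    size = n1 + n2  # pad both graphs with dummy nodes
    best = float("inf")
    for perm in permutations(range(size)):
        cost = 0.0
        # Node costs: real -> dummy is a deletion, dummy -> real an insertion.
        for i in range(size):
            real1, real2 = i < n1, perm[i] < n2
            if real1 and not real2:
                cost += node_del
            elif real2 and not real1:
                cost += node_ins
        # Edge costs: compare every node pair under the mapping.
        for i in range(size):
            for j in range(i + 1, size):
                has1 = frozenset((i, j)) in e1
                has2 = frozenset((perm[i], perm[j])) in e2
                if has1 and not has2:
                    cost += edge_del
                elif has2 and not has1:
                    cost += edge_ins
        best = min(best, cost)
    return best

path_edges = [(0, 1), (1, 2)]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
unit_cost = graph_edit_distance(3, path_edges, 3, triangle_edges)
pricey_insert = graph_edit_distance(3, path_edges, 3, triangle_edges, edge_ins=2.5)
```

Changing a single cost parameter changes the distance, and with other pairs of graphs it can also change which mapping is optimal. This cost sensitivity is what the abstract says learned approximations struggle to adapt to.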

artificial intelligence, machine learning, pattern recognition, (18 more...)

arXiv.org Artificial Intelligence

2504.06533

Country:

Asia > China (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Study and improvement of search algorithms in two-players perfect information games

Cohen-Solal, Quentin

arXiv.org Artificial Intelligence, May-16-2025

Search algorithms in games are artificial intelligence methods for playing such games. Unfortunately, there is no study on these algorithms that evaluates the generality of their performance. We propose to address this gap in the case of two-player zero-sum games with perfect information. Furthermore, we propose a new search algorithm and we show that, for a short search time, it outperforms all studied algorithms on all games in this large experiment and that, for a medium search time, it outperforms all studied algorithms on 17 of the 22 studied games.

1. Introduction

Games have numerous applications, far beyond the obvious ones (the video game and board game industries) and the slightly less obvious ones (economics, defense, and also education through serious games). In fact, all computational problems can naturally be reformulated in terms of games. Game search algorithms are therefore general-purpose artificial intelligence techniques for problem-solving.
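For readers unfamiliar with game search, a minimal sketch of a textbook baseline for two-player zero-sum games with perfect information is negamax with alpha-beta pruning. It is shown here on a toy subtraction game (remove 1 to 3 stones; whoever takes the last stone wins). The game and the names `negamax` and `best_move` are illustrative assumptions; this is not the new algorithm proposed in the paper.

```python
def negamax(stones, alpha=-1, beta=1):
    """Value for the player to move: +1 is a forced win, -1 a forced loss."""
    if stones == 0:
        return -1  # the opponent just took the last stone and won
    best = -1
    for take in (1, 2, 3):
        if take > stones:
            break
        # The opponent's value, negated, is our value (zero-sum property).
        value = -negamax(stones - take, -beta, -alpha)
        best = max(best, value)
        alpha = max(alpha, value)
        if alpha >= beta:
            break  # cutoff: no remaining move can improve on this one
    return best

def best_move(stones):
    """Return a winning number of stones to take, or None if none exists."""
    for take in (1, 2, 3):
        if take <= stones and negamax(stones - take) == -1:
            return take
    return None

move_from_10 = best_move(10)
```

In this toy game the search reproduces the known result that piles whose size is a multiple of 4 are lost for the player to move.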

algorithm, artificial intelligence, search algorithm, (14 more...)

arXiv.org Artificial Intelligence

2505.09639

Country: Europe (0.27)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games > Chess (0.94)
Leisure & Entertainment > Games > Computer Games (0.86)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Efficiently Manipulating Clutter via Learning and Search-Based Reasoning

Huang, Baichuan

arXiv.org Artificial Intelligence, May-15-2025

This thesis presents novel algorithms to advance robotic object rearrangement, a critical task for autonomous systems in applications like warehouse automation and household assistance. Addressing challenges of high-dimensional planning, complex object interactions, and computational demands, our work integrates deep learning for interaction prediction, tree search for action sequencing, and parallelized computation for efficiency. Key contributions include the Deep Interaction Prediction Network (DIPN) for accurate push motion forecasting (over 90% accuracy), its synergistic integration with Monte Carlo Tree Search (MCTS) for effective non-prehensile object retrieval (100% completion in specific challenging scenarios), and the Parallel MCTS with Batched Simulations (PMBS) framework, which achieves substantial planning speed-up while maintaining or improving solution quality. The research further explores combining diverse manipulation primitives, validated extensively through simulated and real-world experiments.

machine learning, reinforcement learning, simulation, (19 more...)

arXiv.org Artificial Intelligence

2505.08853

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment > Games (1.00)
Health & Medicine (1.00)
Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Monte Carlo Beam Search for Actor-Critic Reinforcement Learning in Continuous Control

Alzorgan, Hazim, Razi, Abolfazl

arXiv.org Artificial Intelligence, May-15-2025

Actor-critic methods, like Twin Delayed Deep Deterministic Policy Gradient (TD3), depend on basic noise-based exploration, which can result in less than optimal policy convergence. In this study, we introduce Monte Carlo Beam Search (MCBS), a new hybrid method that combines beam search and Monte Carlo rollouts with TD3 to improve exploration and action selection. MCBS produces several candidate actions around the policy's output and assesses them through short-horizon rollouts, enabling the agent to make better-informed choices. We test MCBS across various continuous-control benchmarks, including HalfCheetah-v4, Walker2d-v5, and Swimmer-v5, showing enhanced sample efficiency and performance compared to standard TD3 and other baseline methods like SAC, PPO, and A2C. Our findings emphasize MCBS's capability to enhance policy learning through structured look-ahead search while ensuring computational efficiency. Additionally, we offer a detailed analysis of crucial hyperparameters, such as beam width and rollout depth, and explore adaptive strategies to optimize MCBS for complex control tasks. Our method shows a higher convergence rate across different environments compared to TD3, SAC, PPO, and A2C. For instance, we achieved 90% of the maximum achievable reward within around 200 thousand timesteps compared to 400 thousand timesteps for the second-best method.
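The action-selection step described in the abstract (candidates sampled around the policy's output, each scored by a short-horizon rollout) might look roughly like the sketch below. Everything concrete here is an assumption made for illustration: the 1-D toy dynamics in `step`, the stand-in `policy`, the Gaussian perturbation, and the parameter names `beam_width` and `noise_std`. The paper's TD3 integration, critic usage, and multi-step beam are not reproduced.

```python
import random

def clip(a, lo=-1.0, hi=1.0):
    return max(lo, min(hi, a))

def step(x, a):
    """Assumed toy dynamics: move by a; reward is negative distance to 0."""
    x_next = x + clip(a)
    return x_next, -abs(x_next)

def policy(x):
    """Stand-in for a trained actor: a deliberately weak controller."""
    return clip(-0.2 * x)

def rollout_return(x, first_action, horizon=5, gamma=0.99):
    """Discounted return of taking first_action, then following the policy."""
    total, discount, a = 0.0, 1.0, first_action
    for _ in range(horizon):
        x, r = step(x, a)
        total += discount * r
        discount *= gamma
        a = policy(x)
    return total

def select_action(x, beam_width=8, noise_std=0.5, rng=None):
    """Pick the best of several noisy candidates around the policy output."""
    rng = rng or random.Random(0)
    base = policy(x)
    candidates = [base] + [
        clip(base + rng.gauss(0.0, noise_std)) for _ in range(beam_width - 1)
    ]
    return max(candidates, key=lambda a: rollout_return(x, a))

chosen = select_action(3.0)
```

Because the unperturbed policy action is always among the candidates, the selected action never scores worse than the policy's own choice under the rollout estimate.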

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2505.09029

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Preference Optimization for Combinatorial Optimization Problems

Pan, Mingjun, Lin, Guanquan, Luo, You-Wei, Zhu, Bin, Dai, Zhien, Sun, Lijun, Yuan, Chun

arXiv.org Artificial Intelligence, May-14-2025

Reinforcement Learning (RL) has emerged as a powerful tool for neural combinatorial optimization, enabling models to learn heuristics that solve complex problems without requiring expert knowledge. Despite significant progress, existing RL approaches face challenges such as diminishing reward signals and inefficient exploration in vast combinatorial action spaces, leading to inefficiency. In this paper, we propose Preference Optimization, a novel method that transforms quantitative reward signals into qualitative preference signals via statistical comparison modeling, emphasizing the superiority among sampled solutions. Methodologically, by reparameterizing the reward function in terms of policy and utilizing preference models, we formulate an entropy-regularized RL objective that aligns the policy directly with preferences while avoiding intractable computations. Furthermore, we integrate local search techniques into the fine-tuning rather than post-processing to generate high-quality preference pairs, helping the policy escape local optima. Empirical results on various benchmarks, such as the Traveling Salesman Problem (TSP), the Capacitated Vehicle Routing Problem (CVRP) and the Flexible Flow Shop Problem (FFSP), demonstrate that our method significantly outperforms existing RL algorithms, achieving superior convergence efficiency and solution quality.
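One plausible reading of turning quantitative rewards into qualitative preferences is a Bradley-Terry style pairwise loss over sampled solutions, sketched below. The abstract does not give the exact objective, so the function name `pairwise_preference_loss`, the scaling factor `alpha`, and the use of policy log-probabilities as the reparameterized reward are all assumptions.

```python
import math

def pairwise_preference_loss(log_probs, rewards, alpha=1.0):
    """Average -log sigmoid(alpha * (logp_i - logp_j)) over pairs where
    solution i has strictly higher reward than solution j.

    Only the ordering of rewards matters, not their magnitudes.
    """
    pairs, loss = 0, 0.0
    for i in range(len(rewards)):
        for j in range(len(rewards)):
            if rewards[i] > rewards[j]:
                margin = alpha * (log_probs[i] - log_probs[j])
                # Numerically stable softplus(-margin) = -log(sigmoid(margin)).
                loss += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
                pairs += 1
    return loss / pairs if pairs else 0.0

# Rewards could be negative tour lengths for TSP samples (hypothetical values).
rewards = [-10.2, -11.5, -9.8]
agree = pairwise_preference_loss([-1.0, -2.0, -0.5], rewards)
disagree = pairwise_preference_loss([-2.0, -1.0, -3.0], rewards)
```

The loss is lower when the policy assigns higher log-probability to the better solutions, and it is unchanged if all rewards are rescaled by a positive constant. That invariance is one way to sidestep the diminishing reward signals the abstract mentions.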

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2505.08735

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
North America > Canada (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Transportation (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Mixed-Integer Optimization for Responsible Machine Learning

Justin, Nathan, Sun, Qingshi, Gómez, Andrés, Vayanos, Phebe

arXiv.org Machine Learning, May-12-2025

In the last few decades, Machine Learning (ML) has achieved significant success across domains ranging from healthcare, sustainability, and the social sciences, to criminal justice and finance. But its deployment in increasingly sophisticated, critical, and sensitive areas affecting individuals, the groups they belong to, and society as a whole raises critical concerns around fairness, transparency, robustness, and privacy, among others. As the complexity and scale of ML systems and of the settings in which they are deployed grow, so does the need for responsible ML methods that address these challenges while providing guaranteed performance in deployment. Mixed-integer optimization (MIO) offers a powerful framework for embedding responsible ML considerations directly into the learning process while maintaining performance. For example, it enables learning of inherently transparent models that can conveniently incorporate fairness or other domain specific constraints. This tutorial paper provides an accessible and comprehensive introduction to this topic discussing both theoretical and practical aspects. It outlines some of the core principles of responsible ML, their importance in applications, and the practical utility of MIO for building ML models that align with these principles. Through examples and mathematical formulations, it illustrates practical strategies and available tools for efficiently solving MIO problems for responsible ML. It concludes with a discussion on current limitations and open research questions, providing suggestions for future work.

artificial intelligence, machine learning, mixed-integer optimization, (14 more...)

arXiv.org Machine Learning

2505.05857

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Transportation (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
