AITopics

2502.00767

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Feb-1-2025

Lipschitz Lifelong Monte Carlo Tree Search for Mastering Non-Stationary Tasks

Zhang, Zuyuan, Lan, Tian

Monte Carlo Tree Search (MCTS) has proven highly effective in solving complex planning tasks by balancing exploration and exploitation using Upper Confidence Bound for Trees (UCT). However, existing work has not considered MCTS-based lifelong planning, where an agent faces a non-stationary series of tasks -- e.g., with varying transition probabilities and rewards -- that are drawn sequentially throughout the operational lifetime. This paper presents LiZero for Lipschitz lifelong planning using MCTS. We propose a novel concept of adaptive UCT (aUCT) to transfer knowledge from a source task to the exploration/exploitation of a new task, depending on both the Lipschitz continuity between tasks and the confidence of knowledge in Monte Carlo action sampling. We analyze LiZero's acceleration factor in terms of improved sampling efficiency and also develop efficient algorithms to compute aUCT in an online fashion by both data-driven and model-based approaches, whose sampling complexity and error bounds are also characterized. Experimental results show that LiZero significantly outperforms existing MCTS and lifelong learning baselines, converging to optimal rewards much faster (3$\sim$4x). Our results highlight the potential of LiZero to advance decision-making and planning in dynamic real-world environments.
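For readers unfamiliar with the UCT rule the abstract builds on, the following is a minimal sketch of the standard UCB1-style child selection used in plain MCTS. It is not the paper's aUCT; the function names, the `(mean_value, visits)` tuple layout, and the default exploration constant `c = sqrt(2)` are assumptions made for illustration.

```python
import math

def uct_score(mean_value, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCT score: exploitation term plus exploration bonus.

    This is the textbook rule, not the adaptive aUCT from the abstract.
    """
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    return mean_value + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, c=math.sqrt(2)):
    """children: list of (mean_value, visits); returns the index to expand."""
    parent_visits = sum(visits for _, visits in children)
    scores = [uct_score(q, n, parent_visits, c) for q, n in children]
    return max(range(len(children)), key=scores.__getitem__)
```

With `c=0` the rule is purely greedy on the mean value; larger `c` shifts selection toward rarely visited children.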

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2502.00633

Country: North America > United States (0.17)

Genre: Research Report > New Finding (0.68)

Industry:

Energy > Oil & Gas > Upstream (0.54)
Leisure & Entertainment > Games (0.46)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

Liu, Manqing, Beam, Andrew L.

Doubly Robust Monte Carlo Tree Search

arXiv.org Machine Learning Feb-1-2025

We present Doubly Robust Monte Carlo Tree Search (DR-MCTS), a novel algorithm that integrates Doubly Robust (DR) off-policy estimation into Monte Carlo Tree Search (MCTS) to enhance sample efficiency and decision quality in complex environments. Our approach introduces a hybrid estimator that combines MCTS rollouts with DR estimation, offering theoretical guarantees of unbiasedness and variance reduction under specified conditions. Empirical evaluations in Tic-Tac-Toe and the partially observable VirtualHome environment demonstrate DR-MCTS's superior performance over standard MCTS. In Tic-Tac-Toe, DR-MCTS achieves an 88% win rate compared to a 10% win rate for standard MCTS. In compound VirtualHome tasks, DR-MCTS attains a 20.7% success rate versus 10.3% for standard MCTS. Our scaling analysis reveals that DR-MCTS exhibits better sample efficiency, notably outperforming standard MCTS with larger language models while using a smaller model. These results underscore DR-MCTS's potential for efficient decision-making in complex, real-world scenarios where sample efficiency is paramount.
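As background for the doubly robust idea, here is a minimal sketch of the textbook one-step (bandit-style) DR off-policy estimator: a model-based baseline plus an importance-weighted correction on the observed reward. It is not the hybrid MCTS estimator described in the abstract; the function name, the dictionary-based policy and value model, and the sample layout are all assumptions for illustration.

```python
def dr_estimate(samples, target_policy, q_hat):
    """One-step doubly robust value estimate.

    samples: list of (action, reward, behavior_prob) logged under a behavior policy
    target_policy: dict action -> probability under the policy being evaluated
    q_hat: dict action -> estimated reward (the value model)
    """
    # Model-based baseline: expected model reward under the target policy.
    baseline = sum(target_policy[a] * q_hat[a] for a in target_policy)
    total = 0.0
    for action, reward, behavior_prob in samples:
        rho = target_policy[action] / behavior_prob  # importance weight
        # Correct the baseline with the weighted model residual.
        total += baseline + rho * (reward - q_hat[action])
    return total / len(samples)
```

When `q_hat` is exact the correction term vanishes, and when `q_hat` is all zeros the estimate reduces to plain importance sampling; the estimator stays unbiased if either the value model or the behavior probabilities are correct.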

artificial intelligence, estimator, planning & scheduling, (15 more...)

arXiv.org Machine Learning

2502.01672

Country: North America > United States (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment > Games > Tic-Tac-Toe (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Cheng, Richard, Papozov, Chavdar, Helmick, Dan, Tjersland, Mark

A Direct Semi-Exhaustive Search Method for Robust, Partial-to-Full Point Cloud Registration

Point cloud registration refers to the problem of finding the rigid transformation that aligns two given point clouds, and is crucial for many applications in robotics and computer vision. The main insight of this paper is that we can directly optimize the point cloud registration problem without correspondences by utilizing an algorithmically simple, yet computationally complex, semi-exhaustive search approach that is very well-suited for parallelization on modern GPUs. Our proposed algorithm, Direct Semi-Exhaustive Search (DSES), iterates over potential rotation matrices and efficiently computes the inlier-maximizing translation associated with each rotation. It then computes the optimal rigid transformation based on any desired distance metric by directly computing the error associated with each transformation candidate $\{R, t\}$. By leveraging the parallelism of modern GPUs, DSES outperforms state-of-the-art methods for partial-to-full point cloud registration on the simulated ModelNet40 benchmark and demonstrates high performance and robustness for pose estimation on a real-world robotics problem (https://youtu.be/q0q2-s2KSuA).
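The search structure described above (enumerate rotations, find an inlier-maximizing translation for each, keep the best candidate) can be illustrated with a small 2D toy. This is a sketch under strong assumptions, not the DSES implementation: it works in the plane rather than 3D, runs on the CPU, assumes the first source point has a match in the target, and proposes translations by pairing that point with every target point. All names and parameters are hypothetical.

```python
import math

def rotate(p, theta):
    """Rotate a 2D point counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def count_inliers(moved, target, tol):
    """Number of moved points lying within tol of some target point."""
    return sum(1 for p in moved if any(math.dist(p, q) <= tol for q in target))

def register_2d(source, target, n_angles=72, tol=1e-6):
    """Return (inliers, theta, translation) with the highest inlier count."""
    best = (-1, 0.0, (0.0, 0.0))
    for k in range(n_angles):  # semi-exhaustive sweep over rotation candidates
        theta = 2 * math.pi * k / n_angles
        rotated = [rotate(p, theta) for p in source]
        anchor = rotated[0]
        for q in target:  # candidate translations that map the anchor onto q
            t = (q[0] - anchor[0], q[1] - anchor[1])
            moved = [(p[0] + t[0], p[1] + t[1]) for p in rotated]
            inliers = count_inliers(moved, target, tol)
            if inliers > best[0]:
                best = (inliers, theta, t)
    return best
```

Each rotation candidate is evaluated independently of the others, which is the property that makes this style of search amenable to GPU parallelization.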

artificial intelligence, cloud registration, registration, (16 more...)

doi: 10.1109/IROS58592.2024.10801518

2502.00115

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

de Mathelin, Antoine, Cecchi, Nicolas Enrique, Deheeger, François, Mougeot, Mathilde, Vayatis, Nicolas

OneBatchPAM: A Fast and Frugal K-Medoids Algorithm

This paper proposes a novel k-medoids approximation algorithm to handle large-scale datasets with reasonable computational time and memory complexity. We develop a local-search algorithm that iteratively improves the medoid selection based on the estimation of the k-medoids objective. A single batch of size m << n provides the estimation, which reduces the required memory size and the number of pairwise dissimilarity computations to O(mn), compared with O(n^2) for most k-medoids baselines. We obtain theoretical results showing that a batch of size m = O(log(n)) is sufficient to guarantee, with high probability, the same performance as the original local-search algorithm. Multiple experiments conducted on real datasets of various sizes and dimensions show that our algorithm performs similarly to state-of-the-art methods such as FasterPAM and BanditPAM++ with a drastically reduced running time.
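To make the O(mn) idea concrete, the sketch below precomputes dissimilarities between all n points and a single random batch of m points, then runs a simple swap-based local search on the batch estimate of the k-medoids objective. It is an illustrative toy, not the OneBatchPAM algorithm itself; the function names, the uniform batch sampling, and the exhaustive swap loop are assumptions.

```python
import random

def batch_distances(points, batch_idx, dist):
    """n x m table: dissimilarity of every point to each batch point."""
    return [[dist(p, points[j]) for j in batch_idx] for p in points]

def estimated_objective(D, medoids):
    """Mean distance from each batch point to its nearest medoid."""
    m = len(D[0])
    return sum(min(D[i][j] for i in medoids) for j in range(m)) / m

def local_search(points, k, m, dist, seed=0):
    """Swap-based local search evaluated only on the batch estimate."""
    rng = random.Random(seed)
    n = len(points)
    batch_idx = rng.sample(range(n), m)           # the single batch
    D = batch_distances(points, batch_idx, dist)  # O(mn) dissimilarities
    medoids = rng.sample(range(n), k)
    best = estimated_objective(D, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = estimated_objective(D, trial)
                if cost < best:  # accept any strictly improving swap
                    medoids, best, improved = trial, cost, True
    return medoids, best
```

For example, `local_search([0, 1, 2, 10, 11, 12], k=2, m=6, dist=lambda a, b: abs(a - b))` uses the full dataset as the batch, so the estimate equals the exact objective; a smaller `m` trades accuracy of the estimate for fewer distance computations.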

algorithm, artificial intelligence, onebatchpam, (15 more...)

2501.19285

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search

Luo, Haoran, E, Haihong, Guo, Yikai, Lin, Qika, Wu, Xiaobao, Mu, Xinyu, Liu, Wenhao, Song, Meina, Zhu, Yifan, Tuan, Luu Anh

Knowledge Base Question Answering (KBQA) aims to answer natural language questions with a large-scale structured knowledge base (KB). Despite advancements with large language models (LLMs), KBQA still faces challenges in weak KB awareness, imbalance between effectiveness and efficiency, and high reliance on annotated data. To address these challenges, we propose KBQA-o1, a novel agentic KBQA method with Monte Carlo Tree Search (MCTS). It introduces a ReAct-based agent process for stepwise logical form generation with KB environment exploration. Moreover, it employs MCTS, a heuristic search method driven by policy and reward models, to balance agentic exploration's performance and search space. With heuristic exploration, KBQA-o1 generates high-quality annotations for further improvement by incremental fine-tuning. Experimental results show that KBQA-o1 outperforms previous low-resource KBQA methods with limited annotated data, boosting the Llama-3.1-8B model's GrailQA F1 performance to 78.5%, compared to 48.5% for the previous state-of-the-art method with GPT-3.5-turbo.

large language model, machine learning, natural language, (20 more...)

2501.18922

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > Canada > Ontario > Toronto (0.04)
(11 more...)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Node Classification and Search on the Rubik's Cube Graph with GNNs

Barro, Alessandro

This study focuses on the application of deep geometric models to solve the 3x3x3 Rubik's Cube. We begin by discussing the cube's graph representation and defining distance as the model's optimization objective. The distance approximation task is reformulated as a node classification problem, effectively addressed using Graph Neural Networks (GNNs). After training the model on a random subgraph, the predicted classes are used to construct a heuristic for $A^*$ search. We conclude with experiments comparing our heuristic to that of the DeepCubeA model.
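The last step of that pipeline, plugging a predicted distance into A* as the heuristic, can be sketched generically. The code below is a standard unit-cost A* that accepts any heuristic callable; in the paper's setting that callable would wrap the GNN's predicted distance class, which is not reproduced here. The function signature and the toy ring graph in the usage note are assumptions.

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    """Unit-cost A*; returns the path from start to goal, or None.

    heuristic(node) stands in for a learned distance estimate. If it
    overestimates, the returned path is not guaranteed to be shortest.
    """
    frontier = [(heuristic(start), 0, start)]  # entries are (f, g, node)
    best_cost = {start: 0}
    parent = {start: None}
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to start
                path.append(node)
                node = parent[node]
            return path[::-1]
        if cost > best_cost[node]:
            continue  # stale queue entry superseded by a cheaper one
        for nxt in neighbors(node):
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None
```

As a toy usage, take states `0..11` on a ring with moves to `(n - 1) % 12` and `(n + 1) % 12`, goal `0`, and heuristic `min(n, 12 - n)`; `a_star(5, 0, ...)` then walks directly down to the goal.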

artificial intelligence, machine learning, rubik, (14 more...)

2501.1858

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Rubik's Cube (0.76)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.90)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.76)

Bogensperger, Lea, Narnhofer, Dominik, Allam, Ahmed, Schindler, Konrad, Krauthammer, Michael

A Variational Perspective on Generative Protein Fitness Optimization

The goal of protein fitness optimization is to discover new protein variants with enhanced fitness for a given use. The vast search space and the sparsely populated fitness landscape, along with the discrete nature of protein sequences, pose significant challenges when trying to determine the gradient towards configurations with higher fitness. We introduce Variational Latent Generative Protein Optimization (VLGPO), a variational perspective on fitness optimization. Our method embeds protein sequences in a continuous latent space to enable efficient sampling from the fitness distribution and combines a (learned) flow matching prior over sequence mutations with a fitness predictor to guide optimization towards sequences with high fitness. VLGPO achieves state-of-the-art results on two different protein benchmarks of varying complexity. Moreover, the variational design with explicit prior and likelihood functions offers a flexible plug-and-play framework that can be easily customized to suit various protein design tasks.

artificial intelligence, machine learning, sequence, (16 more...)

2501.192

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Scaling Flaws of Verifier-Guided Search in Mathematical Reasoning

Yu, Fei, Li, Yingru, Wang, Benyou

Large language models (LLMs) struggle with multi-step reasoning, where inference-time scaling has emerged as a promising strategy for performance improvement. Verifier-guided search outperforms repeated sampling when sample size is limited by selecting and prioritizing valid reasoning paths. However, we identify a critical limitation: scaling flaws, prevalent across different models (Mistral 7B and DeepSeekMath 7B), benchmarks (GSM8K and MATH), and verifiers (outcome value models and process reward models). As sample size increases, verifier-guided search exhibits diminishing advantages and eventually underperforms repeated sampling. Our analysis attributes this to verifier failures, where imperfect verifiers misrank candidates and erroneously prune all valid paths. These issues are further exacerbated in challenging and out-of-distribution problems, restricting search effectiveness. To mitigate verifier failures, we explore reducing reliance on verifiers and conduct preliminary investigations using two simple methods. Our findings reveal fundamental limitations in verifier-guided search and suggest future directions.

large language model, natural language, selection, (16 more...)

2502.00271

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(6 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)

Robot localization aided by quantum algorithms

Antero, Unai, Sierra, Basilio, Oñativia, Jon, Ruiz, Alejandra, Osaba, Eneko

Localization is a vital aspect of mobile robotics, enabling robots to navigate their environment efficiently and avoid obstacles. Without localization, mobile robots would be unable to determine their position and orientation, making it challenging to plan a path or make informed decisions about their movement (Olson [2000]). Localization allows mobile robots to create an internal map of their environment, which is essential for tasks such as surveying, manipulation, inspection, and delivery (Huang and Lin [2023]). In fact, localization is what enables mobile robots to perform tasks autonomously, making informed decisions about their actions and movements without human intervention. The quality of localization is heavily dependent on the generation of accurate maps, which is a computationally intensive task. Probabilistic localization methods, such as the Adaptive Monte Carlo Localization (AMCL) algorithm, have been widely used in mobile robotics due to their accuracy and robustness (Kristensen and Jensfelt [2003]). However, these methods can be computationally demanding, especially when dealing with large maps or high-resolution sensor data. AMCL, in particular, uses a combination of sensor data and prior map knowledge to determine the probable location of a robot on a given map, but its computational complexity is proportional to the area of the map grid (Alshikh Khalil and Hatem [2022]). Recently, the integration of light detection and ranging (LiDAR) sensors has improved the accuracy of localization methods, but the computational requirements remain a challenge (Huang and Lin [2023]).

algorithm, artificial intelligence, localization, (15 more...)

2502.00077

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Spain > Basque Country (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.95)