AITopics | Search

Collaborating Authors

Search

"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives."
– from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).

News Overviews Instructional Materials AI-Alerts Classics

FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization

Nguyen, Son Tung, Fontan, Alejandro, Milford, Michael, Fischer, Tobias

arXiv.org Artificial IntelligenceAug-21-2024

Hierarchical methods represent state-of-the-art visual localization, optimizing search efficiency by using global descriptors to focus on relevant map regions. However, this state-of-the-art performance comes at the cost of substantial memory requirements, as all database images must be stored for feature matching. In contrast, direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space. We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework. This fusion rearranges the local descriptor space such that geographically nearby local descriptors are closer in the feature space according to the global descriptors. Therefore, the number of irrelevant competing descriptors decreases, specifically if they are geographically distant, thereby increasing the likelihood of correctly matching a query descriptor. We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements. Extensive experiments using various state-of-the-art local and global descriptors across four different datasets demonstrate the effectiveness of our approach. For the first time, our approach enables direct matching algorithms to benefit from global descriptors while maintaining memory efficiency. The code for this paper will be published at \href{https://github.com/sontung/descriptor-disambiguation}{github.com/sontung/descriptor-disambiguation}.

comput, descriptor, global descriptor, (15 more...)

arXiv.org Artificial Intelligence

2408.12037

Country:

Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Hardware > Memory (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.35)

Add feedback

Quantifying Behavioural Distance Between Mathematical Expressions

Mežnar, Sebastian, Džeroski, Sašo, Todorovski, Ljupčo

arXiv.org Artificial IntelligenceAug-21-2024

Existing symbolic regression methods organize the space of candidate mathematical expressions primarily based on their syntactic, structural similarity. However, this approach overlooks crucial equivalences between expressions that arise from mathematical symmetries, such as commutativity, associativity, and distribution laws for arithmetic operations. Consequently, expressions with similar errors on a given data set are apart from each other in the search space. This leads to a rough error landscape in the search space that efficient local, gradient-based methods cannot explore. This paper proposes and implements a measure of a behavioral distance, BED, that clusters together expressions with similar errors. The experimental results show that the stochastic method for calculating BED achieves consistency with a modest number of sampled values for evaluating the expressions. This leads to computational efficiency comparable to the tree-based syntactic distance. Our findings also reveal that BED significantly improves the smoothness of the error landscape in the search space for symbolic regression.

behavioural distance, expression, smoothness, (12 more...)

arXiv.org Artificial Intelligence

2408.11515

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction

GX-Chen, Anthony, Marino, Kenneth, Fergus, Rob

arXiv.org Artificial IntelligenceAug-21-2024

In the face of difficult exploration problems in reinforcement learning, we study whether giving an agent an object-centric mapping (describing a set of items and their attributes) allow for more efficient learning. We found this problem is best solved hierarchically by modelling items at a higher level of state abstraction to pixels, and attribute change at a higher level of temporal abstraction to primitive actions. This abstraction simplifies the transition dynamic by making specific future states easier to predict. We make use of this to propose a fully model-based algorithm that learns a discriminative world model, plans to explore efficiently with only a count-based intrinsic reward, and can subsequently plan to reach any discovered (abstract) states. We demonstrate the model's ability to (i) efficiently solve single tasks, (ii) transfer zero-shot and few-shot across item types and environments, and (iii) plan across long horizons. Across a suite of 2D crafting and MiniHack environments, we empirically show our model significantly out-performs state-of-the-art low-level methods (without abstraction), as well as performant model-free and model-based methods using the same abstraction. Finally, we show how to reinforce learn low level object-perturbing policies, as well as supervise learn the object mapping itself.

ab-mdp, inventory, transition, (13 more...)

arXiv.org Artificial Intelligence

2408.11816

Country:

North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search

Light, Jonathan, Cai, Min, Chen, Weiqin, Wang, Guanzhi, Chen, Xiusi, Cheng, Wei, Yue, Yisong, Hu, Ziniu

arXiv.org Artificial IntelligenceAug-20-2024

In this paper, we propose a new method Strategist that utilizes LLMs to acquire new skills for playing multi-agent games through a self-improvement process. Our method gathers quality feedback through self-play simulations with Monte Carlo tree search and LLM-based reflection, which can then be used to learn high-level strategic skills such as how to evaluate states that guide the low-level execution.We showcase how our method can be used in both action planning and dialogue generation in the context of games, achieving good performance on both tasks. Specifically, we demonstrate that our method can help train agents with better performance than both traditional reinforcement learning-based approaches and other LLM-based skill learning approaches in games including the Game of Pure Strategy (GOPS) and The Resistance: Avalon.

llm, player 1, probability, (15 more...)

arXiv.org Artificial Intelligence

2408.10635

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs

Rozada, Sergio, Ding, Dongsheng, Marques, Antonio G., Ribeiro, Alejandro

arXiv.org Artificial IntelligenceAug-19-2024

We study the problem of computing deterministic optimal policies for constrained Markov decision processes (MDPs) with continuous state and action spaces, which are widely encountered in constrained dynamical systems. Designing deterministic policy gradient methods in continuous state and action spaces is particularly challenging due to the lack of enumerable state-action pairs and the adoption of deterministic policies, hindering the application of existing policy gradient methods for constrained MDPs. To this end, we develop a deterministic policy gradient primal-dual method to find an optimal deterministic policy with non-asymptotic convergence. Specifically, we leverage regularization of the Lagrangian of the constrained MDP to propose a deterministic policy gradient primal-dual (D-PGPD) algorithm that updates the deterministic policy via a quadratic-regularized gradient ascent step and the dual variable via a quadratic-regularized gradient descent step. We prove that the primal-dual iterates of D-PGPD converge at a sub-linear rate to an optimal regularized primal-dual pair. We instantiate D-PGPD with function approximation and prove that the primal-dual iterates of D-PGPD converge at a sub-linear rate to an optimal regularized primal-dual pair, up to a function approximation error. Furthermore, we demonstrate the effectiveness of our method in two continuous control problems: robot navigation and fluid control. To the best of our knowledge, this appears to be the first work that proposes a deterministic policy search method for continuous-space constrained MDPs.

deterministic policy, iterate, value function, (14 more...)

arXiv.org Artificial Intelligence

2408.10015

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.34)

Add feedback

Enhancing Reinforcement Learning Through Guided Search

Arjonilla, Jérôme, Saffidine, Abdallah, Cazenave, Tristan

arXiv.org Artificial IntelligenceAug-19-2024

With the aim of improving performance in Markov Decision Problem in an Off-Policy setting, we suggest taking inspiration from what is done in Offline Reinforcement Learning (RL). In Offline RL, it is a common practice during policy learning to maintain proximity to a reference policy to mitigate uncertainty, reduce potential policy errors, and help improve performance. We find ourselves in a different setting, yet it raises questions about whether a similar concept can be applied to enhance performance ie, whether it is possible to find a guiding policy capable of contributing to performance improvement, and how to incorporate it into our RL agent. Our attention is particularly focused on algorithms based on Monte Carlo Tree Search (MCTS) as a guide.MCTS renowned for its state-of-the-art capabilities across various domains, catches our interest due to its ability to converge to equilibrium in single-player and two-player contexts. By harnessing the power of MCTS as a guide for our RL agent, we observed a significant performance improvement, surpassing the outcomes achieved by utilizing each method in isolation. Our experiments were carried out on the Atari 100k benchmark.

agent, algorithm, reinforcement, (16 more...)

arXiv.org Artificial Intelligence

2408.10113

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Brandenburg > Potsdam (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (0.93)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Branch and Bound to Assess Stability of Regression Coefficients in Uncertain Models

Knaeble, Brian, Hughes, R. Mitchell, Rudolph, George, Abramson, Mark A., Razo, Daniel

arXiv.org Artificial IntelligenceAug-18-2024

It can be difficult to interpret a coefficient of an uncertain model. A slope coefficient of a regression model may change as covariates are added or removed from the model. In the context of high-dimensional data, there are too many model extensions to check. However, as we show here, it is possible to efficiently search, with a branch and bound algorithm, for maximum and minimum values of that adjusted slope coefficient over a discrete space of regularized regression models. Here we introduce our algorithm, along with supporting mathematical results, an example application, and a link to our computer code, to help researchers summarize high-dimensional data and assess the stability of regression coefficients in uncertain models.

algorithm, coefficient, covariate, (17 more...)

arXiv.org Artificial Intelligence

2408.09634

Country:

North America > United States > Utah > Utah County > Orem (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.94)

Industry:

Health & Medicine > Consumer Health (1.00)
Education > Health & Safety > School Nutrition (1.00)
Health & Medicine > Therapeutic Area (0.94)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Learning Based Toolpath Planner on Diverse Graphs for 3D Printing

Huang, Yuming, Guo, Yuhu, Su, Renbo, Han, Xingjian, Ding, Junhao, Zhang, Tianyu, Liu, Tao, Wang, Weiming, Fang, Guoxin, Song, Xu, Whiting, Emily, Wang, Charlie C. L.

arXiv.org Artificial IntelligenceAug-17-2024

This paper presents a learning based planner for computing optimized 3D printing toolpaths on prescribed graphs, the challenges of which include the varying graph structures on different models and the large scale of nodes & edges on a graph. We adopt an on-the-fly strategy to tackle these challenges, formulating the planner as a Deep Q-Network (DQN) based optimizer to decide the next `best' node to visit. We construct the state spaces by the Local Search Graph (LSG) centered at different nodes on a graph, which is encoded by a carefully designed algorithm so that LSGs in similar configurations can be identified to re-use the earlier learned DQN priors for accelerating the computation of toolpath planning. Our method can cover different 3D printing applications by defining their corresponding reward functions. Toolpath planning problems in wire-frame printing, continuous fiber printing, and metallic printing are selected to demonstrate its generality. The performance of our planner has been verified by testing the resultant toolpaths in physical experiments. By using our planner, wire-frame models with up to 4.2k struts can be successfully printed, up to 93.3% of sharp turns on continuous fiber toolpaths can be avoided, and the thermal distortion in metallic printing can be reduced by 24.9%.

node, printing, toolpath, (17 more...)

arXiv.org Artificial Intelligence

2408.09198

Country:

Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Asia > China > Hong Kong (0.04)
North America > United States (0.04)

Genre: Research Report (0.82)

Industry: Machinery > Industrial Machinery (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Efficient Task Transfer for HLS DSE

Ding, Zijian, Sohrabizadeh, Atefeh, Li, Weikai, Qin, Zongyue, Sun, Yizhou, Cong, Jason

arXiv.org Artificial IntelligenceAug-16-2024

There have been several recent works proposed to utilize model-based optimization methods to improve the productivity of using high-level synthesis (HLS) to design domain-specific architectures. They would replace the time-consuming performance estimation or simulation of design with a proxy model, and automatically insert pragmas to guide hardware optimizations. In this work, we address the challenges associated with high-level synthesis (HLS) design space exploration (DSE) through the evolving landscape of HLS tools. As these tools develop, the quality of results (QoR) from synthesis can vary significantly, complicating the maintenance of optimal design strategies across different toolchains. We introduce Active-CEM, a task transfer learning scheme that leverages a model-based explorer designed to adapt efficiently to changes in toolchains. This approach optimizes sample efficiency by identifying high-quality design configurations under a new toolchain without requiring extensive re-evaluation. We further refine our methodology by incorporating toolchain-invariant modeling. This allows us to predict QoR changes more accurately despite shifts in the black-box implementation of the toolchains. Experiment results on the HLSyn benchmark transitioning to new toolchain show an average performance improvement of 1.58$\times$ compared to AutoDSE and a 1.2$\times$ improvement over HARP, while also increasing the sample efficiency by 5.26$\times$, and reducing the runtime by 2.7$\times$.

design space, efficiency, toolchain, (17 more...)

arXiv.org Artificial Intelligence

2408.1327

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)

Add feedback

On the Completeness of Conflict-Based Search: Temporally-Relative Duplicate Pruning

Walker, Thayne T, Sturtevant, Nathan R

arXiv.org Artificial IntelligenceAug-16-2024

Conflict-Based Search (CBS) algorithm for the multi-agent pathfinding (MAPF) problem is that it is incomplete for problems which have no solution; if no mitigating procedure is run in parallel, CBS will run forever when given an unsolvable problem instance. In this work, we introduce Temporally-Relative Duplicate Pruning (TRDP), a technique for duplicate detection and removal in both classic and continuous-time MAPF domains. TRDP is a simple procedure which closes the long-standing theoretic loophole of incompleteness for CBS by detecting and avoiding the expansion of duplicate states. TRDP is shown both theoretically and empirically to ensure termination without a significant impact on runtime in the majority of problem instances. In certain cases, TRDP is shown to increase performance significantly

agent, algorithm, constraint, (16 more...)

arXiv.org Artificial Intelligence

2408.09028

Country:

North America > Canada > Alberta (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)

Add feedback