AITopics | Search

Collaborating Authors

Search

"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives."
– from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).

News Overviews Instructional Materials AI-Alerts Classics

Neural-guided, Bidirectional Program Search for Abstraction and Reasoning

Alford, Simon, Gandhi, Anshula, Rangamani, Akshay, Banburski, Andrzej, Wang, Tony, Dandekar, Sylee, Chin, John, Poggio, Tomaso, Chin, Peter

arXiv.org Artificial IntelligenceOct-26-2021

One of the challenges facing artificial intelligence research today is designing systems capable of utilizing systematic reasoning to generalize to new tasks. The Abstraction and Reasoning Corpus (ARC) measures such a capability through a set of visual reasoning tasks. In this paper we report incremental progress on ARC and lay the foundations for two approaches to abstraction and reasoning not based in brute-force search. We first apply an existing program synthesis system called DreamCoder to create symbolic abstractions out of tasks solved so far, and show how it enables solving of progressively more challenging ARC tasks. Second, we design a reasoning algorithm motivated by the way humans approach ARC. Our algorithm constructs a search graph and reasons over this graph structure to discover task solutions. More specifically, we extend existing execution-guided program synthesis approaches with deductive reasoning based on function inverse semantics to enable a neural-guided bidirectional search algorithm. We demonstrate the effectiveness of the algorithm on three domains: ARC, 24-Game tasks, and a 'double-and-add' arithmetic puzzle.

opération, reasoning, synthesis, (17 more...)

arXiv.org Artificial Intelligence

2110.11536

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report (0.40)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Add feedback

Learning Collaborative Policies to Solve NP-hard Routing Problems

Kim, Minsu, Park, Jinkyoo, Kim, Joungho

arXiv.org Machine LearningOct-26-2021

Recently, deep reinforcement learning (DRL) frameworks have shown potential for solving NP-hard routing problems such as the traveling salesman problem (TSP) without problem-specific expert knowledge. Although DRL can be used to solve complex problems, DRL frameworks still struggle to compete with state-of-the-art heuristics showing a substantial performance gap. This paper proposes a novel hierarchical problem-solving strategy, termed learning collaborative policies (LCP), which can effectively find the near-optimum solution using two iterative DRL policies: the seeder and reviser. The seeder generates as diversified candidate solutions as possible (seeds) while being dedicated to exploring over the full combinatorial action space (i.e., sequence of assignment action). To this end, we train the seeder's policy using a simple yet effective entropy regularization reward to encourage the seeder to find diverse solutions. On the other hand, the reviser modifies each candidate solution generated by the seeder; it partitions the full trajectory into sub-tours and simultaneously revises each sub-tour to minimize its traveling distance. Thus, the reviser is trained to improve the candidate solution's quality, focusing on the reduced solution space (which is beneficial for exploitation). Extensive experiments demonstrate that the proposed two-policies collaboration scheme improves over single-policy DRL framework on various NP-hard routing problems, including TSP, prize collecting TSP (PCTSP), and capacitated vehicle routing problem (CVRP).

node, reviser, tsp, (16 more...)

arXiv.org Machine Learning

2110.13987

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Transportation (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
(2 more...)

Add feedback

Optimal Auction Design for the Gradual Procurement of Strategic Service Provider Agents

Farhadi, Farzaneh, Chli, Maria, Jennings, Nicholas R.

arXiv.org Artificial IntelligenceOct-25-2021

We consider an outsourcing problem where a software agent procures multiple services from providers with uncertain reliabilities to complete a computational task before a strict deadline. The service consumer requires a procurement strategy that achieves the optimal balance between success probability and invocation cost. However, the service providers are self-interested and may misrepresent their private cost information if it benefits them. For such settings, we design a novel procurement auction that provides the consumer with the highest possible revenue, while giving sufficient incentives to providers to tell the truth about their costs. This auction creates a contingent plan for gradual service procurement that suggests recruiting a new provider only when the success probability of the already hired providers drops below a time-dependent threshold. To make this auction incentive compatible, we propose a novel weighted threshold payment scheme which pays the minimum among all truthful mechanisms. Using the weighted payment scheme, we also design a low-complexity near-optimal auction that reduces the computational complexity of the optimal mechanism by 99% with only marginal performance loss (less than 1%). We demonstrate the effectiveness and strength of our proposed auctions through both game theoretical and numerical analysis. The experiment results confirm that the proposed auctions exhibit 59% improvement in performance over the current state-of-the-art, by increasing success probability up to 79% and reducing invocation cost by up to 11%.

auction, consumer, provider, (17 more...)

arXiv.org Artificial Intelligence

2110.12846

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(4 more...)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
(3 more...)

Add feedback

Learning Stochastic Shortest Path with Linear Function Approximation

Min, Yifei, He, Jiafan, Wang, Tianhao, Gu, Quanquan

arXiv.org Machine LearningOct-25-2021

The Stochastic Shortest Path (SSP) model refers to a type of reinforcement learning (RL) problems where an agent repeatedly interacts with a stochastic environment and aims to reach some specific goal state while minimizing the cumulative cost. Compared with other popular RL settings such as episodic and infinite-horizon Markov Decision Processes (MDPs), the horizon length in SSP is random, varies across different policies, and can potentially be infinite because the interaction only stops when arriving at the goal state. Therefore, the SSP model includes both episodic and infinitehorizon MDPs as special cases, and is comparably more general and of broader applicability. In particular, many goal-oriented real-world problems fit better into the SSP model, such as navigation and GO game (Andrychowicz et al., 2017; Nasiriany et al., 2019). In recent years, there emerges a line of works on developing efficient algorithms and the corresponding analyses for learning SSP. Most of them consider the episodic setting, where the interaction between the agent and the environment proceeds in K episodes (Cohen et al., 2020; Tarbouriech et al., 2020a). For tabular SSP models where the sizes of the action and state space are finite, Cohen et al. (2021) developed a finite-horizon reduction algorithm that achieves the minimax

algorithm, init, probability, (15 more...)

arXiv.org Machine Learning

2110.12727

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.41)
(2 more...)

Add feedback

Scaling Neural Program Synthesis with Distribution-based Search

Fijalkow, Nathanaël, Lagarde, Guillaume, Matricon, Théo, Ellis, Kevin, Ohlmann, Pierre, Potta, Akarsh

arXiv.org Artificial IntelligenceOct-24-2021

We consider the problem of automatically constructing computer programs from input-output examples. We investigate how to augment probabilistic and neural program synthesis methods with new search algorithms, proposing a framework called distribution-based search. Within this framework, we introduce two new search algorithms: Heap Search, an enumerative method, and SQRT Sampling, a probabilistic method. We prove certain optimality guarantees for both methods, show how they integrate with probabilistic and neural techniques, and demonstrate how they can operate at scale across parallel compute environments. Collectively these findings offer theoretical and applied studies of search algorithms for program synthesis that integrate with recent developments in machine-learned program synthesizers.

algorithm, probability, search algorithm, (14 more...)

arXiv.org Artificial Intelligence

2110.12485

Country:

Europe > France > Île-de-France > Paris > Paris (0.14)
North America > United States > Florida > Hillsborough County > University (0.04)
Europe > United Kingdom (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback

C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks

Zhang, Tianjun, Eysenbach, Benjamin, Salakhutdinov, Ruslan, Levine, Sergey, Gonzalez, Joseph E.

arXiv.org Artificial IntelligenceOct-22-2021

Goal-conditioned reinforcement learning (RL) can solve tasks in a wide range of domains, including navigation and manipulation, but learning to reach distant goals remains a central challenge to the field. Learning to reach such goals is particularly hard without any offline data, expert demonstrations, and reward shaping. In this paper, we propose an algorithm to solve the distant goal-reaching task by using search at training time to automatically generate a curriculum of intermediate states. Our algorithm, Classifier-Planning (C-Planning), frames the learning of the goal-conditioned policies as expectation maximization: the E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints. Unlike prior methods that combine goal-conditioned RL with graph search, ours performs search only during training and not testing, significantly decreasing the compute costs of deploying the learned policy. Empirically, we demonstrate that our method is more sample efficient than prior methods. Moreover, it is able to solve very long horizons manipulation and navigation tasks, tasks that prior goal-conditioned methods and methods based on graph search fail to solve.

agent, c-planning, waypoint, (15 more...)

arXiv.org Artificial Intelligence

2110.1208

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)

Add feedback

Optimal Any-Angle Pathfinding on a Sphere

Rospotniuk, Volodymyr, Small, Rupert

Journal of Artificial Intelligence ResearchOct-21-2021

Pathfinding in Euclidean space is a common problem faced in robotics and computer games. For long-distance navigation on the surface of the earth or in outer space however, approximating the geometry as Euclidean can be insufficient for real-world applications such as the navigation of spacecraft, aeroplanes, drones and ships. This article describes an any-angle pathfinding algorithm for calculating the shortest path between point pairs over the surface of a sphere. Introducing several novel adaptations, it is shown that Anya as described by Harabor & Grastien for Euclidean space can be extended to Spherical geometry. There, where the shortest-distance line between coordinates is defined instead by a great-circle path, the optimal solution is typically a curved line in Euclidean space. In addition the turning points for optimal paths in Spherical geometry are not necessarily corner points as they are in Euclidean space, as will be shown, making further substantial adaptations to Anya necessary. Spherical Anya returns the optimal path on the sphere, given these different properties of world maps defined in Spherical geometry. It preserves all primary benefits of Anya in Euclidean geometry, namely the Spherical Anya algorithm always returns an optimal path on a sphere and does so entirely on-line, without any preprocessing or large memory overheads. Performance benchmarks are provided for several game maps including Starcraft and Warcraft III as well as for sea navigation on Earth using the NOAA bathymetric dataset. Always returning the shorter path compared with the Euclidean approximation yielded by Anya, Spherical Anya is shown to be faster than Anya for the majority of sea routes and slower for Game Maps and Random Maps.

anya, projection, spherical anya, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12483

AI Access Foundation

12483

Journal of Artificial Intelligence Research

Country:

South America > Argentina (0.04)
North America > United States > New York (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
(2 more...)

Industry: Leisure & Entertainment > Games > Computer Games (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

Add feedback

Part-X: A Family of Stochastic Algorithms for Search-Based Test Generation with Probabilistic Guarantees

Pedrielli, Giulia, Khandait, Tanmay, Chotaliya, Surdeep, Thibeault, Quinn, Huang, Hao, Castillo-Effen, Mauricio, Fainekos, Georgios

arXiv.org Artificial IntelligenceOct-20-2021

Requirements driven search-based testing (also known as falsification) has proven to be a practical and effective method for discovering erroneous behaviors in Cyber-Physical Systems. Despite the constant improvements on the performance and applicability of falsification methods, they all share a common characteristic. Namely, they are best-effort methods which do not provide any guarantees on the absence of erroneous behaviors (falsifiers) when the testing budget is exhausted. The absence of finite time guarantees is a major limitation which prevents falsification methods from being utilized in certification procedures. In this paper, we address the finite-time guarantees problem by developing a new stochastic algorithm. Our proposed algorithm not only estimates (bounds) the probability that falsifying behaviors exist, but also it identifies the regions where these falsifying behaviors may occur. We demonstrate the applicability of our approach on standard benchmark functions from the optimization literature and on the F16 benchmark problem.

algorithm, iteration, subregion, (17 more...)

arXiv.org Artificial Intelligence

2110.10729

Country:

North America > United States > Arizona > Maricopa County > Tempe (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
Asia > Taiwan (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Towards Optimal Correlational Object Search

Zheng, Kaiyu, Chitnis, Rohan, Sung, Yoonchang, Konidaris, George, Tellex, Stefanie

arXiv.org Artificial IntelligenceOct-19-2021

In realistic applications of object search, robots will need to locate target objects in complex environments while coping with unreliable sensors, especially for small or hard-to-detect objects. In such settings, correlational information can be valuable for planning efficiently: when looking for a fork, the robot could start by locating the easier-to-detect refrigerator, since forks would probably be found nearby. Previous approaches to object search with correlational information typically resort to ad-hoc or greedy search strategies. In this paper, we propose the Correlational Object Search POMDP (COS-POMDP), which can be solved to produce search strategies that use correlational information. COS-POMDPs contain a correlation-based observation model that allows us to avoid the exponential blow-up of maintaining a joint belief about all objects, while preserving the optimal solution to this naive, exponential POMDP formulation. We propose a hierarchical planning algorithm to scale up COS-POMDP for practical domains. We conduct experiments using AI2-THOR, a realistic simulator of household environments, as well as YOLOv5, a widely-used object detector. Our results show that, particularly for hard-to-detect objects, such as scrub brush and remote control, our method offers the most robust performance compared to baselines that ignore correlations as well as a greedy, next-best view approach.

co-pomdp, information, robot, (14 more...)

arXiv.org Artificial Intelligence

2110.09991

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Optimal randomized classification trees

Blanquero, Rafael, Carrizosa, Emilio, Molero-Río, Cristina, Morales, Dolores Romero

arXiv.org Machine LearningOct-19-2021

Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold. This greedy approach trains trees very fast, but, by its nature, their classification accuracy may not be competitive against other state-of-the-art procedures. Moreover, controlling critical issues, such as the misclassification rates in each of the classes, is difficult. To address these shortcomings, optimal decision trees have been recently proposed in the literature, which use discrete decision variables to model the path each observation will follow in the tree. Instead, we propose a new approach based on continuous optimization. Our classifier can be seen as a randomized tree, since at each node of the decision tree a random decision is made. The computational experience reported demonstrates the good performance of our procedure.

orct, predictor variable, probability, (16 more...)

arXiv.org Machine Learning

doi: 10.1016/j.cor.2021.105281

2110.11952

Country:

North America > United States > Wisconsin (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback