Goto

Collaborating Authors

 Search


A Heuristic Search Algorithm Using the Stability of Learning Algorithms in Certain Scenarios as the Fitness Function: An Artificial General Intelligence Engineering Approach

arXiv.org Artificial Intelligence

This paper presents a non-manual design engineering method based on heuristic search algorithm to search for candidate agents in the solution space which formed by artificial intelligence agents modeled on the base of bionics.Compared with the artificial design method represented by meta-learning and the bionics method represented by the neural architecture chip,this method is more feasible for realizing artificial general intelligence,and it has a much better interaction with cognitive neuroscience;at the same time,the engineering method is based on the theoretical hypothesis that the final learning algorithm is stable in certain scenarios,and has generalization ability in various scenarios.The paper discusses the theory preliminarily and proposes the possible correlation between the theory and the fixed-point theorem in the field of mathematics.Limited by the author's knowledge level,this correlation is proposed only as a kind of conjecture.


Decentralized Cooperative Planning for Automated Vehicles with Hierarchical Monte Carlo Tree Search

arXiv.org Artificial Intelligence

Today's automated vehicles lack the ability to cooperate implicitly with others. This work presents a Monte Carlo Tree Search (MCTS) based approach for decentralized cooperative planning using macro-actions for automated vehicles in heterogeneous environments. Based on cooperative modeling of other agents and Decoupled-UCT (a variant of MCTS), the algorithm evaluates the state-action-values of each agent in a cooperative and decentralized manner, explicitly modeling the interdependence of actions between traffic participants. Macro-actions allow for temporal extension over multiple time steps and increase the effective search depth requiring fewer iterations to plan over longer horizons. Without predefined policies for macro-actions, the algorithm simultaneously learns policies over and within macro-actions. The proposed method is evaluated under several conflict scenarios, showing that the algorithm can achieve effective cooperative planning with learned macro-actions in heterogeneous environments.


Counterexample-Guided Cartesian Abstraction Refinement for Classical Planning

Journal of Artificial Intelligence Research

Counterexample-guided abstraction refinement (CEGAR) is a method for incrementally computing abstractions of transition systems. We propose a CEGAR algorithm for computing abstraction heuristics for optimal classical planning. Starting from a coarse abstraction of the planning task, we iteratively compute an optimal abstract solution, check if and why it fails for the concrete planning task and refine the abstraction so that the same failure cannot occur in future iterations. A key ingredient of our approach is a novel class of abstractions for classical planning tasks that admits efficient and very fine-grained refinement. Since a single abstraction usually cannot capture enough details of the planning task, we also introduce two methods for producing diverse sets of heuristics within this framework, one based on goal atoms, the other based on landmarks. In order to sum their heuristic estimates admissibly we introduce a new cost partitioning algorithm called saturated cost partitioning. We show that the resulting heuristics outperform other state-of-the-art abstraction heuristics in many benchmark domains.


An Approximation Algorithm for Risk-averse Submodular Optimization

arXiv.org Artificial Intelligence

We study the problem of incorporating risk while making combinatorial decisions under uncertainty. We formulate a discrete submodular maximization problem for selecting a set using Conditional-Value-at-Risk (CVaR), a risk metric commonly used in financial analysis. While CVaR has recently been used in optimization of linear costs functions in robotics, we take the first stages towards extending this to discrete submodular optimization and provide several positive results. Specifically, we propose the Sequential Greedy Algorithm that provides an approximation guarantee on finding the maxima of the CVaR cost function under a matroidal constraint. The approximation guarantee shows that the solution produced by our algorithm is within a constant factor of the optimal and an additive term that depends on the optimal. Our analysis uses the curvature of the submodular set function, and proves that the algorithm runs in polynomial time. This formulates a number of combinatorial optimization problems that appear in robotics. We use two such problems, vehicle assignment under uncertainty for mobility-on-demand and sensor selection with failures for environmental monitoring, as case studies to demonstrate the efficacy of our formulation.


Algorithms Behind Modern Storage Systems

Communications of the ACM

Developing storage systems always presents the same challenges and factors to consider. Deciding what to optimize for has a substantial influence on the result. You can spend more time during write in order to lay out structures for more efficient reads, reserve extra space for in-place updates, facilitate faster writes, and buffer data in memory to ensure sequential write operations. It is impossible, however, to do this all at once. An ideal storage system would have the lowest read cost, lowest write cost, and no overhead.


Russian Security Service Searches Space Agency Over Suspected Treason: TASS

U.S. News

MOSCOW (Reuters) - Russia's Federal Security Service has searched a research facility controlled by the country's space agency Roskosmos over the suspected leaking of secrets about new hypersonic weapons to Western spies, the TASS news agency reported on Friday.


Learning Heuristics for Automated Reasoning through Deep Reinforcement Learning

arXiv.org Artificial Intelligence

We demonstrate how to learn efficient heuristics for automated reasoning algorithms through deep reinforcement learning. We consider search algorithms for quantified Boolean logics, that already can solve formulas of impressive size - up to 100s of thousands of variables. The main challenge is to find a representation which lends to making predictions in a scalable way. The heuristics learned through our approach significantly improve over the handwritten heuristics for several sets of formulas.


Multitask Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies

arXiv.org Artificial Intelligence

We introduce a new RL problem where the agent is required to execute a given subtask graph which describes a set of subtasks and their dependency. Unlike existing multitask RL approaches that explicitly describe what the agent should do, a subtask graph in our problem only describes properties of subtasks and relationships among them, which requires the agent to perform complex reasoning to find the optimal sequence of subtasks to execute. To tackle this problem, we propose a neural subtask graph solver (NSS) which encodes the subtask graph using a recursive neural network. To overcome the difficulty of training, we propose a novel non-parametric gradient-based policy to pre-train our NSS agent. % and further finetune it through actor-critic method. The experimental results on two 2D visual domains show that our agent can perform complex reasoning to find a near-optimal way of executing the subtask graph and generalize well to the unseen subtask graphs. In addition, we compare our agent with a Monte-Carlo tree search (MCTS) method showing that (1) our method is much more efficient than MCTS and (2) combining MCTS with NSS dramatically improves the search performance.


Preference-Based Monte Carlo Tree Search

arXiv.org Artificial Intelligence

Monte Carlo tree search (MCTS) is a popular choice for solving sequential anytime problems. However, it depends on a numeric feedback signal, which can be difficult to define. Real-time MCTS is a variant which may only rarely encounter states with an explicit, extrinsic reward. To deal with such cases, the experimenter has to supply an additional numeric feedback signal in the form of a heuristic, which intrinsically guides the agent. Recent work has shown evidence that in different areas the underlying structure is ordinal and not numerical. Hence erroneous and biased heuristics are inevitable, especially in such domains. In this paper, we propose a MCTS variant which only depends on qualitative feedback, and therefore opens up new applications for MCTS. We also find indications that translating absolute into ordinal feedback may be beneficial. Using a puzzle domain, we show that our preference-based MCTS variant, wich only receives qualitative feedback, is able to reach a performance level comparable to a regular MCTS baseline, which obtains quantitative feedback.


Monte Carlo Methods for the Game Kingdomino

arXiv.org Artificial Intelligence

Kingdomino is introduced as an interesting game for studying game playing: the game is multiplayer (4 independent players per game); it has a limited game depth (13 moves per player); and it has limited but not insignificant interaction among players. Several strategies based on locally greedy players, Monte Carlo Evaluation (MCE), and Monte Carlo Tree Search (MCTS) are presented with variants. We examine a variation of UCT called progressive win bias and a playout policy (Player-greedy) focused on selecting good moves for the player. A thorough evaluation is done showing how the strategies perform and how to choose parameters given specific time constraints. The evaluation shows that surprisingly MCE is stronger than MCTS for a game like Kingdomino. All experiments use a cloud-native design, with a game server in a Docker container, and agents communicating using a REST-style JSON protocol. This enables a multi-language approach to separating the game state, the strategy implementations, and the coordination layer.