AITopics | Search

Search

"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives."
– from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).

Policy Search with Rare Significant Events: Choosing the Right Partner to Cooperate with

Ecoffet, Paul, Fontbonne, Nicolas, André, Jean-Baptiste, Bredeche, Nicolas

arXiv.org Artificial Intelligence | Mar-11-2021

This paper focuses on a class of reinforcement learning problems where significant events are rare and limited to a single positive reward per episode. A typical example is that of an agent who has to choose a partner to cooperate with, while a large number of partners are simply not interested in cooperating, regardless of what the agent has to offer. We address this problem in a continuous state and action space with two different kinds of search methods: a gradient policy search method and a direct policy search method using an evolution strategy. We show that when significant events are rare, gradient information is also scarce, making it difficult for policy gradient search methods to find an optimal policy, with or without a deep neural architecture. On the other hand, we show that direct policy search methods are invariant to the rarity of significant events, which is yet another confirmation of the unique role evolutionary algorithms have to play as a reinforcement learning method.
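To make the idea of direct policy search concrete, here is a minimal sketch of a (1+1) evolution strategy. It is not the method used in the paper: the function names, the toy return function, and all parameter values are assumptions chosen for illustration. The point it shows is that the search only compares whole-episode returns and never needs a gradient.

```python
import random


def one_plus_one_es(fitness, theta, sigma=0.5, iterations=200, seed=0):
    """Hypothetical (1+1)-ES: mutate the current best parameters and keep
    the mutant only when its fitness is at least as good."""
    rng = random.Random(seed)
    best = list(theta)
    best_fit = fitness(best)
    for _ in range(iterations):
        # Gaussian mutation of every policy parameter.
        candidate = [x + rng.gauss(0.0, sigma) for x in best]
        cand_fit = fitness(candidate)
        # Elitist replacement: only whole-episode returns are compared.
        if cand_fit >= best_fit:
            best, best_fit = candidate, cand_fit
    return best, best_fit


def toy_return(theta):
    """Assumed stand-in for an episodic return, maximal at theta = (3, 3)."""
    return -sum((x - 3.0) ** 2 for x in theta)


best_theta, best_return = one_plus_one_es(toy_return, [0.0, 0.0])
```

Because replacement is elitist, the returned fitness can never be worse than that of the starting parameters, whatever the reward landscape looks like.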

agent, algorithm, focal agent, (15 more...)

arXiv.org Artificial Intelligence

2103.06846

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
(2 more...)

Adapting User Interfaces with Model-based Reinforcement Learning

Todi, Kashyap, Bailly, Gilles, Leiva, Luis A., Oulasvirta, Antti

arXiv.org Artificial Intelligence | Mar-11-2021

Adapting an interface requires taking into account both the positive and negative effects that changes may have on the user. A carelessly picked adaptation may impose high costs on the user -- for example, due to surprise or relearning effort -- or "trap" the process prematurely in a suboptimal design. However, effects on users are hard to predict as they depend on factors that are latent and evolve over the course of interaction. We propose a novel approach for adaptive user interfaces that yields a conservative adaptation policy: It finds beneficial changes when they exist and avoids changes when there are none. Our model-based reinforcement learning method plans sequences of adaptations and consults predictive HCI models to estimate their effects. We present empirical and simulation results from the case of adaptive menus, showing that the method outperforms both a non-adaptive and a frequency-based policy.

adaptation, application, proceedings, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3411764.3445497

2103.06807

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.05)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.05)
(16 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Monte Carlo Tree Search: A Review of Recent Modifications and Applications

Świechowski, Maciej, Godlewski, Konrad, Sawicki, Bartosz, Mańdziuk, Jacek

arXiv.org Artificial Intelligence | Mar-9-2021

Monte Carlo Tree Search (MCTS) is a decision-making algorithm that searches large combinatorial spaces represented by trees. In such trees, nodes denote states, also referred to as configurations of the problem, whereas edges denote transitions (actions) from one state to another. MCTS was originally proposed in the work by Kocsis and Szepesvári (2006) and by Coulom (2006), as an algorithm for making computer players in Go. It was quickly called a major breakthrough (Gelly et al., 2012) as it allowed for a leap from 14 kyu, which is an average amateur level, to 5 dan, which is considered an advanced level but not professional yet. Before MCTS, bots for combinatorial games had been using various modifications of the min-max alpha-beta pruning algorithm (Junghanns, 1998) such as MTD(f) (Plaat, 2014) and hand-crafted heuristics. In contrast to them, the MCTS algorithm is at its core aheuristic, which means that no additional knowledge is required other than just the rules of a game (or a problem, generally speaking). However, it is possible to take advantage of heuristics and include them in the MCTS approach to make it more efficient and improve its convergence. Moreover, the given problem often tends to be so complex, from the combinatorial point of view, that some form of external help, e.g.
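As an illustration of how an MCTS variant can choose which edge to follow without game-specific heuristics, the sketch below implements the UCB1 rule used by the UCT algorithm of Kocsis and Szepesvári. The function names, the tuple layout for children, and the exploration constant are assumptions made for this example, not details taken from the review.

```python
import math


def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of a child: mean reward plus an exploration bonus.
    Unvisited children get +inf so that each is tried at least once."""
    if visits == 0:
        return math.inf
    exploitation = total_reward / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration


def select_child(children, parent_visits):
    """children is an assumed list of (action, total_reward, visits) tuples;
    returns the action with the highest UCB1 score."""
    best = max(children, key=lambda ch: ucb1_score(ch[1], ch[2], parent_visits))
    return best[0]


# An unvisited move is preferred over a well-explored one.
chosen = select_child([("a", 5.0, 10), ("b", 0.0, 0)], parent_visits=10)
```

Only visit counts and accumulated rewards from simulations enter the score, which matches the claim that nothing beyond the rules of the game is needed.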

algorithm, computational intelligence, monte carlo tree search, (12 more...)

arXiv.org Artificial Intelligence

2103.04931

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(9 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (0.87)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Energy (0.92)
Government (0.92)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Games (1.00)
(3 more...)

Google's Model Search: An Open Source Platform for Finding Optimal ML Models

#artificialintelligence | Mar-8-2021, 20:55:22 GMT

The above questions are quite tricky. As data scientists, the current approach is just to experiment with the possibilities that make more sense, evaluate, make another choice & repeat. This process…

algorithm, model search, search algorithm, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)

A Classical Search Game in Discrete Locations

Clarkson, Jake, Lin, Kyle Y., Glazebrook, Kevin D.

arXiv.org Machine Learning | Mar-8-2021

Consider a two-person zero-sum search game between a hider and a searcher. The hider hides among $n$ discrete locations, and the searcher successively visits individual locations until finding the hider. Known to both players, a search at location $i$ takes $t_i$ time units and detects the hider -- if hidden there -- independently with probability $q_i$, for $i=1,\ldots,n$. The hider aims to maximize the expected time until detection, while the searcher aims to minimize it. We prove the existence of an optimal strategy for each player. In particular, the hider's optimal mixed strategy hides in each location with a nonzero probability, and the searcher's optimal mixed strategy can be constructed with up to $n$ simple search sequences. We develop an algorithm to compute an optimal strategy for each player, and compare the optimal hiding strategy with the simple hiding strategy which gives the searcher no location preference at the beginning of the search.
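The payoff in this game can be made concrete with a small computation. The sketch below evaluates the expected time until detection when the hider sits at one fixed location and the searcher repeats a fixed cycle of visits forever. The cyclic search pattern, the function name, and the tolerance are assumptions for illustration only; the paper's optimal strategies are mixed and need not be cyclic.

```python
def expected_detection_time(cycle, hide_at, t, q, tol=1e-12):
    """Expected time to detect a hider at `hide_at` when the searcher
    repeats `cycle` (a list of location indices) indefinitely.
    t[i] is the search time and q[i] the detection probability at i."""
    if hide_at not in cycle or q[hide_at] <= 0.0:
        raise ValueError("the hider would never be found by this cycle")
    elapsed = 0.0      # time spent searching so far
    undetected = 1.0   # probability the hider has not been found yet
    expected = 0.0
    while undetected > tol:
        for loc in cycle:
            elapsed += t[loc]
            if loc == hide_at:
                # Detection happens on this visit with probability q[loc].
                expected += undetected * q[loc] * elapsed
                undetected *= 1.0 - q[loc]
    return expected


# One location, unit search time, detection probability 1/2:
# the number of searches is geometric, so the expected time is about 2.
single = expected_detection_time([0], 0, t=[1.0], q=[0.5])
```

Averaging this quantity over the hider's location probabilities gives the value that the hider tries to maximize and the searcher tries to minimize.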

gittin search sequence, search sequence, sequence, (17 more...)

arXiv.org Machine Learning

2103.0931

Country:

North America > United States > New York (0.04)
Europe > United Kingdom (0.04)
North America > United States > Maryland > Montgomery County > Rockville (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Game Theory (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)

Sparsification for Fast Optimal Multi-Robot Path Planning in Lazy Compilation Schemes

Surynek, Pavel

arXiv.org Artificial Intelligence | Mar-7-2021

Path planning for multiple robots (MRPP) is the task of finding non-colliding paths along which robots can navigate from their initial positions to specified goal positions. The problem is usually modeled using undirected graphs where robots move between vertices across edges. Contemporary optimal solving algorithms include dedicated search-based methods, which solve the problem directly, and compilation-based algorithms, which reduce MRPP to a different formalism for which an efficient solver exists, such as constraint programming (CP), mixed integer programming (MIP), or Boolean satisfiability (SAT). In this paper, we enhance an existing SAT-based algorithm for MRPP via sparsification of the set of candidate paths for each robot, from which the target Boolean encoding is derived. The suggested sparsification of the set of paths led to smaller target Boolean formulae that can be constructed and solved faster, while the optimality guarantees of the approach have been kept.

algorithm, conflict, robot, (12 more...)

arXiv.org Artificial Intelligence

2103.04496

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(5 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.87)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.86)

Top 8 Approaches For Tuning Hyperparameters Of ML Models

#artificialintelligence | Mar-6-2021, 10:05:55 GMT

Hyperparameter tuning is one of the fundamental steps in the machine learning routine. Also known as hyperparameter optimisation, it entails searching for the configuration of hyperparameters that yields the best performance. Machine learning algorithms require user-defined inputs to achieve a balance between accuracy and generalisability, and choosing those inputs is the job of hyperparameter tuning. There are various tools and approaches available to tune hyperparameters.
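One of the simplest ways to search a hyperparameter space is random search. The sketch below is a generic illustration under stated assumptions: the `random_search` helper, the toy objective, and the parameter names `learning_rate` and `l2_penalty` are all hypothetical and do not come from the article or from any particular library.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample configurations uniformly from `space` (name -> (low, high))
    and return the best configuration and its score (higher is better)."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_objective(cfg):
    """Assumed stand-in for a validation score, peaking at
    learning_rate = 0.1 and l2_penalty = 0.01."""
    return -((cfg["learning_rate"] - 0.1) ** 2 + (cfg["l2_penalty"] - 0.01) ** 2)


space = {"learning_rate": (0.0, 1.0), "l2_penalty": (0.0, 0.1)}
best_cfg, best_score = random_search(toy_objective, space)
```

In practice the objective would train a model and return a validation metric, so each trial is expensive and the number of trials becomes the main budget to control.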

algorithm, hyperparameter, optimisation, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Approximation Algorithms for Active Sequential Hypothesis Testing

Gan, Kyra, Jia, Su, Li, Andrew

arXiv.org Machine Learning | Mar-6-2021

In the problem of active sequential hypothesis testing (ASHT), a learner seeks to identify the true hypothesis $h^*$ from among a set of hypotheses $H$. The learner is given a set of actions and knows the outcome distribution of any action under any true hypothesis. While repeatedly playing the entire set of actions suffices to identify $h^*$, a cost is incurred with each action. Thus, given a target error $\delta>0$, the goal is to find the minimal cost policy for sequentially selecting actions that identify $h^*$ with probability at least $1 - \delta$. This paper provides the first approximation algorithms for ASHT, under two types of adaptivity. First, a policy is partially adaptive if it fixes a sequence of actions in advance and adaptively decides when to terminate and what hypothesis to return. Under partial adaptivity, we provide an $O\big(s^{-1}(1+\log_{1/\delta}|H|)\log (s^{-1}|H| \log |H|)\big)$-approximation algorithm, where $s$ is a natural separation parameter between the hypotheses. Second, a policy is fully adaptive if action selection is allowed to depend on previous outcomes. Under full adaptivity, we provide an $O(s^{-1}\log (|H|/\delta)\log |H|)$-approximation algorithm. We numerically investigate the performance of our algorithms using both synthetic and real-world data, showing that our algorithms outperform a previously proposed heuristic policy.

algorithm, decision tree, hypothesis, (15 more...)

arXiv.org Machine Learning

2103.0425

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.47)

Team formation techniques in education

AIHub | Mar-4-2021, 10:22:01 GMT

Collaborative learning is gaining acceptance as one of the most successful educational approaches to learning. The basic idea is to organise learners in groups to work together and solve problems or complete tasks. There is ample evidence that when learners actively engage in discussions, listen to different viewpoints, and defend their positions, they better understand new concepts and learn faster. A particular case of collaborative learning is co-operative learning, where each student is responsible for at least one specific aspect or competence needed to solve the problem jointly. The student is improving her understanding through collaboration with others and is also responsible for the group's success concerning the aspect she is responsible for.

competence, competency, student, (11 more...)

AIHub

Industry: Education (0.78)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.75)

Learning to Schedule DAG Tasks

Hua, Zhigang, Qi, Feng, Liu, Gan, Yang, Shuang

arXiv.org Artificial Intelligence | Mar-4-2021

Scheduling computational tasks represented by directed acyclic graphs (DAGs) is challenging because of its complexity. Conventional scheduling algorithms rely heavily on simple heuristics such as shortest job first (SJF) and critical path (CP), and are often lacking in scheduling quality. In this paper, we present a novel learning-based approach to scheduling DAG tasks. The algorithm employs a reinforcement learning agent to iteratively add directed edges to the DAG, one at a time, to enforce ordering (i.e., priorities of execution and resource allocation) of "tricky" job nodes. By doing so, the original DAG scheduling problem is dramatically reduced to a much simpler proxy problem, on which heuristic scheduling algorithms such as SJF and CP can be efficiently improved. Our approach can be easily applied to any existing heuristic scheduling algorithm. On the benchmark dataset of TPC-H, we show that our learning-based approach can significantly improve over popular heuristic algorithms and consistently achieves the best performance among several methods under a variety of settings.
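The critical path heuristic mentioned above ranks each job by the length of the longest chain of work that starts at it. A minimal sketch of that priority computation follows; the function name, the dictionary-based graph layout, and the example job durations are assumptions for illustration and do not reproduce the paper's learned scheduler. It relies on `graphlib`, available in the standard library from Python 3.9.

```python
from graphlib import TopologicalSorter


def critical_path_lengths(durations, successors):
    """For every node, return its duration plus the longest downstream path.
    `durations` maps node -> run time; `successors` maps node -> children."""
    # TopologicalSorter expects node -> predecessors, so invert the edges.
    preds = {node: set() for node in durations}
    for node, children in successors.items():
        for child in children:
            preds[child].add(node)
    order = list(TopologicalSorter(preds).static_order())
    cp = {}
    # Walk sinks first so every child is resolved before its parents.
    for node in reversed(order):
        downstream = max((cp[c] for c in successors.get(node, ())), default=0)
        cp[node] = durations[node] + downstream
    return cp


durations = {"a": 2, "b": 3, "c": 1, "d": 4}
successors = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
cp = critical_path_lengths(durations, successors)
# Jobs with the longest remaining path are scheduled first.
priority = sorted(durations, key=lambda n: -cp[n])
```

For this example the lengths are a = 9, b = 7, c = 5 and d = 4, so the priority order is a, b, c, d. Adding an edge to the graph, as the learned agent does, changes these lengths and therefore the order a heuristic scheduler would produce.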

algorithm, makespan, node, (16 more...)

arXiv.org Artificial Intelligence

2103.03412

Country:

North America > United States > California > Santa Clara County > Sunnyvale (0.04)
Europe > France (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
