AITopics | Search

Search

"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives."
– from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).

Monte-Carlo Planning in Large POMDPs

Silver, David, Veness, Joel

Neural Information Processing Systems, Dec-31-2010

This paper introduces a Monte-Carlo algorithm for online planning in large POMDPs. The algorithm combines a Monte-Carlo update of the agent's belief state with a Monte-Carlo tree search from the current belief state. The new algorithm, POMCP, has two important properties. First, Monte-Carlo sampling is used to break the curse of dimensionality both during belief state updates and during planning. Second, only a black box simulator of the POMDP is required, rather than explicit probability distributions. These properties enable POMCP to plan effectively in significantly larger POMDPs than has previously been possible. We demonstrate its effectiveness in three large POMDPs. We scale up a well-known benchmark problem, Rocksample, by several orders of magnitude. We also introduce two challenging new POMDPs: 10x10 Battleship and Partially Observable PacMan, with approximately 10^18 and 10^56 states respectively. Our Monte-Carlo planning algorithm achieved a high level of performance with no prior knowledge, and was also able to exploit simple domain knowledge to achieve better results with less search. POMCP is the first general purpose planner to achieve high performance in such large and unfactored POMDPs.
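The Monte-Carlo belief update described in this abstract can be illustrated with a minimal sketch that keeps the belief as a set of sampled states (particles) and updates it by rejection sampling against a black-box simulator. The toy coin problem, the `simulate` function, the `"look"` action, and `update_belief` are all hypothetical names chosen for illustration and are not taken from the paper; the tree-search half of POMCP is not shown.

```python
import random

def simulate(state, action, rng):
    """Hypothetical black-box generative model: returns (next_state, observation, reward).

    Toy problem: a hidden coin is 'heads' or 'tails' and never changes.
    The observation reports the true side with probability 0.8.
    """
    next_state = state
    if rng.random() < 0.8:
        observation = state
    else:
        observation = "tails" if state == "heads" else "heads"
    return next_state, observation, 0.0

def update_belief(particles, action, real_observation, rng,
                  n_particles=1000, max_tries=100000):
    """Rejection-sampling belief update that only calls the simulator."""
    new_particles = []
    tries = 0
    while len(new_particles) < n_particles and tries < max_tries:
        tries += 1
        state = rng.choice(particles)            # sample a state from the current belief
        next_state, observation, _ = simulate(state, action, rng)
        if observation == real_observation:      # keep only samples that match reality
            new_particles.append(next_state)
    return new_particles

rng = random.Random(0)
prior = ["heads"] * 500 + ["tails"] * 500        # uniform belief over the two states
posterior = update_belief(prior, "look", "heads", rng)
fraction_heads = posterior.count("heads") / len(posterior)
```

Under these assumptions the fraction of `"heads"` particles should land near the exact Bayesian posterior of 0.8, without the update ever touching an explicit probability table.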

artificial intelligence, machine learning, simulation, (19 more...)

Neural Information Processing Systems

Industry: Transportation (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

On a Connection between Importance Sampling and the Likelihood Ratio Policy Gradient

Jie, Tang, Abbeel, Pieter

Neural Information Processing Systems, Dec-31-2010

Likelihood ratio policy gradient methods have been some of the most successful reinforcement learning algorithms, especially for learning on physical systems. We describe how the likelihood ratio policy gradient can be derived from an importance sampling perspective. This derivation highlights how likelihood ratio methods under-use past experience by (a) using the past experience to estimate only the gradient of the expected return $U(\theta)$ at the current policy parameterization $\theta$, rather than to obtain a more complete estimate of $U(\theta)$, and (b) using past experience under the current policy only, rather than using all past experience to improve the estimates. We present a new policy search method, which leverages both of these observations as well as generalized baselines, a new technique that generalizes commonly used baseline techniques for policy gradient methods. Our algorithm outperforms standard likelihood ratio policy gradient algorithms on several testbeds.
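The connection named in the title can be written out as a short derivation. The notation here is assumed rather than taken from the abstract: $\tau$ denotes a trajectory, $P(\tau;\theta)$ its probability under policy parameters $\theta$, $R(\tau)$ its return, and $\theta'$ the parameters that generated the data. It also assumes that $P(\tau;\theta') > 0$ wherever $P(\tau;\theta) > 0$ and that gradient and expectation can be exchanged.

```latex
\begin{align}
U(\theta)
  &= \mathbb{E}_{\tau \sim P(\cdot;\theta')}\!\left[
       \frac{P(\tau;\theta)}{P(\tau;\theta')}\, R(\tau) \right], \\
\nabla_\theta U(\theta)
  &= \mathbb{E}_{\tau \sim P(\cdot;\theta')}\!\left[
       \frac{\nabla_\theta P(\tau;\theta)}{P(\tau;\theta')}\, R(\tau) \right], \\
\nabla_\theta U(\theta)\big|_{\theta=\theta'}
  &= \mathbb{E}_{\tau \sim P(\cdot;\theta')}\!\left[
       \nabla_\theta \log P(\tau;\theta)\big|_{\theta=\theta'}\, R(\tau) \right].
\end{align}
```

The first line is the importance sampling estimate of the expected return at any $\theta$; the last line, obtained by evaluating the gradient at $\theta = \theta'$, is the familiar likelihood ratio gradient, which uses the same samples only for a local gradient.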

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning

Jain, Prateek, Vijayanarasimhan, Sudheendra, Grauman, Kristen

Neural Information Processing Systems, Dec-31-2010

We consider the problem of retrieving the database points nearest to a given hyperplane query without exhaustively scanning the database. We propose two hashing-based solutions. Our first approach maps the data to two-bit binary keys that are locality-sensitive for the angle between the hyperplane normal and a database point. Our second approach embeds the data into a vector space where the Euclidean norm reflects the desired distance between the original points and hyperplane query. Both use hashing to retrieve near points in sub-linear time. Our first method's preprocessing stage is more efficient, while the second has stronger accuracy guarantees. We apply both to pool-based active learning: taking the current hyperplane classifier as a query, our algorithm identifies those points (approximately) satisfying the well-known minimal distance-to-hyperplane selection criterion. We empirically demonstrate our methods' tradeoffs, and show that they make it practical to perform active selection with millions of unlabeled points.
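A two-bit key of the kind the first approach describes might look like the sketch below. The specific construction is an assumption made for illustration and may differ in detail from the paper's: a database point is hashed with the signs of two random Gaussian projections, and the query's hyperplane normal is hashed with the second projection negated. It also assumes the hyperplane passes through the origin. All function names are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sign_bit(value):
    return 1 if value >= 0 else 0

def make_hash(dim, rng):
    """Draw the two random Gaussian directions (u, v) that define one two-bit hash."""
    u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return u, v

def hash_point(x, uv):
    """Key for a database point x: signs of both projections."""
    u, v = uv
    return (sign_bit(dot(u, x)), sign_bit(dot(v, x)))

def hash_query(w, uv):
    """Key for a hyperplane with normal w: the second projection is negated."""
    u, v = uv
    return (sign_bit(dot(u, w)), sign_bit(-dot(v, w)))

def collision_rate(x, w, dim, trials, rng):
    """Fraction of independently drawn hashes under which x and the query share a key."""
    hits = 0
    for _ in range(trials):
        uv = make_hash(dim, rng)
        if hash_point(x, uv) == hash_query(w, uv):
            hits += 1
    return hits / trials

rng = random.Random(0)
w = [1.0, 0.0, 0.0]            # hyperplane normal
on_plane = [0.0, 2.0, -1.0]    # perpendicular to w, so it lies on the hyperplane
far_away = [3.0, 0.0, 0.0]     # parallel to w, so it points straight away from the hyperplane
rate_on = collision_rate(on_plane, w, 3, 4000, rng)
rate_far = collision_rate(far_away, w, 3, 4000, rng)
```

With this construction a point on the hyperplane collides with the query about a quarter of the time, while a point parallel to the normal never does, which is the angle sensitivity the abstract refers to.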

artificial intelligence, machine learning, vector, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.48)

A Monte Carlo AIXI Approximation

Veness, Joel, Ng, Kee Siong, Hutter, Marcus, Uther, William, Silver, David

arXiv.org Artificial Intelligence, Dec-26-2010

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

0909.0801

Country:

Oceania > Australia > New South Wales (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
(2 more...)

Best-First Heuristic Search for Multicore Machines

Burns, E., Lemons, S., Ruml, W., Zhou, R.

Journal of Artificial Intelligence Research, Dec-14-2010

To harness modern multicore processors, it is imperative to develop parallel versions of fundamental algorithms. In this paper, we compare different approaches to parallel best-first search in a shared-memory setting. We present a new method, PBNF, that uses abstraction to partition the state space and to detect duplicate states without requiring frequent locking. PBNF allows speculative expansions when necessary to keep threads busy. We identify and fix potential livelock conditions in our approach, proving its correctness using temporal logic. Our approach is general, allowing it to extend easily to suboptimal and anytime heuristic search. In an empirical comparison on STRIPS planning, grid pathfinding, and sliding tile puzzle problems using 8-core machines, we show that A*, weighted A* and Anytime weighted A* implemented using PBNF yield faster search than improved versions of previous parallel search proposals.

algorithm, nblock, node, (17 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3094

AI Access Foundation

10680

Country:

North America > United States > Oklahoma > Payne County > Cushing (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New Hampshire (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Distributed Graph Coloring: An Approach Based on the Calling Behavior of Japanese Tree Frogs

Hernández, Hugo, Blum, Christian

arXiv.org Artificial Intelligence, Nov-24-2010

Graph coloring, also known as vertex coloring, considers the problem of assigning colors to the nodes of a graph such that adjacent nodes do not share the same color. The optimization version of the problem concerns minimizing the number of colors used. In this paper we deal with the problem of finding valid colorings of graphs in a distributed way, that is, by means of an algorithm that uses only local information to decide the color of each node. Such algorithms dispense with any central control. Because quite a few practical applications require finding colorings in a distributed way, interest in distributed algorithms for graph coloring has grown during the last decade. As an example, consider wireless ad-hoc and sensor networks, where tasks such as the assignment of frequencies or the assignment of TDMA slots are strongly related to graph coloring. The algorithm proposed in this paper is inspired by the calling behavior of Japanese tree frogs. Male frogs use their calls to attract females. Interestingly, groups of males that are located near each other desynchronize their calls. This is because female frogs are only able to correctly localize the male frogs when their calls are not too close in time. We experimentally show that our algorithm is very competitive with the current state of the art, using different sets of problem instances and comparing to one of the most competitive algorithms from the literature.
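To make the problem statement concrete, here is a minimal sketch of a validity check together with a simple rule in which each node picks a color by looking only at its neighbors. This greedy rule is a stand-in chosen for brevity; it is not the frog-inspired algorithm from the paper, and it processes nodes one at a time rather than in a truly distributed fashion. The function names and the example graph are hypothetical.

```python
def is_valid_coloring(adjacency, colors):
    """True if no edge joins two nodes of the same color."""
    return all(colors[u] != colors[v] for u in adjacency for v in adjacency[u])

def greedy_local_coloring(adjacency):
    """Each node takes the smallest color not already used by a colored neighbor."""
    colors = {}
    for node in adjacency:
        taken = {colors[n] for n in adjacency[node] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors

# A cycle on five nodes, given as an adjacency list.
adjacency = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
colors = greedy_local_coloring(adjacency)
num_colors = len(set(colors.values()))
```

On this odd cycle the rule produces a valid coloring with three colors, which is also the minimum possible for an odd cycle.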

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1011.5349

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)

A Utility-Theoretic Approach to Privacy in Online Services

Krause, A., Horvitz, E.

Journal of Artificial Intelligence Research, Nov-19-2010

Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by introducing methods to personalize services based on special knowledge about users and their context. For example, a user's demographics, location, and past search and browsing may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy held by users, providers, and government agencies acting on behalf of citizens may limit access by services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can find a provably near-optimal optimization of the utility-privacy tradeoff in an efficient manner. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users' preferences about privacy and utility via a large-scale survey, aimed at eliciting preferences about people's willingness to trade the sharing of personal data in return for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users.

algorithm, information, privacy, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3089

AI Access Foundation

10678

Country:

Asia > Middle East > Lebanon (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Washington > King County > Redmond (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.93)
Questionnaire & Opinion Survey (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Evolutionary Robustness Checking in the Artificial Anasazi Model

Stonedahl, Forrest (Northwestern University) | Wilensky, Uri (Northwestern University)

AAAI Conferences, Nov-5-2010

Using the well-known Artificial Anasazi simulation for a case study, we investigate the use of genetic algorithms (GAs) for performing two common tasks related to robustness checking of agent-based models: parameter calibration and sensitivity analysis. In the calibration task, we demonstrate that a GA approach is able to find parameters that are equally good or better at minimizing error versus historical data, compared to a previous factorial grid-based approach. The GA approach also allows us to explore a wider range of parameters and parameter settings. Previous univariate sensitivity analysis on the Artificial Anasazi model did not consider potentially complex/nonlinear interactions between parameters. With the GA-based approach, we perform multivariate sensitivity analysis to discover how greatly the model can diverge from historical data, while the parameters are constrained within a close range of previously calibrated values. We show that by varying multiple parameters within a 10% range, the model can produce dramatically and qualitatively different results, and further demonstrate the utility of sensitivity analysis for model testing, by the discovery of a small coding error. Through this case study, we discuss some of the issues that can arise with calibration and sensitivity analysis of agent-based models.

evolutionary algorithm, janssen, machine learning, (17 more...)

AAAI Conferences

2010 AAAI Fall Symposium Series

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Illinois > Cook County > Evanston (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)

A Partial Taxonomy of Substitutability and Interchangeability

Karakashian, Shant, Woodward, Robert, Choueiry, Berthe Y., Prestwhich, Steven, Freuder, Eugene C.

arXiv.org Artificial Intelligence, Oct-22-2010

Substitutability, interchangeability and related concepts in Constraint Programming were introduced approximately twenty years ago and have given rise to considerable subsequent research. We survey this work, classify, and relate the different concepts, and indicate directions for future work, in particular with respect to making connections with research into symmetry breaking. This paper is a condensed version of a larger work in progress.

artificial intelligence, constraint-based reasoning, interchangeability, (14 more...)

arXiv.org Artificial Intelligence

1010.4609

Country:

North America > United States > Nebraska > Lancaster County > Lincoln (0.04)
Europe > Middle East > Cyprus > Pafos > Paphos (0.04)
Europe > Ireland > Munster > County Cork > Cork (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)

A Monte Carlo Approach for Football Play Generation

Laviers, Kennard (University of Central Florida) | Sukthankar, Gita (University of Central Florida)

AAAI Conferences, Oct-10-2010

Learning effective policies in multi-agent adversarial games is a significant challenge since the search space can be prohibitively large when the actions of all the agents are considered simultaneously. Recent advances in Monte Carlo search methods have produced good results in single-agent games like Go with very large search spaces. In this paper, we propose a variation on the Monte Carlo method, UCT (Upper Confidence Bound Trees), for multi-agent, continuous-valued, adversarial games and demonstrate its utility at generating American football plays for Rush Football 2008. In football, like in many other multi-agent games, the actions of all of the agents are not equally crucial to gameplay success. By automatically identifying key players from historical game play, we can focus the UCT search on player groupings that have the largest impact on yardage gains in a particular formation.
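The selection step at the heart of UCT can be sketched with the standard UCB1 rule, which trades off an action's average reward against how rarely it has been tried. This is the generic single-agent rule, not the multi-agent variant proposed in the paper; the function name, the exploration constant, and the action labels are hypothetical.

```python
import math

def ucb1_select(stats, exploration=math.sqrt(2)):
    """Pick an action given stats: dict of action -> (visit_count, total_reward)."""
    total_visits = sum(visits for visits, _ in stats.values())
    best_action, best_score = None, -math.inf
    for action, (visits, total_reward) in stats.items():
        if visits == 0:
            return action  # try every action at least once before comparing scores
        mean_reward = total_reward / visits
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        if mean_reward + bonus > best_score:
            best_action, best_score = action, mean_reward + bonus
    return best_action

# An untried action is always selected first.
first_choice = ucb1_select({"run": (10, 6.0), "pass": (2, 1.0), "punt": (0, 0.0)})

# Otherwise the exploration bonus can favor a rarely tried action
# even when its average reward is lower.
second_choice = ucb1_select({"run": (10, 6.0), "pass": (2, 1.0)})
```

In the second call `"pass"` has the lower average reward but far fewer visits, so its larger exploration bonus wins; in a full tree search this rule is applied at every node on the way down the tree.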

artificial intelligence, node, subgroup, (13 more...)

AAAI Conferences

Sixth Artificial Intelligence and Interactive Digital Entertainment Conference

Country:

North America > United States > Florida > Orange County > Orlando (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
