AITopics

Mean field for Markov Decision Processes: from Discrete to Continuous Optimization

Gast, Nicolas | Gaujal, Bruno | Le Boudec, Jean-Yves

arXiv.org Artificial Intelligence, May-19-2011

We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODEs). We show that the optimal reward of such a Markov Decision Process, satisfying a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Process. We give bounds on the difference of the rewards, and a constructive algorithm for deriving an approximating solution to the Markov Decision Process from a solution of the HJB equations. Three examples, pertaining respectively to investment strategies, population dynamics control and scheduling in queues, are developed. They are used to illustrate and justify the construction of the controlled ODE and to show the gain obtained by solving a continuous HJB equation rather than a large discrete Bellman equation.

artificial intelligence, machine learning, optimization problem, (14 more...)

arXiv.org Artificial Intelligence

1004.2342

Country: Europe > France (0.28)

Genre: Research Report (0.63)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Artificial Intelligence, May-19-2011

Typical models: minimizing false beliefs

Lozinskii, Eliezer L.

A knowledge system S describing a part of the real world does not, in general, contain complete information. Reasoning with incomplete information is prone to errors since any belief derived from S may be false in the present state of the world. A false belief may suggest wrong decisions and lead to harmful actions. So an important goal is to make false beliefs as unlikely as possible. This work introduces the notions of "typical atoms" and "typical models", and shows that reasoning with typical models minimizes the expected number of false beliefs over all ways of using incomplete information. Various properties of typical models are studied, in particular the correctness and stability of beliefs suggested by typical models, and their connection to oblivious reasoning.

logic & formal reasoning, mod, nonmonotonic reasoning, (20 more...)

arXiv.org Artificial Intelligence

1105.3833

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Nonmonotonic Logic (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Aggregating Forecasts Using a Learned Bayesian Network

Mahoney, Suzanne Mitchell (Innovative Decisions, Inc.) | Comstock, Ethan (Innovative Decisions, Inc.) | deBlois, Bradley (Innovative Decisions, Inc.) | Darcy, Steven (Innovative Decisions, Inc.)

Under the Defense Advanced Research Projects Agency's (DARPA) Integrated Crisis Early Warning System (ICEWS), Innovative Decisions, Inc. (IDI) constructed a Bayesian network to combine forecasts produced by a set of social science models. We used Bayesian network structure learning with political science variables to produce meaningful priors. We employed a naive Bayes structure to aggregate the forecasts. In both cases, IDI improved classification by intelligently discretizing continuous variables. The resulting network not only met performance criteria set by DARPA, but also outperformed each of the social science models across all types of forecasted events. We describe the construction of the aggregator as well as a set of experiments performed to explore the nature of the Bayesian EOI Aggregator's performance.

artificial intelligence, forecaster, machine learning, (15 more...)

Twenty-Fourth International FLAIRS Conference

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Instructional Material (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.91)
Government > Military (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Supplemental Case Acquisition Using Mixed-Initiative Control

Floyd, Michael William (Carleton University) | Esfandiari, Babak (Carleton University)

Learning by observation allows a software agent to learn by watching an expert perform a task. This transfers the burden of training from the expert, who would traditionally need to program the agent, to the agent itself. Most existing approaches to learning by observation perform their observation in a purely passive manner. We propose a case-based reasoning agent that is able to observe passively but can also use mixed-initiative control to request assistance from the expert for difficult input problems. Our agent uses mixed-initiative case acquisition in the game of Tetris. We show that the agent is able to obtain cases it would not have been able to with passive observation alone, is able to improve its performance and places less burden on the expert.

agent, case base, cbr system, (15 more...)

Twenty-Fourth International FLAIRS Conference

Country:

North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
North America > United States > Texas (0.04)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)

Visual Programming of Plan Dynamics Using Constraints and Landmarks

Porteous, Julie (Teesside University) | Teutenberg, Jonathan (Teesside University) | Pizzi, David (Teesside University) | Cavazza, Marc (Teesside University)

In recent years, there has been considerable interest in the use of planning techniques in the area of new media. Many traditional planning notions no longer apply in the context of these applications. In particular, it can be difficult to answer the important question of what constitutes a good plan for the domain, but there is an emerging consensus that plan dynamics play an important role. As a consequence, it is important to support representation of such aspects. Our solution is to introduce a meta-level of representation that is an abstraction of the domain with respect to both time and causality, and to develop a visual representation of this in the form of a narrative arc. This visual representation can then be used in a visual programming approach to the exploration and specification of plan dynamics. In the paper we outline this approach to meta-level representation using constraints, along with the visual programming interface we have developed. We illustrate the approach with examples of visual programming in the development of an interactive entertainment system based on Shakespeare's play "The Merchant of Venice".

constraint, narrative, representation, (17 more...)

Twenty-First International Conference on Automated Planning and Scheduling

Country: Europe > United Kingdom > England > North Yorkshire > Middlesbrough (0.04)

Industry: Media (0.70)

Technology:

Information Technology > Visual Languages (1.00)
Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Fast Subgoaling for Pathfinding via Real-Time Search

Hernandez, Carlos (Universidad Católica de la Santísima Concepción) | Baier, Jorge A. (Pontificia Universidad Católica de Chile)

Real-time heuristic search is a standard approach to pathfinding when agents are required to make decisions in a bounded, very short period of time. An assumption usually made in the development and evaluation of real-time algorithms is that the environment is unknown. Nevertheless, in many interesting applications such as pathfinding for autonomous characters in video games, the environment is known in advance. Recent real-time search algorithms such as D LRTA* and kNN LRTA* exploit knowledge about the environment while pathfinding under real-time constraints. Key to those algorithms is the computation of subgoals in a preprocessing step. Subgoals are subsequently used in the online planning phase to obtain high-quality solutions. Preprocessing in those algorithms, however, requires significant computation. In this paper we propose a novel preprocessing algorithm that generates subgoals using a series of backward search episodes carried out from potential goals. The result of a single backward search episode is a tree of subgoals that we then use while planning online. We show the advantages of our approach over state-of-the-art algorithms by carrying out experiments on standard real-time search benchmarks.

algorithm, lrta, subgoal, (16 more...)

Twenty-First International Conference on Automated Planning and Scheduling

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
South America > Chile > Biobío Region > Concepción Province > Concepción (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Architecture > Real Time Systems (1.00)

The Minimal Seed Set Problem

Gefen, Avitan (Ben-Gurion University) | Brafman, Ronen I. (Ben-Gurion University)

This paper defines and studies a new, interesting, and challenging benchmark problem that originates in systems biology. The minimal seed-set problem is defined as follows: given a description of the metabolic reactions of an organism, characterize the minimal set of nutrients with which it could synthesize all nutrients it is capable of synthesizing. Current methods used in systems biology yield only approximate solutions. And although it is natural to cast it as a planning problem, current optimal planners are unable to solve it, while non-optimal planners return plans that are very far from optimal. As a planning problem, it is inherently delete-free, has many zero-cost actions, all propositions are landmarks, and many legal permutations of the plan exist. We show how a simple uninformed search algorithm that exploits inherent independence between sub-goals can solve it optimally by reducing the branching factor drastically.

nutrient, reaction, source component, (15 more...)

Twenty-First International Conference on Automated Planning and Scheduling

Country: Asia > Middle East > Israel (0.04)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Markov Decision Processes with Ordinal Rewards: Reference Point-Based Preferences

Weng, Paul (LIP6, UPMC)

In a standard Markov decision process (MDP), rewards are assumed to be precisely known and of a quantitative nature. This hypothesis can be too strong in some situations. Even when rewards can really be modeled numerically, specifying the reward function is often difficult, as it is a cognitively demanding and/or time-consuming task. Besides, rewards can sometimes be of a qualitative nature, for instance when they represent qualitative risk levels. In those cases, it is problematic to use standard MDPs directly, and we propose instead to resort to MDPs with ordinal rewards. Only a total order over rewards is assumed to be known. In this setting, we explain how an alternative way to define expressive and interpretable preferences using reference points can be exploited.

history, preference relation, reference point, (15 more...)

Twenty-First International Conference on Automated Planning and Scheduling

Country: Europe > France > Île-de-France > Paris > Paris (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Exploiting the Computational Power of the Graphics Card: Optimal State Space Planning on the GPU

Sulewski, Damian (TZI, Universität Bremen) | Edelkamp, Stefan (TZI, Universität Bremen) | Kissmann, Peter (TZI, Universität Bremen)

In this paper, optimal state space planning is parallelized by exploiting the processing power of a graphics card. The two exploration steps, namely selecting the actions to be applied and generating the successors, are performed on a graphics processing unit. Duplicate detection, however, is deferred and executed on the central processing unit. Multiple cores are employed to bypass main memory latency. To increase processing speed for exact duplicate detection, the hash tables are lock-free. Moreover, a bucket-based representation enhances the concurrent distribution of frontier states. The planner supports cost-first exploration and is able to deal with a considerable fraction of current PDDL, including numerical state variables, complex objective functions, and goal preferences. It can maximize the net benefit. Experimental findings show visible performance gains, especially for larger benchmark problems.

bufferfill, duplicate detection, gpu, (16 more...)

Twenty-First International Conference on Automated Planning and Scheduling

Country:

Europe > Germany > Bremen > Bremen (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)

Industry: Leisure & Entertainment > Games (0.31)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.91)