A typical example of this approach is expressing knowledge by rules. The underlying assumption is that these representations are compact abstractions of the known individual instances of the knowledge concept to be encoded. These abstractions can be static or dynamic, i.e., they can be compiled into the program directly, or adaptively synthesized during the application execution with a learning procedure. This approach is applicable for representing well-organized knowledge. However, such a summarization approach has faced substantial problems in applications where the concepts required for the knowledge are highly interconnected and have a large amount of irregularities (exceptions), e.g., "common sense".
We introduce a generalized dual decomposition bound for computing the maximum expected utility of influence diagrams based on the dual decomposition method generalized to $L^p$ space. The main goal is to devise an approximation scheme free from translations required by existing variational approaches while exploiting the local structure of sum of utility functions as well as the conditional independence of probability functions. In this work, the generalized dual decomposition method is applied to the algebraic framework called valuation algebra for influence diagrams which handles probability and expected utility as a pair. The proposed approach allows a sequential decision problem to be decomposed as a collection of sub-decision problems of bounded complexity and the upper bound of maximum expected utility to be computed by combining the local expected utilities. Thus, it has a flexible control of space and time complexity for computing the bound. In addition, the upper bounds can be further minimized by reparameterizing the utility functions. Since the global objective function for the minimization is nonconvex, we present a gradient-based local search algorithm in which the outer loop controls the randomization of the initial configurations and the inner loop tightens the upper-bound based on block coordinate descent with gradients perturbed by a random noise. The experimental evaluation demonstrates highlights of the proposed approach on finite horizon MDP/POMDP instances.
Planning in real-time offers several benefits over the more typical techniques of implementing Non-Player Character (NPC) behavior with scripts or finite state machines. NPCs that plan their actions dynamically are better equipped to handle unexpected situations. The modular nature of the goals and actions that make up the plan facilitates reuse, sharing, and maintenance of behavioral building blocks. These benefits, however, come at the cost of CPU cycles. In order to simultaneously plan for several NPCs in real-time, while continuing to share the processor with the physics, animation, and rendering systems, careful consideration must taken with the supporting architecture. The architecture must support distributed processing and caching of costly calculations. These considerations have impacts that stretch beyond the architecture of the planner, and affect the agent architecture as a whole. This paper describes lessons learned while implementing real-time planning for NPCs for F.E.A.R., a AAA first person shooter shipping for PC in 2005.
Decision-theoretic preferences specify the relative desirability of all possible outcomes of alternative plans. In order to express general patterns of preference holding in a domain, we require a language that can refer directly to preferences over classes of outcomes as well as individuals. We present the basic concepts of a theory of meaning for such generic comparatives to facilitate their incremental capture and exploitation in automated reasoning systems. Our semantics lifts comparisons of individuals to comparisons of classes "other things being equal" by means of contextual equivalences, equivalence relations among individuals that vary with the context of application. We discuss implications of the theory for representing preference information.