Country
Using Imagery to Simplify Perceptual Abstraction in Reinforcement Learning Agents
Wintermute, Samuel (University of Michigan, Ann Arbor)
In this paper, we consider the problem of reinforcement learning in spatial tasks. These tasks have many states that can be aggregated together to improve learning efficiency. In an agent, this aggregation can take the form of selecting appropriate perceptual processes to arrive at a qualitative abstraction of the underlying continuous state. However, for arbitrary problems, an agent is unlikely to have the perceptual processes necessary to discriminate all relevant states in terms of such an abstraction. To help compensate for this, reinforcement learning can be integrated with an imagery system, where simple models of physical processes are applied within a low-level perceptual representation to predict the state resulting from an action. Rather than abstracting the current state, abstraction can be applied to the predicted next state. Formally, it is shown that this integration broadens the class of perceptual abstraction methods that can be used while preserving the underlying problem. Empirically, it is shown that this approach can be used in complex domains, and can be beneficial even when formal requirements are not met.
Keyword Extraction and Headline Generation Using Novel Word Features
Xu, Songhua (Yale University) | Yang, Shaohui (University of Hong Kong) | Lau, Francis (University of Hong Kong)
We introduce several novel word features for keyword extraction and headline generation. These new word features are derived according to the background knowledge of a document as supplied by Wikipedia. Given a document, to acquire its background knowledge from Wikipedia, we first generate a query for searching the Wikipedia corpus based on the key facts present in the document. We then use the query to find articles in the Wikipedia corpus that are closely related to the contents of the document. With the Wikipedia search result article set, we extract the inlink, outlink, category and infobox information in each article to derive a set of novel word features which reflect the document's background knowledge. These newly introduced word features offer valuable indications on individual words' importance in the input document. They serve as nice complements to the traditional word features derivable from explicit information of a document. In addition, we also introduce a word-document fitness feature to charcterize the influence of a document's genre on the keyword extraction and headline generation process. We study the effectiveness of these novel word features for keyword extraction and headline generation by experiments and have obtained very encouraging results.
Efficient Belief Propagation for Utility Maximization and Repeated Inference
Nath, Aniruddh (University of Washington) | Domingos, Pedro (University of Washington)
Many problems require repeated inference on probabilistic graphical models, with different values for evidence variables or other changes. Examples of such problems include utility maximization, MAP inference, online and interactive inference, parameter and structure learning, and dynamic inference. Since small changes to the evidence typically only affect a small region of the network, repeatedly performing inference from scratch can be massively redundant. In this paper, we propose expanding frontier belief propagation (EFBP), an efficient approximate algorithm for probabilistic inference with incremental changes to the evidence (or model). EFBP is an extension of loopy belief propagation (BP) where each run of inference reuses results from the previous ones, instead of starting from scratch with the new evidence; messages are only propagated in regions of the network affected by the changes. We provide theoretical guarantees bounding the difference in beliefs generated by EFBP and standard BP, and apply EFBP to the problem of expected utility maximization in influence diagrams. Experiments on viral marketing and combinatorial auction problems show that EFBP can converge much faster than BP without significantly affecting the quality of the solutions.
A Low False Negative Filter for Detecting Rare Bird Species from Short Video Segments using a Probable Observation Data Set-based EKF Method
Song, Dezhen (Texas A&M University) | Xu, Yiliang (Texas A&M University)
We report a new filter for assisting the search for rare bird species. Since a rare bird only appears in front of the camera with very low occurrence (e.g. less than ten times per year) for very short duration (e.g. less than a fraction of a second), our algorithm must have very low false negative rate. We verify the bird body axis information with the known bird flying dynamics from the short video segment. Since a regular extended Kalman filter (EKF) cannot converge due to high measurement error and limited data, we develop a novel Probable Observation Data Set (PODS)-based EKF method. The new PODS-EKF searches the measurement error range for all probable observation data that ensures the convergence of the corresponding EKF in short time frame. The algorithm has been extensively tested in experiments. The results show that the algorithm achieves 95.0% area under ROC curve in physical experiment with close to zero false negative rate.
Planning in Dynamic Environments: Extending HTNs with Nonlinear Continuous Effects
Molineaux, Matthew (Knexus Research Corporation) | Klenk, Matthew (Naval Research Laboratory) | Aha, David (Naval Research Laboratory)
Planning in dynamic continuous environments requires reasoning about nonlinear continuous effects, which previous Hierarchical Task Network (HTN) planners do not support. In this paper, we extend an existing HTN planner with a new state projection algorithm. To our knowledge, this is the first HTN planner that can reason about nonlinear continuous effects. We use a wait action to instruct this planner to consider continuous effects in a given state. We also introduce a new planning domain to demonstrate the benefits of planning with nonlinear continuous effects. We compare our approach with a linear continuous effects planner and a discrete effects HTN planner on a benchmark domain, which reveals that its additional costs are largely mitigated by domain knowledge. Finally, we present an initial application of this algorithm in a practical domain, a Navy training simulation, illustrating the utility of this approach for planning in dynamic continuous environments.
Private and Third-Party Randomization in Risk-Sensitive Equilibrium Concepts
Brautbar, Mickey (University of Pennsylvania) | Kearns, Michael (University of Pennsylvania) | Syed, Umar (University of Pennsylvania)
We consider risk-sensitive generalizations of Nash and correlated equilibria in noncooperative games. We prove that, except for a class of degenerate games, unless a two-player game has a pure Nash equilibrium, it does not have a risk-sensitive Nash equilibrium. We also show that every game has a risk-sensitive correlated equilibrium. The striking contrast between these existence results is due to the different sources of randomization in Nash (private randomization) and correlated equilibria (third-party randomization).
Latent Variable Model for Learning in Pairwise Markov Networks
Amizadeh, Saeed (University of Pittsburgh) | Hauskrecht, Milos (University of Pittsburgh)
Pairwise Markov Networks (PMN) are an important class of Markov networks which, due to their simplicity, are widely used in many applications such as image analysis, bioinformatics, sensor networks, etc. However, learning of Markov networks from data is a challenging task; there are many possible structures one must consider and each of these structures comes with its own parameters making it easy to overfit the model with limited data. To deal with the problem, recent learning methods build upon the L1 regularization to express the bias towards sparse network structures. In this paper, we propose a new and more flexible framework that let us bias the structure, that can, for example, encode the preference to networks with certain local substructures which as a whole exhibit some special global structure. We experiment with and show the benefit of our framework on two types of problems: learning of modular networks and learning of traffic networks models.
Parallel Depth First Proof Number Search
Kaneko, Tomoyuki (The University of Tokyo)
The depth first proof number search (df-pn) is an effective and popular algorithm for solving and-or tree problems by using proof and disproof numbers. This paper presents a simple but effective parallelization of the df-pn search algorithm for a shared-memory system. In this parallelization, multiple agents autonomously conduct the df-pn with a shared transposition table. For effective cooperation of agents, virtual proof and disproof numbers are introduced for each node, which is an estimation of future proof and disproof numbers by using the number of agents working on the node's descendants as a possible increase. Experimental results on large checkmate problems in shogi, which is a popular chess variant in Japan, show that reasonable increases in speed were achieved with small overheads in memory.
An Approximate Subgame-Perfect Equilibrium Computation Technique for Repeated Games
Burkov, Andriy (Laval University) | Chaib-draa, Brahim (Laval University)
This paper presents a technique for approximating, up to any precision, the set of subgame-perfect equilibria (SPE) in repeated games with discounting. The process starts with a single hypercube approximation of the set of SPE payoff profiles. Then the initial hypercube is gradually partitioned on to a set of smaller adjacent hypercubes, while those hypercubes that cannot contain any SPE point are gradually withdrawn. Whether a given hypercube can contain an equilibrium point is verified by an appropriate mixed integer program. A special attention is paid to the question of extracting players' strategies and their representability in form of finite automata.
Symmetry Detection in General Game Playing
Schiffel, Stephan (Dresden University of Technology)
We develop a method for detecting symmetries in arbitrary games and exploiting these symmetries when using tree search to play the game. Games in the General Game Playing domain are given as a set of logic based rules defining legal moves, their effects and goals of the players. The presented method transforms the rules of a game into a vertex-labeled graph such that automorphisms of the graph correspond with symmetries of the game. The algorithm detects many kinds of symmetries that often occur in games, e.g., rotation and reflection symmetries of boards, interchangeable objects, and symmetric roles. A transposition table is used to efficiently exploit the symmetries in many games.