Collaborating Authors

Singh, Satinder


Reports of the AAAI 2011 Conference Workshops

AI Magazine

The AAAI-11 workshop program was held Sunday and Monday, August 7–8, 2011, at the Hyatt Regency San Francisco in San Francisco, California, USA. The program included 15 workshops covering a wide range of topics in artificial intelligence. The titles of the workshops were Activity Context Representation: Techniques and Languages; Analyzing Microtext; Applied Adversarial Reasoning and Risk Modeling; Artificial Intelligence and Smarter Living: The Conquest of Complexity; AI for Data Center Management and Cloud Computing; Automated Action Planning for Autonomous Mobile Robots; Computational Models of Natural Argument; Generalized Planning; Human Computation; Human-Robot Interaction in Elder Care; Interactive Decision Theory and Game Theory; Language-Action Tools for Cognitive Artificial Agents: Integrating Vision, Action and Language; Lifelong Learning; Plan, Activity, and Intent Recognition; and Scalable Integration of Analytics and Visualization. This article presents short summaries of those events.


Security Games with Limited Surveillance: An Initial Report

AAAI Conferences

Stackelberg games have been used in several deployed applications of game theory to make recommendations for allocating limited resources for protecting critical infrastructure. The resource allocation strategies are randomized to prevent a strategic attacker from using surveillance to learn and exploit patterns in the allocation. An important limitation of previous work on security games is that it typically assumes that attackers have perfect surveillance capabilities, and can learn the exact strategy of the defender. We introduce a new model that explicitly models the process of an attacker observing a sequence of resource allocation decisions and updating his beliefs about the defender's strategy. For this model we present computational techniques for updating the attacker's beliefs and computing optimal strategies for both the attacker and defender, given a specific number of observations. We provide multiple formulations for computing the defender's optimal strategy, including non-convex programming and a convex approximation. We also present an approximate method for computing the optimal length of time for the attacker to observe the defender's strategy before attacking. Finally, we present experimental results comparing the efficiency and runtime of our methods.
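
To make the belief-update step concrete, here is a minimal sketch in Python. It assumes, as a deliberate simplification of the paper's model, that the defender mixes over a small finite set of pure allocations and that the attacker maintains a conjugate Dirichlet posterior over that mixture, updated after each observed allocation; the function names and the three-allocation example are illustrative, not taken from the paper.

```python
import numpy as np

def update_beliefs(alpha, observed_allocation):
    """Conjugate Dirichlet update: increment the pseudo-count of the
    pure allocation the attacker just observed (hypothetical model)."""
    alpha = alpha.copy()
    alpha[observed_allocation] += 1
    return alpha

def posterior_mean(alpha):
    """Attacker's point estimate of the defender's mixed strategy."""
    return alpha / alpha.sum()

# Example: three pure allocations, uniform prior; the attacker
# observes allocation 0 twice and allocation 2 once.
alpha = np.ones(3)
for obs in [0, 0, 2]:
    alpha = update_beliefs(alpha, obs)
print(posterior_mean(alpha))  # -> approx [0.50, 0.17, 0.33]
```

Under a model like this, the attacker's choice of when to stop observing and attack trades off the cost of further surveillance against a sharper estimate of the defender's strategy, which is the quantity the paper's optimal-observation-length method reasons about.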


Variance-Based Rewards for Approximate Bayesian Reinforcement Learning

arXiv.org Machine Learning

The explore-exploit dilemma is one of the central challenges in Reinforcement Learning (RL). Bayesian RL solves the dilemma by providing the agent with information in the form of a prior distribution over environments; however, full Bayesian planning is intractable. Planning with the mean MDP is a common myopic approximation of Bayesian planning. We derive a novel reward bonus that is a function of the posterior distribution over environments, which, when added to the reward in planning with the mean MDP, results in an agent which explores efficiently and effectively. Although our method is similar to existing methods when given an uninformative or unstructured prior, unlike existing methods, our method can exploit structured priors. We prove that our method results in a polynomial sample complexity and empirically demonstrate its advantages in a structured exploration task.
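
As a toy illustration of the idea of a variance-based bonus (a simplified stand-in for the paper's bonus, which is derived for full Bayesian RL over MDPs rather than bandits), consider Bernoulli-reward arms with Beta posteriors, where the agent scores each arm by its posterior-mean reward plus a bonus proportional to the posterior standard deviation. The coefficient `beta` and the function name are assumptions for the sketch.

```python
import numpy as np

def variance_bonus_scores(successes, failures, beta=1.0):
    """Posterior-mean reward plus a bonus proportional to the posterior
    standard deviation (illustrative; not the paper's exact bonus)."""
    a = successes + 1.0          # Beta(1, 1) prior
    b = failures + 1.0
    mean = a / (a + b)
    var = (a * b) / ((a + b) ** 2 * (a + b + 1.0))
    return mean + beta * np.sqrt(var)

# A well-sampled arm and a rarely sampled arm with the same empirical
# mean: the rarely sampled arm keeps a larger bonus, so it keeps
# getting explored until its posterior sharpens.
print(variance_bonus_scores(np.array([50.0, 1.0]),
                            np.array([50.0, 1.0])))
# -> approx [0.55, 0.72]
```

Intuitively, with a structured prior the posterior is correlated across states, so one informative observation can shrink the variance bonus in many states at once; this is the sense in which such a bonus can exploit structure that simple count-based bonuses cannot.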


Comparing Action-Query Strategies in Semi-Autonomous Agents

AAAI Conferences

We consider settings in which a semi-autonomous agent has uncertain knowledge about its environment, but can ask what action the human operator would prefer taking in the current or in a potential future state. Asking queries can improve behavior, but if queries come at a cost (e.g., due to limited operator attention), the value of each query should be maximized. We compare two strategies for selecting action queries: 1) based on myopically maximizing expected gain in long-term value, and 2) based on myopically minimizing uncertainty in the agent's policy representation. We show empirically that the first strategy tends to select more valuable queries, and that a hybrid method can outperform either method alone in settings with limited computation.
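
The first strategy can be sketched in a stripped-down, one-shot form: the agent holds a belief over a finite set of hypotheses about the true action values, each hypothesis predicts which action the operator would name, and the myopic value of a query is the expected gain from acting on the posterior instead of the prior. This finite-hypothesis, single-decision model is an illustration only; the paper's agents plan in full sequential environments.

```python
import numpy as np

def myopic_query_gain(prior, q_values):
    """Expected gain in value from one action query.

    prior:    shape (H,)   belief over H hypotheses
    q_values: shape (H, A) long-term value of each action under each
                           hypothesis (illustrative model)
    """
    # Value of acting immediately on the prior belief.
    value_no_query = np.max(prior @ q_values)

    # Hypothesis h predicts the operator answers argmax_a q_values[h].
    answers = np.argmax(q_values, axis=1)
    value_query = 0.0
    for a in np.unique(answers):
        mask = answers == a
        p_answer = prior[mask].sum()
        posterior = np.where(mask, prior, 0.0) / p_answer
        value_query += p_answer * np.max(posterior @ q_values)
    return value_query - value_no_query

prior = np.array([0.5, 0.5])
q = np.array([[1.0, 0.0],   # hypothesis 1: action 0 is best
              [0.0, 1.0]])  # hypothesis 2: action 1 is best
print(myopic_query_gain(prior, q))  # 1.0 - 0.5 = 0.5
```

The contrast with the second strategy is that policy uncertainty can be high even where all hypotheses assign near-equal values, in which case resolving it gains little; that asymmetry is one way to see why value-based query selection tends to pick more valuable queries.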


Optimal Rewards versus Leaf-Evaluation Heuristics in Planning Agents

AAAI Conferences

Planning agents often lack the computational resources needed to build full planning trees for their environments. Agent designers commonly mitigate the resulting finite-horizon approximation by applying an evaluation function at the leaf states of the planning tree. Recent work has proposed an alternative approach for overcoming computational constraints on agent design: modify the reward function. In this work, we compare this reward design approach to the common leaf-evaluation heuristic approach for improving planning agents. We show that in many agents, the reward design approach strictly subsumes the leaf-evaluation approach, i.e., there exists a reward function for every leaf-evaluation heuristic that leads to equivalent behavior, but the converse is not true. We demonstrate that this generality leads to improved performance when an agent makes approximations in addition to the finite-horizon approximation. As part of our contribution, we extend PGRD, an online reward design algorithm, to develop reward design algorithms for Sparse Sampling and UCT, two algorithms capable of planning in large state spaces.
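
One standard way to see the subsumption direction is potential-based reward shaping; the construction sketched here may differ in detail from the paper's proof. Planning to depth d with leaf-evaluation heuristic h is behaviorally equivalent to planning with the shaped reward R'(s,a,s') = R(s,a,s') + gamma*h(s') - h(s) and no leaf evaluation, because the shaping terms telescope to gamma^d * h(s_d) - h(s_0), a constant shift at the root. The toy chain MDP below is illustrative only.

```python
# Deterministic chain MDP: states 0..5, reward 1.0 for reaching state 5.
ACTIONS = (-1, +1)
GAMMA = 0.9

def step(s, a):
    return min(5, max(0, s + a))

def base_reward(s, a, s2):
    return 1.0 if s2 == 5 else 0.0

def h(s):                       # hypothetical leaf heuristic:
    return s / 5.0              # "closer to the goal is better"

def shaped_reward(s, a, s2):    # potential-based shaping with h
    return base_reward(s, a, s2) + GAMMA * h(s2) - h(s)

def plan(s, depth, reward, leaf):
    """Depth-limited exhaustive search; returns (value, best action)."""
    if depth == 0:
        return leaf(s), None
    best = (float("-inf"), None)
    for a in ACTIONS:
        s2 = step(s, a)
        v, _ = plan(s2, depth - 1, reward, leaf)
        best = max(best, (reward(s, a, s2) + GAMMA * v, a))
    return best

# Leaf-evaluation planning and reward-design planning choose the same
# greedy action in every state, as the telescoping argument predicts.
for s in range(6):
    _, a_leaf = plan(s, 2, base_reward, h)
    _, a_shaped = plan(s, 2, shaped_reward, lambda _s: 0.0)
    assert a_leaf == a_shaped
```

Informally, the converse fails because a general reward function can also pay out at internal nodes of the tree, compensating for approximations (such as sampling error in Sparse Sampling or UCT) that a leaf evaluation, applied only at the horizon, cannot touch.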


Dynamic Incentive Mechanisms

AI Magazine

Much of AI is concerned with the design of intelligent agents. A complementary challenge is to understand how to design “rules of encounter” by which to promote simple, robust, and beneficial interactions between multiple intelligent agents. This is a natural development, as AI is increasingly used for automated decision making in real-world settings. As we extend the ideas of mechanism design from economic theory, the mechanisms (or rules) become algorithmic and many new challenges surface. Starting with a short background on mechanism design theory, the aim of this paper is to provide a nontechnical exposition of recent results on dynamic incentive mechanisms, which provide rules for the coordination of agents in sequential decision problems. The framework of dynamic mechanism design embraces coordinated decision making both in the context of uncertainty about the world external to an agent and also in regard to the dynamics of agent preferences. In addition to tracing some recent developments, we point to ongoing research challenges.
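
As background for readers new to mechanism design, the static Vickrey-Clarke-Groves (VCG) mechanism that dynamic incentive mechanisms generalize fits in a few lines: pick the outcome that maximizes reported welfare, and charge each agent the externality its presence imposes on the others. This sketch is illustrative and not taken from the article.

```python
def vcg(valuations):
    """valuations[i][k] is agent i's reported value for outcome k.
    Returns the welfare-maximizing outcome and each agent's payment
    (the externality it imposes on the other agents)."""
    outcomes = range(len(valuations[0]))
    welfare = lambda vals, k: sum(v[k] for v in vals)
    best = max(outcomes, key=lambda k: welfare(valuations, k))
    payments = []
    for i in range(len(valuations)):
        others = valuations[:i] + valuations[i + 1:]
        best_without_i = max(outcomes, key=lambda k: welfare(others, k))
        payments.append(welfare(others, best_without_i) - welfare(others, best))
    return best, payments

# Two agents, two outcomes: agent 0's strong preference wins, and it
# pays agent 1's forgone value of 6; agent 1 is not pivotal and pays 0.
outcome, payments = vcg([[10.0, 0.0], [0.0, 6.0]])
print(outcome, payments)  # -> 0 [6.0, 0.0]
```

The dynamic mechanisms surveyed in the article extend this externality-payment idea to sequential decision problems, where the world changes, agents' preferences evolve, and information arrives over time.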


Reports on the 2004 AAAI Fall Symposia

AI Magazine

The Association for the Advancement of Artificial Intelligence presented its 2004 Fall Symposium Series Friday through Sunday, October 22–24, at the Hyatt Regency Crystal City in Arlington, Virginia, adjacent to Washington, DC. The symposium series was preceded by a one-day AI funding seminar. The topics of the eight symposia in the 2004 Fall Symposia Series were: (1) Achieving Human-Level Intelligence through Integrated Systems and Research; (2) Artificial Multiagent Learning; (3) Compositional Connectionism in Cognitive Science; (4) Dialogue Systems for Health Communications; (5) The Intersection of Cognitive Science and Robotics: From Interfaces to Intelligence; (6) Making Pen-Based Interaction Intelligent and Natural; (7) Real-Life Reinforcement Learning; and (8) Style and Meaning in Language, Art, Music, and Design.


Reports on the 2004 AAAI Fall Symposia

AI Magazine

The symposium series was held Friday through Sunday, October 22–24, at the Hyatt Regency Crystal City in Arlington, Virginia, adjacent to Washington, DC, and was preceded on Thursday, October 21, by a one-day AI funding seminar that was open to all registered attendees. The seminar gave new and junior researchers--as well as students and postdoctoral fellows--an opportunity to get an inside look at what funding agencies expect in proposals from prospective grantees. Across the symposia, there was consensus among participants that metrics in machine learning, planning, and natural language processing have driven advances in those subfields, but that those metrics have also distracted attention from broader research questions. Multiagent learning drew particular interest with the advent of peer-to-peer network services and ad hoc wireless networks; domains for motivating, testing, and funding this research were proposed, including settings with large numbers of agents, more complex agent behaviors, partially observable environments, and mutual adaptation. The individual symposium reports are also available as AAAI Technical Reports.