Agents
Decision Making in Complex Multiagent Contexts: A Tale of Two Frameworks
Doshi, Prashant J. (University of Georgia)
Decision making is a key feature of autonomous systems. It involves choosing optimally between different lines of action in various information contexts that range from perfectly knowing all aspects of the decision problem to having just partial knowledge about it. The physical context often includes other interacting autonomous systems, typically called agents. In this article, I focus on decision making in a multiagent context with partial information about the problem. Relevant research in this complex but realistic setting has converged around two complementary, general frameworks and also introduced myriad specializations on its way. I put the two frameworks, decentralized partially observable Markov decision process (Dec-POMDP) and the interactive partially observable Markov decision process (I-POMDP), in context and review the foundational algorithms for these frameworks, while briefly discussing the advances in their specializations. I conclude by examining the avenues that research pertaining to these frameworks is pursuing.
Action-Model Based Multi-agent Plan Recognition
Zhuo, Hankz H., Yang, Qiang, Kambhampati, Subbarao
Multi-Agent Plan Recognition (MAPR) aims to recognize dynamic team structures and team behaviors from the observed team traces (activity sequences) of a set of intelligent agents. Previous MAPR approaches required a library of team activity sequences (team plans) be given as input. However, collecting a library of team plans to ensure adequate coverage is often difficult and costly. In this paper, we relax this constraint, so that team plans are not required to be provided beforehand. We assume instead that a set of action models are available. Such models are often already created to describe domain physics; i.e., the preconditions and effects of effects actions. We propose a novel approach for recognizing multi-agent team plans based on such action models rather than libraries of team plans. We encode the resulting MAPR problem as a \emph{satisfiability problem} and solve the problem using a state-of-the-art weighted MAX-SAT solver. Our approach also allows for incompleteness in the observed plan traces. Our empirical studies demonstrate that our algorithm is both effective and efficient in comparison to state-of-the-art MAPR methods based on plan libraries.
PROTECT -- A Deployed Game Theoretic System for Strategic Security Allocation for the United States Coast Guard
An, Bo (University of Southern California) | Shieh, Eric (University of Southern California) | Tambe, Milind (University of Southern California) | Yang, Rong (University of Southern California) | Baldwin, Craig (United States Coast Guard) | DiRenzo, Joseph (United States Coast Guard) | Maule, Ben (United States Coast Guard) | Meyer, Garrett (United States Coast Guard)
While three deployed applications of game theory for security have recently been reported, we as a community of agents and AI researchers remain in the early stages of these deployments; there is a continuing need to understand the core principles for innovative security applications of game theory. Towards that end, this paper presents PROTECT, a game-theoretic system deployed by the United States Coast Guard (USCG) in the port of Boston for scheduling their patrols. USCG has termed the deployment of PROTECT in Boston a success, and efforts are underway to test it in the port of New York, with the potential for nationwide deployment.PROTECT is premised on an attacker-defender Stackelberg game model and offers five key innovations. First, this system is a departure from the assumption of perfect adversary rationality noted in previous work, relying instead on a quantal response (QR) model of the adversary's behavior --- to the best of our knowledge, this is the first real-world deployment of the QR model. Second, to improve PROTECT's efficiency, we generate a compact representation of the defender's strategy space, exploiting equivalence and dominance. Third, we show how to practically model a real maritime patrolling problem as a Stackelberg game. Fourth, our experimental results illustrate that PROTECT's QR model more robustly handles real-world uncertainties than a perfect rationality model. Finally, in evaluating PROTECT, this paper for the first time provides real-world data: (i) comparison of human-generated vs PROTECT security schedules, and (ii) results from an Adversarial Perspective Team's (human mock attackers) analysis.
TRUSTS: Scheduling Randomized Patrols for Fare Inspection in Transit Systems Using Game Theory
Yin, Zhengyu (University of Southern California) | Jiang, Albert Xin (University of Southern California) | Tambe, Milind (University of Southern California) | Kiekintveld, Christopher (University of Texas at El Paso) | Leyton-Brown, Kevin (University of British Columbia) | Sandholm, Tuomas (Carnegie Mellon University) | Sullivan, John P. (Los Angeles County Sheriff's Department)
In proof-of-payment transit systems, passengers are legally required to purchase tickets before entering but are not physically forced to do so. Instead, patrol units move about the transit system, inspecting the tickets of passengers, who face fines if caught fare evading. The deterrence of fare evasion depends on the unpredictability and effectiveness of the patrols. In this paper, we present TRUSTS, an application for scheduling randomized patrols for fare inspection in transit systems. TRUSTS models the problem of computing patrol strategies as a leader-follower Stackelberg game where the objective is to deter fare evasion and hence maximize revenue. This problem differs from previously studied Stackelberg settings in that the leader strategies must satisfy massive temporal and spatial constraints; moreover, unlike in these counterterrorism-motivated Stackelberg applications, a large fraction of the ridership might realistically consider fare evasion, and so the number of followers is potentially huge. A third key novelty in our work is deliberate simplification of leader strategies to make patrols easier to be executed. We present an efficient algorithm for computing such patrol strategies and present experimental results using real-world ridership data from the Los Angeles Metro Rail system. The Los Angeles County Sheriffโs department is currently carrying out trials of TRUSTS.
Towards Adapting Cars to their Drivers
Rosenfeld, Avi (Jerusalem College of Technology) | Bareket, Zevi (University of Michigan) | Goldman, Claudia V. (General Motors) | Kraus, Sarit (Bar-Ilan University) | LeBlanc, David J. (University of Michigan) | Tsimhoni, Omer (General Motors)
Traditionally, vehicles have been considered as machines that are controlled by humans for the purpose of transportation. A more modern view is to envision drivers and passengers as actively interacting with a complex automated system. Such interactive activity leads us to consider intelligent and advanced ways of interaction leading to cars that can adapt to their drivers.In this paper, we focus on the Adaptive Cruise Control (ACC) technology that allows a vehicle to automatically adjust its speed to maintain a preset distance from the vehicle in front of it based on the driverโs preferences. Although individual drivers have different driving styles and preferences, current systems do not distinguish among users. We introduce a method to combine machine learning algorithms with demographic information and expert advice into existing automated assistive systems. This method can reduce the interactions between drivers and automated systems by adjusting parameters relevant to the operation of these systems based on their specific drivers and context of drive. We also learn when users tend to engage and disengage the automated system. This method sheds light on the kinds of dynamics that users develop while interacting with automation and can teach us how to improve these systems for the benefit of their users. While generic packages such as Weka were successful in learning driversโ behavior, we found that improved learning models could be developed by adding information on driversโ demographics and a previously developed model about different driver types. We present the general methodology of our learning procedure and suggest applications of our approach to other domains as well.
What Question Would Turing Pose Today?
Grosz, Barbara (Harvard University)
In 1950, when Turing proposed to replace the question "Can machines think?" with the question "Are there imaginable digital computers which would do well in the imitation game?" computer science was not yet a field of study, Shannonโs theory of information had just begun to change the way people thought about communication, and psychology was only starting to look beyond behaviorism. It is stunning that so many predictions in Turingโs 1950 Mind paper were right. In the decades since that paper appeared, with its inspiring challenges, research in computer science, neuroscience, and the behavioral sciences has radically changed thinking about mental processes and communication, and the ways in which people use computers has evolved even more dramatically. Turing, were he writing now, might still replace "Can machines think?" with an operational challenge, but it is likely he would propose a very different test. This paper considers what that might be in light of Turingโs paper and advances in the decades since it was written.
Interpreting prediction markets: a stochastic approach
Frongillo, Rafael M., Penna, Nicholรกs Della, Reid, Mark D.
We strengthen recent connections between prediction markets and learning byshowing that a natural class of market makers can be understood as performing stochastic mirror descent when trader demands are sequentially drawnfrom a fixed distribution. This provides new insights into how market prices (and price paths) may be interpreted as a summary of the market's belief distribution by relating them to the optimization problem being solved. In particular, we show that under certain conditions the stationary pointof the stochastic process of prices generated by the market is equal to the market's Walrasian equilibrium of classic market analysis. Together, these results suggest how traditional market making mechanisms might be replaced with general purpose learning algorithms while still retaining guaranteesabout their behaviour.
Approximating Equilibria in Sequential Auctions with Incomplete Information and Multi-Unit Demand
Greenwald, Amy, Li, Jiacui, Sodomka, Eric
In many large economic markets, goods are sold through sequential auctions. Such domains include eBay, online ad auctions, wireless spectrum auctions, and the Dutch flower auctions. Bidders in these domains face highly complex decision-making problems, as their preferences for outcomes in one auction often depend on the outcomes of other auctions, and bidders have limited information about factors that drive outcomes, such as other bidders' preferences and past actions. In this work, we formulate the bidder's problem as one of price prediction (i.e., learning) and optimization. We define the concept of stable price predictions and show that (approximate) equilibrium in sequential auctions can be characterized as a profile of strategies that (approximately) optimize with respect to such (approximately) stable price predictions. We show how equilibria found with our formulation compare to known theoretical equilibria for simpler auction domains, and we find new approximate equilibria for a more complex auction domain where analytical solutions were heretofore unknown.
Rational inference of relative preferences
Srivastava, Nisheeth, Schrater, Paul R.
Statistical decision theory axiomatically assumes that the relative desirability of different options that humans perceive is well described by assigning them option-specific scalar utility functions. However, this assumption is refuted by observed human behavior, including studies wherein preferences have been shown to change systematically simply through variation in the set of choice options presented. In this paper, we show that interpreting desirability as a relative comparison between available options at any particular decision instance results in a rational theory of value-inference that explains heretofore intractable violations of rational choice behavior in human subjects. Complementarily, we also characterize the conditions under which a rational agent selecting optimal options indicated by dynamic value inference in our framework will behave identically to one whose preferences are encoded using a static ordinal utility function.
Design of Intelligent Agents Based System for Commodity Market Simulation with JADE
Refianti, R., Mutiara, A. B., Gunawan, H.
A market of potato commodity for industry scale usage is engaging several types of actors. They are farmers, middlemen, and industries. A multi-agent system has been built to simulate these actors into agent entities, based on manually given parameters within a simulation scenario file. Each type of agents has its own fuzzy logic representing actual actors' knowledge, to be used to interpreting values and take appropriated decision of it while on simulation. The system will simulate market activities with programmed behaviors then produce the results as spreadsheet and chart graph files. These results consist of each agent's yearly finance and commodity data. The system will also predict each of next value from these outputs.