If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
Peer grading, which allows students/peers to evaluate others' assignments, offers a promising solution for scaling evaluation and learning to large-scale educational systems. A key challenge in peer grading is motivating peers to grade diligently. While existing spot-checking (SC) mechanisms can prevent peer collusion, where peers coordinate to report an uninformative grade, they unrealistically assume that peers have the same grading reliability and cost. This paper studies the general Optimal Spot-Checking (OptSC) problem of determining the probability with which each assignment needs to be checked to maximize assignments' evaluation accuracy aggregated from peers, and takes into consideration 1) peers' heterogeneous characteristics, and 2) peers' strategic grading behaviors to maximize their own utility. We prove that the bilevel OptSC is NP-hard to solve. By exploiting peers' grading behaviors, we first formulate a single-level relaxation to approximate OptSC. By further exploiting structural properties of the relaxed problem, we propose an efficient algorithm for that relaxation, which also gives a good approximation of the original OptSC. Extensive experiments on both synthetic and real datasets show significant advantages of the proposed algorithm over existing approaches.
Chen, Haipeng (Nanyang Technological University) | An, Bo (Nanyang Technological University) | Sharon, Guni (University of Texas at Austin) | Hanna, Josiah P. (University of Texas at Austin) | Stone, Peter (University of Texas at Austin) | Miao, Chunyan (Nanyang Technological University) | Soh, Yeng Chai (Nanyang Technological University)
To alleviate traffic congestion in urban areas, electronic toll collection (ETC) systems are deployed all over the world. Despite their merits, tolls are usually pre-determined and fixed from day to day, which fails to account for traffic dynamics and thus has limited regulation effect when traffic conditions are abnormal. In this paper, we propose a novel dynamic ETC (DyETC) scheme which adjusts tolls to traffic conditions in real time. The DyETC problem is formulated as a Markov decision process (MDP), the solution of which is very challenging due to its 1) multi-dimensional state space, 2) multi-dimensional, continuous and bounded action space, and 3) time-dependent state and action values. Due to the complexity of the formulated MDP, existing methods cannot be applied to our problem. Therefore, we develop a novel algorithm, PG-beta, which makes three improvements to the traditional policy gradient method by proposing 1) time-dependent value and policy functions, 2) a Beta distribution policy function and 3) state abstraction. Experimental results show that, compared with existing ETC schemes, DyETC increases traffic volume by around 8%, and reduces travel time by around 14.6% during rush hour. Considering the total traffic volume in a traffic network, this contributes to a substantial increase in social welfare.
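The appeal of a Beta distribution policy for a bounded, continuous action such as a toll can be illustrated with a minimal sketch. This is not the PG-beta algorithm itself: the function names, the shape parameters and the maximum toll of 6.0 are all assumptions made for illustration. It only shows that a Beta sample, rescaled to the toll range, never leaves the allowed interval.

```python
import random


def sample_toll(alpha: float, beta: float, max_toll: float, rng: random.Random) -> float:
    """Draw a toll in [0, max_toll] from a rescaled Beta(alpha, beta) distribution.

    alpha, beta and max_toll are hypothetical policy outputs and bounds; in a
    policy gradient method they would be produced by a learned policy function.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    # betavariate returns a value in [0, 1], so the scaled toll stays bounded.
    return max_toll * rng.betavariate(alpha, beta)


def mean_toll(alpha: float, beta: float, max_toll: float) -> float:
    """Expected toll under the rescaled Beta policy: max_toll * alpha / (alpha + beta)."""
    return max_toll * alpha / (alpha + beta)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
tolls = [sample_toll(2.0, 5.0, 6.0, rng) for _ in range(1000)]
```

Unlike a Gaussian policy, which can propose tolls outside the feasible range and then needs clipping, every sample here is feasible by construction.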
Zhao, Mengchen (Nanyang Technological University) | An, Bo (Nanyang Technological University) | Yu, Yaodong (Nanyang Technological University) | Liu, Sulin (Nanyang Technological University) | Pan, Sinno Jialin (Nanyang Technological University)
Multi-task learning (MTL) is a machine learning paradigm that improves the performance of each task by exploiting useful information contained in multiple related tasks. However, the relatedness of tasks can be exploited by attackers to launch data poisoning attacks, which have been demonstrated to be a major threat to single-task learning. In this paper, we provide the first study on the vulnerability of MTL. Specifically, we focus on multi-task relationship learning (MTRL) models, a popular subclass of MTL models where task relationships are quantized and are learned directly from training data. We formulate the problem of computing optimal poisoning attacks on MTRL as a bilevel program that is adaptive to an arbitrary choice of target tasks and attacking tasks. We propose an efficient algorithm called PATOM for computing optimal attack strategies. PATOM leverages the optimality conditions of the subproblem of MTRL to compute the implicit gradients of the upper-level objective function. Experimental results on real-world datasets show that MTRL models are very sensitive to poisoning attacks and that the attacker can significantly degrade the performance of target tasks, either by directly poisoning the target tasks or by indirectly poisoning the related tasks, exploiting the task relatedness. We also found that the tasks being attacked are always strongly correlated, which provides a clue for defending against such attacks.
The market for selling reusable products (e.g., car rental, cloud services and network access resources) has grown rapidly over the last few years, with service providers maximizing their revenues by setting optimal prices. While there has been much research on pricing optimization, existing works often ignore the dynamic nature of demand and the competition among providers. Thus, existing pricing solutions might be far from optimal in realistic markets. This paper provides the first study of service providers' dynamic pricing in consideration of market competition and makes three key contributions along this line. First, we propose a comprehensive model that takes into account the dynamic demand and interaction among providers, and formulate the optimal pricing policy in the competitive market as an equilibrium. Second, we propose an approximate Nash equilibrium to describe providers' behaviors, and design an efficient algorithm to compute the equilibrium which is guaranteed to converge. Third, we derive many properties of the model without any further constraints on demand functions, which can reduce the search space of policies in the algorithm. Finally, we conduct extensive experiments with different parameter settings, showing that the approximate equilibrium is very close to the Nash equilibrium and that our proposed pricing policy outperforms existing strategies.
Pirate syndicates capturing tankers to siphon oil, at an estimated cost of $5 billion a year, has become a serious security issue for maritime traffic. In response to the threat, coast guards and navies deploy patrol boats to protect international oil trade. However, given the vast area of the sea and the highly time- and space-dependent behaviors of both players, it remains a significant challenge to find efficient ways to deploy patrol resources. In this paper, we address the research challenges and provide four key contributions. First, we construct a Stackelberg model of the oil-siphoning problem based on incident reports of actual attacks. Second, we propose a compact formulation and a constraint generation algorithm, which tackle the exponential growth of the defender’s and attacker’s strategy spaces, respectively, to compute efficient strategies of security agencies. Third, to further improve the scalability, we propose an abstraction method, which exploits the intrinsic similarity of the defender’s strategy space, to solve extremely large-scale games. Finally, we evaluate our approaches through extensive simulations and a detailed case study with real ship traffic data. The results demonstrate that our approach achieves a dramatic improvement in scalability with modest influence on the solution quality and can scale up to realistic-sized problems.
It has been an open challenge for self-interested agents to make optimal sequential decisions in complex multiagent systems, where agents might achieve higher utility via collaboration. The Microsoft Malmo Collaborative AI Challenge (MCAC), which is designed to encourage research relating to various problems in Collaborative AI, takes the form of a Minecraft mini-game where players might work together to catch a pig or deviate from cooperation in pursuit of high scores to win the challenge. Various characteristics, such as complex interactions among agents, uncertainties, sequential decision making and limited learning trials, all make it extremely challenging to find effective strategies. We present HogRider, the champion agent of MCAC in 2017 out of 81 teams from 26 countries. One key innovation of HogRider is a generalized agent type hypothesis framework to identify the behavior model of the other agents, which is demonstrated to be robust to observation uncertainty. On top of that, a second key innovation is a novel Q-learning approach to learn effective policies against each type of the collaborating agents. Various ideas are proposed to adapt traditional Q-learning to handle complexities in the challenge, including state-action abstraction to reduce problem scale, a warm start approach using human reasoning for addressing limited learning trials, and an active greedy strategy to balance exploitation and exploration. Challenge results show that HogRider outperforms all the other teams by a significant margin, in terms of both optimality and stability.
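For readers unfamiliar with the traditional Q-learning that HogRider adapts, the following is a minimal tabular sketch. It is not HogRider's method: the state and action labels, the learning rate and the discount factor are hypothetical, and plain epsilon-greedy selection stands in for the active greedy strategy named above, whose details the abstract does not give. A warm start could be approximated by pre-filling the table with prior estimates before learning begins.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, lr=0.1, gamma=0.9):
    """Apply one standard Q-learning update to the table `q` in place."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += lr * (td_target - q[(state, action)])


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical two-action example; unseen entries default to 0.0.
actions = ["chase", "exit"]
q = defaultdict(float)
q_update(q, "near_pig", "chase", reward=1.0, next_state="far_from_pig", actions=actions)
greedy_action = epsilon_greedy(q, "near_pig", actions, epsilon=0.0, rng=random.Random(0))
```

After the single update, the table holds a positive value for chasing in the `near_pig` state, so the greedy choice in that state is `chase`.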
The reserve price is an effective tool for revenue maximization in ad auctions. The optimal reserve price depends on bidders' value distributions, which, however, are generally unknown to auctioneers. A common practice for auctioneers is to first collect information about the value distributions by a sampling procedure and then apply the reserve price estimated with the sampled bids to the following auctions. In order to maximize the total revenue over a finite number of auctions, it is important for the auctioneer to find a proper sample size to trade off between the cost of the sampling procedure and the optimality of the estimated reserve price. We investigate the sample size optimization problem for the Generalized Second Price auction, which is the most widely used mechanism in ad auctions, and make three main contributions along this line. First, we bound the revenue losses in the form of a competitive ratio during and after sampling. Second, we formulate the problem of finding the optimal sample size as a non-convex mixed integer optimization problem. Then we characterize the properties of the problem and prove the uniqueness of the optimal sample size. Third, we relax the integer optimization problem to a continuous form and develop an efficient algorithm based on the properties to solve it. Experimental results show that our approach can significantly improve the revenue for the auctioneer in finitely repeated ad auctions.
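The step of estimating a reserve price from sampled bids can be sketched in a deliberately simplified setting. The code below assumes a single buyer facing a posted price, so the empirical revenue of a price is the price times the fraction of sampled bids at or above it. This is an assumption for illustration only: it is not the Generalized Second Price setting and not the estimator or the sample size algorithm studied in the paper, and the function name is hypothetical.

```python
def empirical_reserve(sampled_bids):
    """Return (price, revenue) maximizing empirical revenue over the sample.

    Simplifying assumption: one buyer, posted price. The revenue of price p is
    p * (number of sampled bids >= p) / (sample size). An optimal price can
    always be found among the sampled bid values themselves.
    """
    if not sampled_bids:
        raise ValueError("need at least one sampled bid")
    n = len(sampled_bids)
    ordered = sorted(sampled_bids, reverse=True)
    best_price, best_revenue = 0.0, 0.0
    for rank, bid in enumerate(ordered, start=1):
        # At least `rank` sampled bids are >= bid; with ties, the last
        # duplicate in the ordering yields the exact count.
        revenue = bid * rank / n
        if revenue > best_revenue:
            best_price, best_revenue = bid, revenue
    return best_price, best_revenue


price, revenue = empirical_reserve([1.0, 2.0, 3.0, 4.0])
```

With more samples the empirical price tracks the true optimum more closely, but every auction spent on sampling forgoes revenue, which is the trade-off the sample size problem addresses.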
Most existing models of Stackelberg security games ignore the underlying topology of the space in which targets and defense resources are located. As a result, allocation of resources is restricted to a discrete collection of exogenously defined targets. However, in many practical security settings, defense resources can be located on a continuous plane. Better defense solutions could therefore potentially be achieved by placing resources in a space outside of actual targets (e.g., between targets). To address this limitation, we propose a model called Security Game on a Plane (SGP) in which targets are distributed on a 2-dimensional plane, and security resources, to be allocated on the same plane, protect targets within a certain effective distance. We investigate the algorithmic aspects of SGP. We find that computing a strong Stackelberg equilibrium of an SGP is NP-hard even for zero-sum games, and these are inapproximable in general. On the positive side, we find an exact solution technique for general SGPs based on an existing approach, and develop a PTAS (polynomial-time approximation scheme) for zero-sum SGP to more fundamentally overcome the computational obstacle. Our experiments demonstrate the value of considering SGP and the effectiveness of our algorithms.
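The benefit of placing a resource between targets can be seen in a small sketch of the coverage rule described above. The function name, the coordinates and the effective distance of 1.0 are hypothetical, and Euclidean distance is an assumption; the sketch only checks which targets lie within range of some resource and does not compute any equilibrium.

```python
import math


def covered_targets(resources, targets, radius):
    """For each target, report whether some resource lies within `radius` of it.

    Points are (x, y) tuples on the plane; Euclidean distance is assumed.
    """
    return [
        any(math.dist(resource, target) <= radius for resource in resources)
        for target in targets
    ]


targets = [(0.0, 0.0), (2.0, 0.0)]

# A resource placed on the first target protects only that target...
on_target = covered_targets([(0.0, 0.0)], targets, radius=1.0)

# ...while a resource placed midway between the targets protects both.
between = covered_targets([(1.0, 0.0)], targets, radius=1.0)
```

A model restricted to the discrete set of targets cannot express the second placement, which is the limitation SGP is designed to remove.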
The Man-In-The-Middle (MITM) attack is one of the most common attacks employed in network hacking. MITM attackers can successfully invoke attacks such as denial of service (DoS) and port stealing, and cause surprisingly harmful consequences for users in terms of both financial loss and security issues. Conventional defense approaches mainly consider how to detect and eliminate those attacks or how to prevent those attacks from being launched in the first place. This paper proposes a game-theoretic defense strategy from a different perspective, which aims at minimizing the loss that the whole system sustains given that MITM attacks are inevitable. We model the interaction between the attacker and the defender as a Stackelberg security game and adopt the Strong Stackelberg Equilibrium (SSE) as the defender's strategy. Since the defender's strategy space is infinite in our model, we employ a novel method to reduce the search space for computing the optimal defense strategy. Finally, we empirically evaluate our optimal defense strategy by comparing it with non-strategic defense strategies. The results indicate that our game-theoretic defense strategy significantly outperforms non-strategic defense strategies in terms of decreasing the total losses from MITM attacks.