If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
Reserve price is an effective tool for revenue maximization in ad auctions. The optimal reserve price depends on bidders' value distributions, which, however, are generally unknown to auctioneers. A common practice for auctioneers is to first collect information about the value distributions by a sampling procedure and then apply the reserve price estimated with the sampled bids to the following auctions. In order to maximize the total revenue over finite auctions, it is important for the auctioneer to find a proper sample size to trade off between the cost of the sampling procedure and the optimality of the estimated reserve price. We investigate the sample size optimization problem for Generalized Second Price auctions, which is the most widely-used mechanism in ad auctions, and make three main contributions along this line. First, we bound the revenue losses in the form of competitive ratio during and after sampling. Second, we formulate the problem of finding the optimal sample size as a non-convex mixed integer optimization problem. Then we characterize the properties of the problem and prove the uniqueness of the optimal sample size. Third, we relax the integer optimization problem to a continuous form and develop an efficient algorithm based on the properties to solve it. Experimental results show that our approach can significantly improve the revenue for the auctioneer in finitely repeated ad auctions.
Most existing models of Stackelberg security games ignore the underlying topology of the space in which targets and defence resources are located. As a result, allocation of resources is restricted to a discrete collection of exogenously defined targets. However, in many practical security settings, defense resources can be located on a continuous plane. Better defense solutions could therefore be potentially achieved by placing resources in a space outside of actual targets (e.g., between targets). To address this limitation, we propose a model called Security Game on a Plane (SGP) in which targets are distributed on a 2-dimensional plane, and security resources, to be allocated on the same plane, protect targets within a certain effective distance. We investigate the algorithmic aspects of SGP. We find that computing a strong Stackelberg equilibrium of an SGP is NP-hard even for zero-sum games, and these are inapproximable in general. On the positive side, we find an exact solution technique for general SGPs based on an existing approach, and develop a PTAS (polynomial-time approximation scheme) for zero-sum SGP to more fundamentally overcome the computational obstacle. Our experiments demonstrate the value of considering SGP and effectiveness of our algorithms.
The Man-In-The-Middle (MITM) attack is one of the most common attacks employed in the network hacking. MITM attackers can successfully invoke attacks such as denial of service (DoS) and port stealing, and lead to surprisingly harmful consequences for users in terms of both financial loss and security issues. The conventional defense approaches mainly consider how to detect and eliminate those attacks or how to prevent those attacks from being launched in the first place. This paper proposes a game-theoretic defense strategy from a different perspective, which aims at minimizing the loss that the whole system sustains given that the MITM attacks are inevitable. We model the interaction between the attacker and the defender as a Stackelberg security game and adopt the Strong Stackelberg Equilibrium (SSE) as the defender's strategy. Since the defender's strategy space is infinite in our model, we employ a novel method to reduce the searching space of computing the optimal defense strategy. Finally, we empirically evaluate our optimal defense strategy by comparing it with non-strategic defense strategies. The results indicate that our game-theoretic defense strategy significantly outperforms other non-strategic defense strategies in terms of decreasing the total losses against MITM attacks.
With the increasing popularity of location-aware social media applications, Point-of-Interest (POI) recommendation has recently been extensively studied. However, most of the existing studies explore from the users' perspective, namely recommending POIs for users. In contrast, we consider a new research problem of predicting users who will visit a given POI in a given future period. The challenge of the problem lies in the difficulty to effectively learn POI sequential transition and user preference, and integrate them for prediction. In this work, we propose a new latent representation model POI2Vec that is able to incorporate the geographical influence, which has been shown to be very important in modeling user mobility behavior. Note that existing representation models fail to incorporate the geographical influence. We further propose a method to jointly model the user preference and POI sequential transition influence for predicting potential visitors for a given POI. We conduct experiments on 2 real-world datasets to demonstrate the superiority of our proposed approach over the state-of-the-art algorithms for both next POI prediction and future user prediction.
Yang, Shangdong (Nanjing University) | Gao, Yang (Nanjing University) | An, Bo (Nanyang Technological University) | Wang, Hao (Nanjing University) | Chen, Xingguo (Nanjing University of Posts and Telecommunications)
There are two classes of average reward reinforcement learning (RL) algorithms: model-based ones that explicitly maintain MDP models and model-free ones that do not learn such models. Though model-free algorithms are known to be more efficient, they often cannot converge to optimal policies due to the perturbation of parameters. In this paper, a novel model-free algorithm is proposed, which makes use of constant shifting values (CSVs) estimated from prior knowledge. To encourage exploration during the learning process, the algorithm constantly subtracts the CSV from the rewards. A terminating condition is proposed to handle the unboundedness of Q-values caused by such substraction. The convergence of the proposed algorithm is proved under very mild assumptions. Furthermore, linear function approximation is investigated to generalize our method to handle large-scale tasks. Extensive experiments on representative MDPs and the popular game Tetris show that the proposed algorithms significantly outperform the state-of-the-art ones.
Xiong, Yanhai (Nanyang Technological University) | Gan, Jiarui (University of Chinese Academy of Sciences) | An, Bo (Nanyang Technological University) | Miao, Chunyan (Nanyang Technological University) | Bazzan, Ana L. C. (Universidade Federal do Rio Grande do Sul)
Many countries like Singapore are planning to introduce Electric Vehicles (EVs) to replace traditional vehicles to reduce air pollution and improve energy efficiency. The rapid development of EVs calls for efficient deployment of charging stations both for the convenience of EVs and maintaining the efficiency of the road network. Unfortunately, existing work makes unrealistic assumption on EV drivers' charging behaviors and focus on the limited mobility of EVs. This paper studies the Charging Station PLacement (CSPL) problem, and takes into consideration 1) EV drivers' strategic behaviors to minimize their charging cost, and 2) the mutual impact of EV drivers' strategies on the traffic conditions of the road network and service quality of charging stations. We first formulate the CSPL problem as a bilevel optimization problem, which is subsequently converted to a single-level optimization problem by exploiting structures of the EV charging game played by EV drivers. Properties of CSPL problem are analyzed and an algorithm called OCEAN is proposed to compute the optimal allocation of charging stations. We further propose a heuristic algorithm OCEAN-C to speed up OCEAN. Experimental results show that the proposed algorithms significantly outperform baseline methods.
Stackelberg security games have been widely deployed in recent years to schedule security resources. An assumption in most existing security game models is that one security resource assigned to a target only protects that target. However, in many important real-world security scenarios, when a resource is assigned to a target, it exhibits protection externalities: that is, it also protects other “neighbouring” targets. We investigate such Security Games with Protection Externalities (SPEs). First, we demonstrate that computing a strong Stackelberg equilibrium for an SPE is NP-hard, in contrast with traditional Stackelberg security games which can be solved in polynomial time. On the positive side, we propose a novel column generation based approach—CLASPE—to solve SPEs. CLASPE features the following novelties: 1) a novel mixed-integer linear programming formulation for the slave problem; 2) an extended greedy approach with a constant-factor approximation ratio to speed up the slave problem; and 3) a linear-scale linear programming that efficiently calculates the upper bounds of target-defined subproblems for pruning. Our experimental evaluation demonstrates that CLASPE enable us to scale to realistic-sized SPE problem instances.
High profile large scale public events are attractive targets for terrorist attacks. The recent Boston Marathon bombings on April 15, 2013 have further emphasized the importance of protecting public events. The security challenge is exacerbated by the dynamic nature of such events: e.g., the impact of an attack at different locations changes over time as the Boston marathon participants and spectators move along the race track. In addition, the defender can relocate security resources among potential attack targets at any time and the attacker may act at any time during the event. This paper focuses on developing efficient patrolling algorithms for such dynamic domains with continuous strategy spaces for both the defender and the attacker. We aim at computing optimal pure defender strategies, since an attacker does not have an opportunity to learn and respond to mixed strategies due to the relative infrequency of such events. We propose SCOUT-A, which makes assumptions on relocation cost, exploits payoff representation and computes optimal solutions efficiently. We also propose SCOUT-C to compute the exact optimal defender strategy for general cases despite the continuous strategy spaces. SCOUT-C computes the optimal defender strategy by constructing an equivalent game with discrete defender strategy space, then solving the constructed game. Experimental results show that both SCOUT-A and SCOUT-C significantly outperform other existing strategies.
Stackelberg games form the core of a number of tools deployed for computing optimal patrolling strategies in adversarial domains, such as the US Federal Air Marshall Service and the US Coast Guard. In traditional Stackelberg security game models the attacker knows only the probability that each target is covered by the defender, but is oblivious to the detailed timing of the coverage schedule. In many real-world situations, however, the attacker can observe the current location of the defender and can exploit this knowledge to reason about the defender’s future moves. We show that this general modeling framework can be captured using adversarial patrolling games (APGs) in which the defender sequentially moves between targets, with moves constrained by a graph, while the attacker can observe the defender’s current location and his (stochastic) policy concerning future moves. We offer a very general model of infinite-horizon discounted adversarial patrolling games. Our first contribution is to show that defender policies that condition only on the previous defense move (i.e., Markov stationary policies) can be arbitrarily suboptimal for general APGs. We then offer a mixed-integer non-linear programming (MINLP) formulation for computing optimal randomized policies for the defender that can condition on history of bounded, but arbitrary, length, as well as a mixed-integer linear programming (MILP) formulation to approximate these, with provable quality guarantees. Additionally, we present a non-linear programming (NLP) formulation for solving zero-sum APGs. We show experimentally that MILP significantly outperforms the MINLP formulation, and is, in turn, significantly outperformed by the NLP specialized to zero-sum games.