Towards Automatic Personalized Content Generation for Platform Games
Shaker, Noor (IT University of Copenhagen) | Yannakakis, Georgios (IT University of Copenhagen) | Togelius, Julian (IT University of Copenhagen)
In this paper, we show that personalized levels can be automatically generated for platform games. We build on previous work, where models were derived that predicted player experience based on features of level design and on playing styles. These models are constructed using preference learning, based on questionnaires administered to players after playing different levels. The contributions of the current paper are (1) more accurate models based on a much larger data set; (2) a mechanism for adapting level design parameters to given players and playing style; (3) evaluation of this adaptation mechanism using both algorithmic and human players. The results indicate that the adaptation mechanism effectively optimizes level design parameters for particular players.
An Automated Technique for Drafting Territories in the Board Game Risk
Gibson, Richard (University of Alberta) | Desai, Neesha (University of Alberta) | Zhao, Richard (University of Alberta)
In the standard rules of the board game Risk, players take turns selecting or "drafting" the 42 territories on the board until all territories are owned. We present a technique for drafting territories in Risk that combines the Monte Carlo tree search algorithm UCT with an automated evaluation function. Created through supervised machine learning, this function scores outcomes of drafts in order to shorten the length of a UCT simulation. Using this approach, we augment an existing bot for the computer game Lux Delux, a clone of Risk. Our drafting technique is shown to greatly improve performance against the strongest opponents supplied with Lux Delux. The evidence provided indicates that territory drafting is important to overall success in Risk.
A Comprehensive Survey of Data Mining-based Fraud Detection Research
Phua, Clifton, Lee, Vincent, Smith, Kate, Gayler, Ross
This survey paper categorises, compares, and summarises almost all published technical and review articles in automated fraud detection from the last 10 years. It defines the professional fraudster, formalises the main types and subtypes of known fraud, and presents the nature of data evidence collected within affected industries. Within the business context of mining the data to achieve higher cost savings, this research presents methods and techniques together with their problems. Compared to all related reviews on fraud detection, this survey covers many more technical articles and is the only one, to the best of our knowledge, which proposes alternative data and solutions from related domains.
The LAMA Planner: Guiding Cost-Based Anytime Planning with Landmarks
LAMA is a classical planning system based on heuristic forward search. Its core feature is the use of a pseudo-heuristic derived from landmarks, propositional formulas that must be true in every solution of a planning task. LAMA builds on the Fast Downward planning system, using finite-domain rather than binary state variables and multi-heuristic search. The latter is employed to combine the landmark heuristic with a variant of the well-known FF heuristic. Both heuristics are cost-sensitive, focusing on high-quality solutions in the case where actions have non-uniform cost. A weighted A* search is used with iteratively decreasing weights, so that the planner continues to search for plans of better quality until the search is terminated. LAMA showed the best performance among all planners in the sequential satisficing track of the International Planning Competition 2008. In this paper we present the system in detail and investigate which features of LAMA are crucial for its performance. We present individual results for some of the domains used at the competition, demonstrating good and bad cases for the techniques implemented in LAMA. Overall, we find that using landmarks improves performance, whereas the incorporation of action costs into the heuristic estimators proves not to be beneficial. We show that in some domains a search that ignores cost solves far more problems, raising the question of how to deal with action costs more effectively in the future. The iterated weighted A* search greatly improves results, and shows synergy effects with the use of landmarks.
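The idea of a weighted A* search with iteratively decreasing weights can be illustrated with a minimal sketch. This is not LAMA's implementation: the toy graph, the heuristic table `h`, the weight schedule `(5, 3, 2, 1)`, and the function names are all assumptions chosen for illustration. Each round reruns the search with a smaller weight and prunes any path that cannot beat the best plan cost found so far.

```python
import heapq

def weighted_astar(graph, h, start, goal, w, bound=float("inf")):
    """Search with f = g + w*h; return (cost, path) or None if nothing cheaper than bound exists."""
    open_heap = [(w * h[start], 0, start, [start])]
    best_g = {start: 0}
    while open_heap:
        _, g, node, path = heapq.heappop(open_heap)
        if g > best_g.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route to this node was found later
        if node == goal:
            return g, path
        for succ, cost in graph.get(node, []):
            g2 = g + cost
            if g2 >= bound:
                continue  # cannot improve on the best plan found in earlier rounds
            if g2 < best_g.get(succ, float("inf")):
                best_g[succ] = g2
                heapq.heappush(open_heap, (g2 + w * h[succ], g2, succ, path + [succ]))
    return None

def iterated_weighted_astar(graph, h, start, goal, weights=(5, 3, 2, 1)):
    """Rerun weighted A* with decreasing weights, keeping the cheapest plan seen."""
    best = None
    for w in weights:
        bound = best[0] if best else float("inf")
        result = weighted_astar(graph, h, start, goal, w, bound)
        if result is not None:
            best = result
    return best

# Hypothetical toy problem: the route via A looks attractive to the heuristic but is expensive.
graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 10)], "B": [("C", 1)], "C": [("G", 1)], "G": []}
h = {"S": 2, "A": 1, "B": 2, "C": 1, "G": 0}

first_plan = weighted_astar(graph, h, "S", "G", 5)
final_plan = iterated_weighted_astar(graph, h, "S", "G")
```

In this toy problem the first round (weight 5) commits to the route through A at cost 11, and the second round (weight 3) finds the route through B and C at cost 6. The remaining rounds find nothing cheaper, so the cost-6 plan is kept.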
Implicit Abstraction Heuristics
State-space search with explicit abstraction heuristics is at the state of the art of cost-optimal planning. These heuristics are inherently limited, nonetheless, because the size of the abstract space must be bounded by some constant, even if a very large one. Targeting this shortcoming, we introduce the notion of (additive) implicit abstractions, in which the planning task is abstracted by instances of tractable fragments of optimal planning. We then introduce a concrete setting of this framework, called fork-decomposition, that is based on two novel fragments of tractable cost-optimal planning. The induced admissible heuristics are then studied formally and empirically. This study attests to the accuracy of the fork-decomposition heuristics, yet our empirical evaluation also stresses the tradeoff between their accuracy and the runtime complexity of computing them. Indeed, some of the power of the explicit abstraction heuristics comes from precomputing the heuristic function offline and then determining h(s) for each evaluated state s by a very fast lookup in a "database." By contrast, while fork-decomposition heuristics can be calculated in polynomial time, computing them is far from being fast. To address this problem, we show that the time-per-node complexity bottleneck of the fork-decomposition heuristics can be successfully overcome. We demonstrate that an equivalent of the explicit abstraction notion of a "database" exists for the fork-decomposition abstractions as well, despite their exponential-size abstract spaces. We then verify empirically that heuristic search with the "databased" fork-decomposition heuristics competes favorably with the state of the art of cost-optimal planning.
Solving the Resource Constrained Project Scheduling Problem with Generalized Precedences by Lazy Clause Generation
Schutt, Andreas, Feydy, Thibaut, Stuckey, Peter J., Wallace, Mark G.
This technical report presents a generic exact solution approach for minimizing the project duration of the resource-constrained project scheduling problem with generalized precedences (Rcpsp/max). The approach uses lazy clause generation, i.e., a hybrid of finite domain and Boolean satisfiability solving, in order to apply nogood learning and conflict-driven search to solution generation. Our experiments show the benefit of lazy clause generation for finding an optimal solution and proving its optimality in comparison to other state-of-the-art exact and non-exact methods. The method is highly robust: it matched or bettered the best known results on all of the 2340 instances we examined except 3, according to the currently available data on the PSPLib. Of the 631 open instances in this set it closed 573 and improved the bounds of 51 of the remaining 58 instances.
Machine Learning Approaches for Modeling Spammer Behavior
Islam, Md. Saiful, Mahmud, Abdullah Al, Islam, Md. Rafiqul
Spam is commonly known as unsolicited or unwanted email messages on the Internet, posing a potential threat to Internet security. Users spend a valuable amount of time deleting spam emails. More importantly, ever-increasing spam emails occupy server storage space and consume network bandwidth. Keyword-based spam email filtering strategies will eventually be less successful at modeling spammer behavior, as spammers constantly change their tricks to circumvent these filters. The evasive tactics that spammers use are patterns, and these patterns can be modeled to combat spam. This paper investigates the possibilities of modeling spammer behavioral patterns with well-known classification algorithms such as the Naïve Bayesian classifier (Naïve Bayes), Decision Tree Induction (DTI) and Support Vector Machines (SVMs). Preliminary experimental results demonstrate a promising detection rate of around 92%, which is a considerable improvement in performance compared to similar spammer behavior modeling research.
Modeling Spammer Behavior: Naïve Bayes vs. Artificial Neural Networks
Islam, Md. Saiful, Khaled, Shah Mostafa, Farhan, Khalid, Rahman, Md. Abdur, Rahman, Joy
Addressing the problem of spam emails on the Internet, this paper presents a comparative study of Naïve Bayes and Artificial Neural Network (ANN) based modeling of spammer behavior. Keyword-based spam email filtering techniques fall short in modeling spammer behavior, as spammers constantly change tactics to circumvent these filters. The evasive tactics that spammers use are themselves patterns that can be modeled to combat spam. It has been observed that both Naïve Bayes and ANN are well suited for modeling common spammer patterns. Experimental results demonstrate that both of them achieve a promising detection rate of around 92%, which is a considerable improvement in performance compared to contemporary keyword-based filtering approaches.
An Empirical Study of Borda Manipulation
Davies, Jessica, Katsirelos, George, Narodytska, Nina, Walsh, Toby
We study the problem of coalitional manipulation in elections using the unweighted Borda rule. We provide empirical evidence of the manipulability of Borda elections in the form of two new greedy manipulation algorithms based on intuitions from the bin-packing and multiprocessor scheduling domains. Although we have not been able to show that these algorithms beat existing methods in the worst case, our empirical evaluation shows that they significantly outperform the existing method and are able to find optimal manipulations in the vast majority of the randomly generated elections that we tested. These empirical results provide further evidence that the Borda rule provides little defense against coalitional manipulation.
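For readers unfamiliar with the setting, the unweighted Borda rule and the effect of a manipulating coalition can be sketched as follows. This shows only the scoring rule and a hand-picked pair of manipulator votes; it is not either of the greedy algorithms from the paper, and the candidate names, votes, and function name are hypothetical.

```python
from collections import Counter

def borda_scores(votes):
    """Each vote ranks all m candidates; position i (0-based) earns m - 1 - i points."""
    scores = Counter()
    for ranking in votes:
        m = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += m - 1 - position
    return dict(scores)

# Hypothetical election with three honest voters.
honest_votes = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
honest_scores = borda_scores(honest_votes)

# Two manipulators who want "b" to win both rank "b" first and the leader "a" last.
manipulator_votes = [["b", "c", "a"], ["b", "c", "a"]]
manipulated_scores = borda_scores(honest_votes + manipulator_votes)
```

With only the honest votes, "a" leads with 5 points against 3 for "b". Once the two manipulator votes are added, "b" reaches 7 points while "a" stays at 5. The coalitional manipulation problem asks whether such a set of votes exists for a coalition of a given size.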
Efficient Spectral Feature Selection with Minimum Redundancy
Zhao, Zheng (Arizona State University) | Wang, Lei (The Australian National University) | Liu, Huan (Arizona State University)
Spectral feature selection identifies relevant features by measuring their capability of preserving sample similarity. It provides a powerful framework for both supervised and unsupervised feature selection, and has been proven to be effective in many real-world applications. One common drawback associated with most existing spectral feature selection algorithms is that they evaluate features individually and cannot identify redundant features. Since redundant features can have a significant adverse effect on learning performance, it is necessary to address this limitation for spectral feature selection. To this end, we propose a novel spectral feature selection algorithm to handle feature redundancy, adopting an embedded model. The algorithm is derived from a formulation based on a sparse multi-output regression with an L2,1-norm constraint. We conduct theoretical analysis on the properties of its optimal solutions, paving the way for designing an efficient path-following solver. Extensive experiments show that the proposed algorithm can do well in both selecting relevant features and removing redundancy.
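To make the role of the L2,1 norm concrete, here is a minimal sketch of how it is commonly defined for a weight matrix W with one row per feature and one column per output: the sum of the Euclidean norms of the rows. The row-per-feature layout, the example matrix, and the function names are assumptions for illustration; the sketch does not reproduce the paper's solver.

```python
import math

def row_norm(row):
    """Euclidean (L2) norm of one row of the weight matrix."""
    return math.sqrt(sum(x * x for x in row))

def l21_norm(W):
    """L2,1 norm: sum over rows of the L2 norm of each row."""
    return sum(row_norm(row) for row in W)

def selected_features(W, tol=1e-12):
    """Indices of features whose entire weight row is non-zero."""
    return [i for i, row in enumerate(W) if row_norm(row) > tol]

# Hypothetical weights for 3 features and 2 regression outputs.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [0.0, 5.0]]

norm_value = l21_norm(W)
kept = selected_features(W)
```

Because the norm sums whole-row magnitudes, constraining it tends to drive entire rows to zero. A feature is therefore either used for all outputs or dropped for all of them, which is what lets a regression of this kind act as a feature selector. Here the second feature has an all-zero row and is discarded.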