Information Technology
Machine Learning Approaches for Modeling Spammer Behavior
Islam, Md. Saiful, Mahmud, Abdullah Al, Islam, Md. Rafiqul
Spam is commonly defined as unsolicited or unwanted email sent over the Internet, and it poses a potential threat to Internet security. Users spend a considerable amount of time deleting spam emails. More importantly, the ever-increasing volume of spam occupies server storage space and consumes network bandwidth. Keyword-based spam filtering strategies will eventually become less successful at modeling spammer behavior, as spammers constantly change their tricks to circumvent these filters. The evasive tactics that spammers use are themselves patterns, and these patterns can be modeled to combat spam. This paper investigates the possibility of modeling spammer behavioral patterns with well-known classification algorithms such as the Naïve Bayesian classifier (Naïve Bayes), Decision Tree Induction (DTI) and Support Vector Machines (SVMs). Preliminary experimental results demonstrate a promising detection rate of around 92%, a considerable improvement over similar spammer behavior modeling research.
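To make the classification idea concrete, here is a minimal sketch of a Bernoulli Naïve Bayes classifier over binary behavioral features. The three features (obfuscated words, forged sender, image-only body), the toy training data, and the function names are hypothetical and chosen only for illustration; the abstract does not specify the paper's feature set or implementation.

```python
import math

def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model on binary feature vectors.

    Returns {class: (log_prior, [P(feature_j = 1 | class)])},
    using Laplace smoothing with pseudo-count alpha.
    """
    model = {}
    n_features = len(X[0])
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (math.log(prior), probs)
    return model

def predict(model, x):
    """Return the class with the highest log-posterior score for x."""
    best, best_score = None, -math.inf
    for c, (log_prior, probs) in model.items():
        score = log_prior
        for xj, pj in zip(x, probs):
            score += math.log(pj if xj else 1.0 - pj)
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical features: [obfuscated_words, forged_sender, image_only_body]
X = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
y = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = train_bernoulli_nb(X, y)
label_a = predict(model, [1, 1, 1])  # all evasive tactics present
label_b = predict(model, [0, 0, 0])  # none present
```

On this toy data, a message showing all three tactics is labeled spam and a message showing none is labeled ham. A real system would need a much larger labeled corpus and a feature set derived from observed spammer behavior.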
An Influence Diagram-Based Approach for Estimating Staff Training in Software Industry
Jeet, Kawal, Mago, Vijay Kumar, Prasad, Bhanu, Minhas, Rajinder Singh
The successful completion of a software development process depends on the analytical capability and foresight of the project manager. The project manager's main challenge is to manage risk factors, as they adversely influence the completion deadline. One such key risk factor is staff training. This risk can be avoided by estimating in advance the amount of training the staff require, so a procedure is needed to help the project manager make this decision. This paper presents a system that uses influence diagrams to implement the risk model to aid decision making. The system also considers the cost of conducting the training, based on various risk factors such as (i) lack of experience with project software; (ii) newly appointed staff; (iii) staff not well versed in the required quality standards; and (iv) lack of experience with the project environment. The system provides estimated requirement details for staff training at the beginning of a software development project.
Large Margin Multiclass Gaussian Classification with Differential Privacy
Pathak, Manas A., Raj, Bhiksha
As increasing amounts of sensitive personal information are aggregated into data repositories, it has become important to develop mechanisms for processing the data without revealing information about individual data instances. The differential privacy model provides a framework for the development and theoretical analysis of such mechanisms. In this paper, we propose an algorithm for learning a discriminatively trained multi-class Gaussian classifier that satisfies differential privacy using a large margin loss function with a perturbed regularization term. We present a theoretical upper bound on the excess risk of the classifier introduced by the perturbation.
Collusion Detection in Online Bridge
Yan, Jeff (Newcastle University)
Collusion is a major unsolved security problem in online bridge: by illicitly exchanging card information over the telephone, instant messenger or the like, cheaters can gain huge advantages over honest players. It is very hard if not impossible to prevent collusion from happening. Instead, we motivate an AI-based detection approach and discuss its challenges. We challenge the AI community to create automated methods for detecting collusive traces left in game records with an accuracy that can be achieved by human masters.
Predicting the Importance of Newsfeed Posts and Social Network Friends
Paek, Tim (Microsoft Research) | Gamon, Michael (Microsoft Research) | Counts, Scott (Microsoft Research) | Chickering, David Maxwell (Microsoft Research) | Dhesi, Aman (Indian Institute of Technology Kanpur)
As users of social networking websites expand their network of friends, they are often flooded with newsfeed posts and status updates, most of which they consider to be "unimportant" and not newsworthy. In order to better understand how people judge the importance of their newsfeed, we conducted a study in which Facebook users were asked to rate the importance of their newsfeed posts as well as their friends. We learned classifiers of newsfeed and friend importance to identify predictive sets of features related to social media properties, the message text, and shared background information. For classifying friend importance, the best performing model achieved 85% accuracy and 25% error reduction. By leveraging this model for classifying newsfeed posts, the best newsfeed classifier achieved 64% accuracy and 27% error reduction.
Surveillance of Parimutuel Wagering Integrity Using Expert Systems and Machine Learning
Freedman, Roy Stuart (Inductive Solutions, Inc.) | Sobkowski, Isidore (Advanced Monitoring Systems, Inc.)
Parimutuel wagering is a significant source of revenue for many state governments. MonitorPlus is a surveillance system for parimutuel operators and regulators. Using industry expertise and best practices, MonitorPlus examines each and every wager and account transaction for evidence of fraud, crime, and money laundering. Alerts are generated in real time. In forensic discovery mode, MonitorPlus is designed to collaborate with skilled analysts to discover more complex suspicious wagering patterns. MonitorPlus utilizes machine learning, so its risk profiles are current: its knowledge base improves with time. Each alert is accompanied by an automatically generated, rule-based explanation. This is critically important if an event rises to the level where legal action is required. Our development and deployment strategy is based on a new paradigm of a secure surveillance utility, where real-time alerts and data-intensive forensics support multiple regulatory jurisdictions. We believe this surveillance paradigm can be applied to other application domains such as lotteries, casinos, online gaming, and financial services.
UserRec: A User Recommendation Framework in Social Tagging Systems
Zhou, Tom Chao (The Chinese University of Hong Kong) | Ma, Hao (The Chinese University of Hong Kong) | Lyu, Michael R. (The Chinese University of Hong Kong) | King, Irwin (The Chinese University of Hong Kong)
Social tagging systems have emerged as an effective way for users to annotate and share objects on the Web. However, with the growth of social tagging systems, users are easily overwhelmed by the large amount of data, and it is very difficult for them to dig out the information they are interested in. Though tagging systems provide interest-based social network features that let a user keep track of other users' tagging activities, there is still no automatic and effective way for a user to discover other users with common interests. In this paper, we propose a User Recommendation (UserRec) framework for user interest modeling and interest-based user recommendation, aiming to boost information sharing among users with similar interests. Our work brings three major contributions to the research community: (1) we propose a tag-graph based community detection method to model users' personal interests, which are further represented by discrete topic distributions; (2) the similarity values between users' topic distributions are measured by Kullback-Leibler divergence (KL-divergence), and the similarity values are further used to perform interest-based user recommendation; and (3) by analyzing users' roles in a tagging system, we find that users' roles in a tagging system are similar to those of Web pages on the Internet. Experiments on a tagging dataset of Web pages (Yahoo! Delicious) show that UserRec outperforms other state-of-the-art recommender system approaches.
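As a rough illustration of contribution (2), the sketch below compares users' discrete topic distributions with KL-divergence and ranks candidate users by increasing divergence from a target user. The function names, the smoothing constant `eps`, and the example distributions are assumptions made for this example; the abstract does not say whether the paper symmetrizes or otherwise transforms the divergence.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same topics.

    Zero entries of q are floored at eps (an assumed smoothing choice)
    to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            total += pi * math.log(pi / max(qi, eps))
    return total

def recommend(target, candidates, k=2):
    """Return the k candidate names with the smallest KL(target || candidate)."""
    ranked = sorted(
        candidates,
        key=lambda name: kl_divergence(target, candidates[name]),
    )
    return ranked[:k]

# Hypothetical topic distributions over three topics.
alice = [0.7, 0.2, 0.1]
others = {
    "bob": [0.6, 0.3, 0.1],
    "carol": [0.1, 0.2, 0.7],
    "dave": [0.3, 0.4, 0.3],
}

top = recommend(alice, others, k=2)  # ['bob', 'dave']
```

A smaller divergence means more similar interests, so `bob`, whose distribution is closest to `alice`'s, is ranked first. KL-divergence is asymmetric, so swapping the arguments can change the ranking.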
Trust Models and Con-Man Agents: From Mathematical to Empirical Analysis
Salehi-Abari, Amirali (Carleton University) | White, Tony (Carleton University)
Recent work has demonstrated that several trust and reputation models can be exploited by malicious agents with cyclical behaviour. In each cycle, the malicious agent with cyclical behaviour first regains a high trust value after a number of cooperations and then abuses its gained trust by engaging in a bad transaction. Using a game theoretic formulation, Salehi-Abari and White have proposed the AER model that is resistant to exploitation by cyclical behaviour. Their simulation results imply that FIRE, Regret, and a model due to Yu and Singh, can always be exploited with an appropriate value for the period of cyclical behaviour. Furthermore, their results demonstrate that this is not so for the proposed adaptive scheme. This paper provides a mathematical analysis of the properties of five trust models when faced with cyclical behaviour of malicious agents. Three main results are proven. First, malicious agents can always select a cycle period that allows them to exploit the four models of FIRE, Regret, Probabilistic models, and Yu and Singh indefinitely. Second, malicious agents cannot select a single, finite cycle period that allows them to exploit the AER model forever. Finally, the number of cooperations required to achieve a given trust value increases monotonically with each cycle. In addition to the mathematical analysis, this paper empirically shows how malicious agents can use the theorems proven in this paper to mount efficient attacks on trust models.
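The cyclical attack described above can be sketched with a toy trust model. The exponentially weighted update below is a stand-in chosen for brevity; it is not FIRE, Regret, the Yu and Singh model, or AER, and the parameters `alpha`, `threshold`, and the initial trust value are arbitrary assumptions. The sketch counts how many cooperations a con-man agent needs in each cycle to regain the trust threshold before defecting again.

```python
def cooperations_per_cycle(cycles=5, alpha=0.2, threshold=0.8, trust=0.5):
    """Simulate a con-man agent against a toy exponentially weighted trust model.

    In each cycle the agent cooperates (outcome 1.0) until its trust value
    reaches the threshold, then defects once (outcome 0.0).
    Returns the number of cooperations needed in each cycle.
    """
    counts = []
    for _ in range(cycles):
        n = 0
        while trust < threshold:
            trust = (1 - alpha) * trust + alpha * 1.0  # cooperate
            n += 1
        counts.append(n)
        trust = (1 - alpha) * trust + alpha * 0.0  # abuse the gained trust
    return counts

counts = cooperations_per_cycle()  # [5, 3, 3, 3, 3]
```

With this memoryless update rule the cost of regaining trust settles at a constant three cooperations per cycle, so the agent can repeat the attack indefinitely. A model in which that cost grows with every cycle, as the abstract states for AER, would not show this plateau.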
Automated Channel Abstraction for Advertising Auctions
Walsh, William E. (CombineNet) | Boutilier, Craig (University of Toronto) | Sandholm, Tuomas (Carnegie Mellon University) | Shields, Rob (CombineNet) | Nemhauser, George (Georgia Institute of Technology) | Parkes, David C. (Harvard University)
The use of simple auction mechanisms like the GSP in online advertising can lead to significant loss of efficiency and revenue when advertisers have rich preferences — even simple forms of expressiveness like budget constraints can lead to suboptimal outcomes. While the optimal allocation of inventory can provide greater efficiency and revenue, natural formulations of the underlying optimization problems grow exponentially in the number of features of interest, presenting a key practical challenge. To address this problem, we propose a means for automatically partitioning inventory into abstract channels so that the least relevant features are ignored. Our approach, based on LP/MIP column and constraint generation, dramatically reduces the size of the problem, thus rendering optimization computationally feasible at practical scales. Our algorithms allow for principled tradeoffs between tractability and solution quality. Numerical experiments demonstrate the computational practicality of our approach as well as the quality of the resulting abstractions.
Visual Contextual Advertising: Bringing Textual Advertisements to Images
Chen, Yuqiang (Shanghai Jiao Tong University) | Jin, Ou (Shanghai Jiao Tong University) | Xue, Gui-Rong (Shanghai Jiao Tong University) | Chen, Jia (Shanghai Jiao Tong University) | Yang, Qiang (Hong Kong University of Science and Technology)
Advertising in the case of textual Web pages has been studied extensively by many researchers. However, with the increasing amount of multimedia data such as images, audio and video on the Web, the need for recommending advertisements for multimedia data is becoming a reality. In this paper, we address the novel problem of visual contextual advertising, which is to directly advertise when users are viewing images that do not have any surrounding text. A key challenge of visual contextual advertising is that images and advertisements are usually represented in image space and word space respectively, which are inherently quite different from each other. As a result, existing methods for Web page advertising are inapplicable, since they represent both Web pages and advertisements in the same word space. To solve the problem, we propose to exploit the social Web to link these two feature spaces together. In particular, we present a unified generative model to integrate advertisements, words and images. Specifically, our solution combines two parts in a principled approach: first, we transform images from an image feature space to a word space, utilizing the knowledge from annotated images on the social Web. Then, a language model based approach is applied to estimate the relevance between transformed images and advertisements. Moreover, in this model, the probability of recommending an advertisement can be inferred efficiently given an image, which enables potential applications to online advertising.