Not So Naive Online Bayesian Spam Filter
Su, Baojun (Zhejiang University) | Xu, Congfu (Zhejiang University)
Spam filtering, a key problem in electronic communication, has drawn significant attention because of the growing volume of junk email on the Internet. Content-based filtering is one reliable method of combating spammers' changing tactics. Naive Bayes (NB) is one of the earliest content-based machine learning methods used against spam, in both theory and practice; it is easy to implement and can achieve considerable accuracy. In this paper, the traditional online Bayesian classifier is enhanced in two ways. First, from a theoretical point of view, we devise a self-adaptive mechanism that gradually weakens the independence assumption required by the original NB during online training, so that our NSNB is no longer "naive". Second, we propose further engineering techniques to make the classifier more robust and accurate. The experimental results show that our NSNB gives state-of-the-art classification performance for online spam filtering on large benchmark data sets, while being extremely fast and using little memory in comparison with other statistical methods.
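For orientation, the baseline that the abstract starts from can be sketched as follows. This is a minimal sketch of a plain online naive Bayes filter with Laplace smoothing, not the authors' NSNB: the self-adaptive weakening of the independence assumption is not reproduced here, and the class name, method names, and smoothing value are assumptions made for illustration.

```python
import math
from collections import defaultdict


class OnlineNaiveBayes:
    """Plain multinomial naive Bayes, updated one message at a time.

    Hypothetical baseline for illustration only; it keeps the independence
    assumption that the NSNB abstract says it gradually weakens.
    """

    LABELS = ("spam", "ham")

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing  # Laplace smoothing constant (assumed value)
        self.token_counts = {label: defaultdict(int) for label in self.LABELS}
        self.total_tokens = {label: 0 for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}
        self.vocabulary = set()

    def train(self, tokens, label):
        """Update the counts with a single labelled message."""
        self.message_counts[label] += 1
        for token in tokens:
            self.token_counts[label][token] += 1
            self.total_tokens[label] += 1
            self.vocabulary.add(token)

    def _token_prob(self, token, label):
        """Smoothed estimate of P(token | label)."""
        vocab_size = max(len(self.vocabulary), 1)
        count = self.token_counts[label].get(token, 0)
        return (count + self.smoothing) / (
            self.total_tokens[label] + self.smoothing * vocab_size
        )

    def log_odds(self, tokens):
        """Return log P(spam | tokens) - log P(ham | tokens)."""
        s = self.smoothing
        score = math.log(self.message_counts["spam"] + s) - math.log(
            self.message_counts["ham"] + s
        )
        for token in tokens:
            score += math.log(self._token_prob(token, "spam"))
            score -= math.log(self._token_prob(token, "ham"))
        return score

    def classify(self, tokens):
        return "spam" if self.log_odds(tokens) > 0 else "ham"


# Online use: classify each incoming message first, then train on its true label.
nb_filter = OnlineNaiveBayes()
nb_filter.train(["cheap", "pills", "buy"], "spam")
nb_filter.train(["meeting", "tomorrow", "agenda"], "ham")
print(nb_filter.classify(["cheap", "pills"]))
```

Working in log space avoids numerical underflow on long messages, and the per-message update touches only the tokens that occur in that message, which is consistent with the speed and memory profile the abstract attributes to Bayesian filters.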
A Data-Mining Approach to 3D Realistic Render Setup Assistance
Morcillo, Carlos Gonzalez (University of Castilla-La Mancha) | Lopez, Lorenzo Manuel Lopez (University of Castilla-La Mancha) | Sanchez, Jose Jesus Castro (University of Castilla-La Mancha) | Moser, Bernhard (Software Competence Center GmbH)
Realistic rendering is the process of generating a 2D image from an abstract description of a 3D scene, aiming at achieving the quality of a photo. The quality of the generated image depends on the accuracy with which the employed render method simulates the behaviour of the light particles through the scene. In current practice, it is up to the user to choose optimal settings of the input parameters for these methods in terms of time efficiency as well as image quality. This is an iterative trial-and-error process, even for expert users. This paper describes a novel approach based on techniques from the fields of data mining and genetic computing to assist the user in the selection of render parameters. Experimental results are presented which show the benefits of this approach.
Trading Robustness for Privacy in Decentralized Recommender Systems
Cheng, Zunping (University College Dublin) | Hurley, Neil (University College Dublin)
Collaborative filtering (CF) recommender systems are very popular and successful in commercial application fields. One end-user concern is the privacy of the personal data required by such systems in order to make personalized recommendations. Recently, peer-to-peer decentralized architectures have been proposed to address this privacy issue. On the other hand, system managers must be concerned about system robustness. In particular, it has been shown that recommender systems are vulnerable to profile injection, although model-based CF algorithms show greater stability against the malicious attacks studied in the state of the art. In this paper we generalize the generic model for decentralized recommendation and discuss the trade-off between robustness and privacy. In this context, we argue that exposing knowledge of the model parameters allows new, highly effective, model-based attack strategies to be considered. We conclude that the security concerns of privacy and robustness stand in opposition to each other and are difficult to satisfy simultaneously.
Practical Attacks Against Authorship Recognition Techniques
Brennan, Michael Robert (Drexel University) | Greenstadt, Rachel (Drexel University)
The use of statistical AI techniques in authorship recognition (or stylometry) has contributed to literary and historical breakthroughs. These successes have led to the use of these techniques in criminal investigations and prosecutions. However, few have studied adversarial attacks and their devastating effect on the robustness of existing classification methods. This paper presents a framework for adversarial attacks, including obfuscation attacks, where a subject attempts to hide their identity, and imitation attacks, where a subject attempts to frame another subject by imitating their writing style. The major contribution of this research is that it demonstrates that both attacks work very well. The obfuscation attack reduces the effectiveness of the techniques to the level of random guessing, and the imitation attack succeeds with 68-91% probability depending on the stylometric technique used. These results are made more significant by the fact that the experimental subjects were unfamiliar with stylometric techniques, had no specialized knowledge of linguistics, and spent little time on the attacks. This paper also provides another significant contribution to the field by using human subjects to empirically validate the claim of high accuracy for current techniques (without attacks), reproducing results for three representative stylometric methods.
Enabling Data Quality with Lightweight Ontologies
Bidlack, Clint R. (ActivePrime Inc.)
As the volume and interconnectedness of corporate data grow, data quality is becoming a business competency essential to success. Existing methods for managing data quality do not scale up to large volumes of data in a way that is directly manageable by the owner of the data. For the past two years, a new breed of data quality products, built on applied AI techniques, has been empowering non-technical users. Over 150 businesses are benefiting from these products, including NASDAQ, Visa, Experian, Oracle, Fidelity, Bank of America, Volvo, Dell, Sabic, and Dassault Systems. The applied AI techniques described include lightweight ontologies used to efficiently find inexact textual matches in large data sets.
Modeling self-organizing traffic lights with elementary cellular automata
Gershenson, Carlos | Rosenblueth, David A.
There have been several highway traffic models proposed based on cellular automata. The simplest one is elementary cellular automaton rule 184. We extend this model to city traffic with cellular automata coupled at intersections using only rules 184, 252, and 136. The simplicity of the model offers a clear understanding of the main properties of city traffic and its phase transitions. We use the proposed model to compare two methods for coordinating traffic lights: a green-wave method that tries to optimize phases according to expected flows and a self-organizing method that adapts to the current traffic conditions. The self-organizing method delivers considerable improvements over the green-wave method. For low densities, the self-organizing method promotes the formation and coordination of platoons that flow freely in four directions, i.e. with a maximum velocity and no stops. For medium densities, the method allows a constant usage of the intersections, exploiting their maximum flux capacity. For high densities, the method prevents gridlocks and promotes the formation and coordination of "free-spaces" that flow in the opposite direction of traffic.
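The building block named in this abstract, an elementary cellular automaton, can be sketched in a few lines. The rule numbers 184, 252, and 136 come from the abstract; the function name, the ring-shaped (periodic) road, and the sample configuration are assumptions for illustration, and the way the authors couple automata at intersections is not reproduced here.

```python
def eca_step(cells, rule):
    """Apply one synchronous update of an elementary cellular automaton.

    cells: list of 0/1 values; 1 marks a cell occupied by a vehicle.
    rule:  Wolfram rule number in the range 0..255.
    The lattice is treated as a ring (periodic boundary), an assumption
    made here to keep the sketch self-contained.
    """
    n = len(cells)
    next_cells = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        # The three neighbours form a 3-bit index into the rule's binary table.
        neighbourhood = (left << 2) | (centre << 1) | right
        next_cells.append((rule >> neighbourhood) & 1)
    return next_cells


# Rule 184: a vehicle advances one cell to the right if that cell is empty.
road = [1, 1, 0, 0, 1, 0, 0, 0]
print(eca_step(road, 184))  # -> [1, 0, 1, 0, 0, 1, 0, 0]
```

Under rule 184 the number of occupied cells is conserved, which is what makes it usable as a traffic model. Reading the rule tables directly, rule 252 lets an empty cell fill from the left but never empties an occupied one, and rule 136 empties a cell unless both it and its right neighbour are occupied.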
Optimal Crops Selection using Multiobjective Evolutionary Algorithms
Brunelli, Ricardo (National University of Asuncion) | Lücken, Christian von (National University of Asuncion)
Farm managers have to deal with many conflicting objectives when planning which crop to cultivate. Soil characteristics are extremely important when determining yield potential. Fertilization and liming are commonly used to adapt soils to the nutritional requirements of the crops to be cultivated. Planting the crop that best fits the soil characteristics is an interesting alternative that minimizes the need for soil treatment, reducing costs and potential environmental damage. In addition, farmers usually look for investments that offer the greatest potential earnings with the least possible risk. Depending on the objectives considered, the crop selection problem may be difficult to solve using traditional tools. Therefore, this work proposes an approach based on Multiobjective Evolutionary Algorithms to help in the selection of an appropriate cultivation plan, considering five crop alternatives and five objectives simultaneously.
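Multiobjective evolutionary algorithms typically rank candidate solutions by Pareto dominance rather than by a single score. The sketch below shows only that comparison, assuming every objective is to be minimised; the function names and the two-objective sample plans are hypothetical and do not come from the paper, which considers five objectives.

```python
def dominates(a, b):
    """Return True if plan a Pareto-dominates plan b (all objectives minimised).

    a dominates b when it is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(plans):
    """Keep the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Hypothetical plans scored as (treatment cost, risk); lower is better for both.
plans = [(1, 5), (2, 2), (3, 3), (5, 1)]
print(pareto_front(plans))  # -> [(1, 5), (2, 2), (5, 1)]
```

The plan (3, 3) is dropped because (2, 2) is better on both objectives; the remaining three are mutually incomparable trade-offs, which is the kind of set such an algorithm presents to the decision maker.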
Tactical Language and Culture Training Systems: Using AI to Teach Foreign Languages and Cultures
Johnson, W. Lewis (Alelo) | Valente, Andre (Alelo)
The Tactical Language and Culture Training System (TLCTS) helps people quickly acquire communicative skills in foreign languages and cultures. More than 40,000 learners worldwide have used TLCTS courses. TLCTS utilizes artificial intelligence technologies during the authoring process, and at run time to process learner speech, engage in dialog, and evaluate and assess learner performance. This paper describes the architecture of TLCTS and the artificial intelligence technologies that it employs, and presents results from multiple evaluation studies that demonstrate the benefits of learning foreign language and culture using this approach.
An AI Framework to Teach English as a Foreign Language: CSIEC
Jia, Jiyou (Peking University)
CSIEC (Computer Simulation in Educational Communication) is not only an intelligent web-based human-computer dialogue system with natural language for English instruction, but also a learning assessment system for learners and teachers. Its multiple functions, including grammar-based gap-filling exercises, scenario show, free chatting, and chatting on a given topic, can satisfy the various requirements of students with different backgrounds and learning abilities. After a brief explanation of the concept of our dialogue system and a survey of related work, we illustrate the system structure and describe in detail its pedagogical functions and the underlying AI techniques, such as NLP and rule-based reasoning. We summarize the free Internet usage within a six-month period and the system's integration into English classes in universities and middle schools. The evaluation findings on the class integration show that the chatting function has been improved and is frequently used, and that applying the CSIEC system to English instruction can motivate learners to practice English and enhance their learning process. Finally, we conclude with potential improvements.