Rule-Based Reasoning
Using Association Rules for Better Treatment of Missing Values
Bashir, Shariq | Razzaq, Saad | Maqbool, Umer | Tahir, Sonya | Baig, Abdul Rauf
The quality of training data for knowledge discovery in databases (KDD) and data mining depends on many factors, and the handling of missing values is considered crucial to overall data quality. Real-world datasets today contain missing values because of human and operational error, hardware malfunction, and many other factors. The quality of the extracted knowledge, and of the resulting learning and decision making, depends directly on the quality of the training data. Given the importance of handling missing values in KDD and data mining tasks, in this paper we propose a novel Hybrid Missing values Imputation Technique (HMiT) that combines association rule mining with the k-nearest neighbor approach. To check the effectiveness of HMiT, we also present detailed experimental results on real-world datasets. Our results suggest that HMiT is not only more accurate but also takes less processing time than the current best missing values imputation technique based on the k-nearest neighbor approach.
Introducing Partial Matching Approach in Association Rules for Better Treatment of Missing Values
Bashir, Shariq | Razzaq, Saad | Maqbool, Umer | Tahir, Sonya | Baig, Abdul Rauf
Handling missing values in training datasets, whether for constructing learning models or for extracting useful information, is considered an important research task in data mining and knowledge discovery in databases. In recent years, many techniques have been proposed for imputing missing values by considering the relationships between the attributes of the observation with the missing value and those of the other observations in the training dataset. The main deficiency of such techniques is that they depend on a single approach and do not combine multiple approaches, which is why they are less accurate. To improve the accuracy of missing values imputation, in this paper we introduce a novel partial matching concept in association rule mining, which shows better results than the full matching concept described in our previous work. Our imputation technique combines the partial matching concept in association rules with the k-nearest neighbor approach. Because it is a hybrid technique, its accuracy is much better than that of techniques that depend on a single approach. To check the efficiency of our technique, we also provide detailed experimental results on a number of benchmark datasets, which show better results than previous approaches.
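As a rough illustration of the idea in the abstract above, the following Python sketch imputes a missing categorical value from association rules and falls back to a k-nearest neighbor vote when no rule applies. It is not the authors' algorithm: the abstract does not define partial matching precisely, so the sketch assumes it means accepting a rule when at least a fraction `min_match` of its antecedent items agree with the record (full matching is `min_match=1.0`). The rule format, the scoring by match fraction times confidence, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical rule format: (antecedent, target attribute, imputed value, confidence).
RULES = [
    ({"outlook": "sunny", "humidity": "high"}, "play", "no", 0.9),
    ({"outlook": "overcast"}, "play", "yes", 0.95),
]


def match_fraction(antecedent, record):
    """Fraction of antecedent items that the record satisfies."""
    hits = sum(1 for attr, val in antecedent.items() if record.get(attr) == val)
    return hits / len(antecedent)


def impute_by_rules(record, target, rules, min_match=1.0):
    """Value of the best-scoring rule whose match fraction reaches min_match, else None.

    Assumption: score = match fraction * rule confidence.
    """
    best_value, best_score = None, 0.0
    for antecedent, attr, value, confidence in rules:
        if attr != target:
            continue
        fraction = match_fraction(antecedent, record)
        score = fraction * confidence
        if fraction >= min_match and score > best_score:
            best_value, best_score = value, score
    return best_value


def impute_by_knn(record, target, training, k=3):
    """Majority vote over the k training records sharing the most known attribute values."""
    known = [a for a, v in record.items() if a != target and v is not None]

    def overlap(other):
        return sum(1 for a in known if other.get(a) == record[a])

    candidates = [r for r in training if r.get(target) is not None]
    nearest = sorted(candidates, key=overlap, reverse=True)[:k]
    if not nearest:
        return None
    return Counter(r[target] for r in nearest).most_common(1)[0][0]


def impute(record, target, rules, training, min_match=1.0, k=3):
    """Try the rules first; fall back to k-nearest neighbors if no rule applies."""
    value = impute_by_rules(record, target, rules, min_match)
    if value is None:
        value = impute_by_knn(record, target, training, k)
    return value


TRAINING = [
    {"outlook": "sunny", "humidity": "normal", "play": "yes"},
    {"outlook": "sunny", "humidity": "normal", "play": "yes"},
    {"outlook": "rainy", "humidity": "high", "play": "no"},
]
RECORD = {"outlook": "sunny", "humidity": "normal", "play": None}

full = impute(RECORD, "play", RULES, TRAINING, min_match=1.0, k=2)
partial = impute(RECORD, "play", RULES, TRAINING, min_match=0.5, k=2)
```

With full matching, no rule covers the example record, so the value comes from the neighbor vote; with `min_match=0.5`, the first rule applies because one of its two antecedent items agrees with the record. The toy data only shows the control flow and says nothing about the accuracy reported in the paper.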
Identification of Pleonastic It Using the Web
Li, Y. | Musilek, P. | Reformat, M. | Wyard-Scott, L.
In a significant minority of cases, certain pronouns, especially the pronoun it, can be used without referring to any specific entity. This phenomenon of pleonastic pronoun usage poses serious problems for systems aiming at even a shallow understanding of natural language texts. In this paper, a novel approach is proposed to identify such uses of it: the extrapositional cases are identified using a series of queries against the web, and the cleft cases are identified using a simple set of syntactic rules. The system is evaluated with four sets of news articles containing 679 extrapositional cases as well as 78 cleft constructs. The identification results are comparable to those obtained by human efforts.
Behavior Bounding: An Efficient Method for High-Level Behavior Comparison
In this paper, we explore methods for comparing agent behavior with human behavior to assist with validation. Our exploration begins by considering a simple method of behavior comparison. Motivated by shortcomings in this initial approach, we introduce behavior bounding, an automated model-based approach for comparing behavior that is inspired, in part, by Mitchell's Version Spaces. We show that behavior bounding can be used to compactly represent both human and agent behavior. We argue that relatively low amounts of human effort are required to build, maintain, and use the data structures that underlie behavior bounding, and we provide a theoretical basis for these arguments using notions of PAC Learnability. Next, we show empirical results indicating that this approach is effective at identifying differences in certain types of behaviors and that it performs well when compared against our initial benchmark methods. Finally, we demonstrate that behavior bounding can produce information that allows developers to identify and fix problems in an agent's behavior much more efficiently than standard debugging techniques.
Logical Algorithms meets CHR: A meta-complexity result for Constraint Handling Rules with rule priorities
This paper investigates the relationship between the Logical Algorithms language (LA) of Ganzinger and McAllester and Constraint Handling Rules (CHR). We present a translation schema from LA to CHR-rp (CHR with rule priorities) and show that the meta-complexity theorem for LA can be applied to a subset of CHR-rp via inverse translation. Inspired by the high-level implementation proposal for Logical Algorithms by Ganzinger and McAllester, and based on a new scheduling algorithm, we propose an alternative implementation for CHR-rp that gives strong complexity guarantees and results in a new and accurate meta-complexity theorem for CHR-rp. It is furthermore shown that the translation from Logical Algorithms to CHR-rp, combined with the new CHR-rp implementation, satisfies the complexity required for the Logical Algorithms meta-complexity result to hold.
AAAI 2008 Spring Symposia Reports
Balduccini, Marcello (Eastman Kodak Company) | Baral, Chitta (Arizona State University) | Brodaric, Boyan (Geological Survey of Canada) | Colton, Simon (Imperial College, London) | Fox, Peter (National Center for Atmospheric Research) | Gutelius, David (SRI International) | Hinkelmann, Knut (University of Applied Sciences Northwestern Switzerland) | Horswill, Ian (Northwestern University) | Huberman, Bernardo (HP Labs) | Hudlicka, Eva (Psychometrix Associates) | Lerman, Kristina (USC Information Sciences Institute) | Lisetti, Christine (Florida International University) | McGuinness, Deborah L. (Rensselaer Polytechnic Institute) | Maher, Mary Lou (National Science Foundation) | Musen, Mark A. (Stanford University) | Sahami, Mehran (Stanford University) | Sleeman, Derek (University of Aberdeen) | Thönssen, Barbara (University of Applied Sciences Northwestern Switzerland) | Velasquez, Juan D. (MIT CSAIL) | Ventura, Dan (Brigham Young University)
The titles of the eight symposia were as follows: (1) AI Meets Business Rules and Process Management, (2) Architectures for Intelligent Theory-Based Agents, (3) Creative Intelligent Systems, (4) Emotion, Personality, and Social Behavior, (5) Semantic Scientific Knowledge Integration, (6) Social Information Processing, (7) Symbiotic Relationships between Semantic Web and Knowledge Engineering, and (8) Using AI to Motivate Greater Participation in Computer Science. The goal of the AI Meets Business Rules and Process Management symposium was to investigate the various approaches and standards for representing business rules, business process management, and the semantic web with respect to expressiveness and reasoning capabilities. The Semantic Scientific Knowledge Integration symposium aimed to bring together the semantic technologies community and the scientific information technology community in an effort to build a general semantic science information community. The goal of the Social Information Processing symposium was to investigate computational and analytic approaches that will enable users to harness the efforts of large numbers of other users to solve a variety of information processing problems, from discovering high-quality content to managing common resources. The purpose of the Using AI to Motivate Greater Participation in Computer Science symposium was to identify ways that topics in AI may be used to motivate greater student participation in computer science by highlighting fun, engaging, and intellectually challenging developments in AI-related curriculum at a number of educational levels.
Reconstructing True Wrong Inductions
Ganascia, Jean-Gabriel G (University Pierre and Marie Curie)
There have been many erroneous pre-scientific and common sense inductions. We want to understand why people believe in wrong theories. Our hypothesis is that mistaken inductions are due not only to a lack of facts, but also to poor descriptions of existing facts and to implicit knowledge that is transmitted socially. This paper presents several experiments that aim to validate this hypothesis by using machine learning and data mining techniques to simulate the way people build erroneous theories from observations.
Custom DU: A Web-Based Business User-Driven Automated Underwriting System
Krovvidy, Srinivas (Fannie Mae)
Custom DU is an automated underwriting system that enables mortgage lenders to build their own business rules that facilitate assessing borrower eligibility for different mortgage products. By means of the user interface, lenders can also customize their underwriting findings reports, test the rules that they have defined, and publish changes to business rules on a real-time basis, all without any software modifications. The user interface enforces structure and consistency, enabling business users to focus on their underwriting guidelines when converting their business policy to rules. Using Custom DU, lenders can create different rule sets for their products and assign them to different channels of the business, allowing for centralized control of underwriting policies and procedures--even if lenders have decentralized operations.
Introduction to the Special Issue on Innovative Applications of Artificial Intelligence
Cheetham, William (General Electric Global Research Center) | Goker, Mehmet H. (PricewaterhouseCoopers)
In this editorial we introduce the articles published in this special AI Magazine issue on innovative applications of artificial intelligence. Discussed are a pick-pack-and-ship warehouse-management system, a neural network in the fishing industry, the use of AI to help mobile phone users, building business rules in the mortgage lending business, automating the processing of immigration forms, and the use of the semantic web to provide access to observational datasets.
Custom DU: A Web-Based Business User-Driven Automated Underwriting System
Krovvidy, Srinivas (Fannie Mae)
Custom DU is an automated underwriting system that enables mortgage lenders to build their own business rules that facilitate assessing borrower eligibility for different mortgage products. Developed by Fannie Mae, Custom DU has been used since 2004 by several lenders to automate the underwriting of numerous mortgage products. Custom DU uses rule specification language techniques and a web-based, user-friendly interface for implementing business rules that represent business policy. By means of the user interface, lenders can also customize their underwriting findings reports, test the rules that they have defined, and publish changes to business rules on a real-time basis, all without any software modifications. The user interface enforces structure and consistency, enabling business users to focus on their underwriting guidelines when converting their business policy to rules. Once lenders have created their rules, loans are routed to the appropriate rule sets, and customized, but consistent, results are always returned to the lender. Using Custom DU, lenders can create different rule sets for their products and assign them to different channels of the business, allowing for centralized control of underwriting policies and procedures—even if lenders have decentralized operations.