Rule-Based Reasoning
Rule-based query answering method for a knowledge base of economic crimes
We describe a PhD thesis that proposes a rule-based query answering method for relational data. The approach uses additional knowledge, represented as a set of rules, that describes the source data at the conceptual (ontological) level. Queries are posed in terms of this abstract level. We present two methods: the first uses hybrid reasoning and the second exploits only forward chaining. Both methods are demonstrated by a prototype implementation of the system coupled with the Jess engine. Tests are performed on a knowledge base of selected economic crimes: fraudulent disbursement and money laundering.
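As a rough illustration of the forward-chaining strategy mentioned in this abstract, the following Python sketch applies ground rules to a fact set until nothing new can be derived. It is not the thesis's method and does not use Jess; the facts, rules, and predicate names (`transfer`, `no_invoice`, `suspicious`, `possible_laundering`) are hypothetical.

```python
def forward_chain(facts, rules):
    """Naive forward chaining over ground facts.

    facts: iterable of hashable ground facts (tuples here).
    rules: list of (premises, conclusion) pairs; a rule fires when
           every premise is already in the derived fact set.
    """
    derived = set(facts)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return derived


# Hypothetical relational facts (rows) and concept-level rules.
facts = {
    ("transfer", "A", "B"),
    ("transfer", "B", "C"),
    ("no_invoice", "A", "B"),
}
rules = [
    ([("transfer", "A", "B"), ("no_invoice", "A", "B")],
     ("suspicious", "A", "B")),
    ([("suspicious", "A", "B"), ("transfer", "B", "C")],
     ("possible_laundering", "A", "C")),
]

closure = forward_chain(facts, rules)
```

A query posed at the abstract level, such as "is there a possible laundering chain from A to C?", then reduces to a membership test on `closure`. A production engine would use variables and pattern matching rather than ground rules.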
TALplanner in IPC-2002: Extensions and Control Rules
TALplanner is a forward-chaining planner that relies on domain knowledge in the shape of temporal logic formulas in order to prune irrelevant parts of the search space. TALplanner recently participated in the third International Planning Competition, which had a clear emphasis on increasing the complexity of the problem domains being used as benchmark tests and the expressivity required to represent these domains in a planning system. Like many other planners, TALplanner had support for some but not all aspects of this increase in expressivity, and a number of changes to the planner were required. After a short introduction to TALplanner, this article describes some of the changes that were made before and during the competition. We also describe the process of introducing suitable domain knowledge for several of the competition domains.
Expert-Guided Subgroup Discovery: Methodology and Application
This paper presents an approach to expert-guided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized definition of rule quality which is analyzed in detail. The other important steps of the proposed subgroup discovery process are the detection of statistically significant properties of selected subgroups and subgroup visualization: statistically significant properties are used to enrich the descriptions of induced subgroups, while the visualization shows subgroup properties in the form of distributions of the numbers of examples in the subgroups. The approach is illustrated by the results obtained for a medical problem of early detection of patient risk groups.
Studying Properties of Czech Complex Sentences from an Annotated Corpus
Kubon, Vladislav (Charles University in Prague) | Lopatkova, Marketa (Charles University in Prague)
The paper deals with the analysis of complex sentences in Czech on the basis of manually annotated data. The availability of a specialized corpus explicitly describing mutual relationships between segments and clauses in Czech complex sentences, together with the availability of a thoroughly syntactically annotated corpus, the Prague Dependency Treebank, provides a solid background for linguistic investigation. The paper presents quantitative, linguistic and structural observations which provide a number of clues for building, in the future, an algorithm for analyzing the structure of complex sentences.
Hybrid Approach Combining Machine Learning and a Rule-Based Expert System for Text Categorization
Villena-Román, Julio (Universidad Carlos III de Madrid) | Collada-Pérez, Sonia (Daedalus - Data, Decisions and Language, S.A.) | Lana-Serrano, Sara (Universidad Politécnica de Madrid) | González-Cristóbal, José Carlos (Universidad Politécnica de Madrid)
This paper discusses a novel hybrid approach for text categorization that combines a machine learning algorithm, which provides a base model trained with a labeled corpus, with a rule-based expert system, which is used to improve the results provided by the previous classifier, by filtering false positives and dealing with false negatives. The main advantage is that the system can be easily fine-tuned by adding specific rules for those noisy or conflicting categories that have not been successfully trained. We also describe an implementation based on k-Nearest Neighbor and a simple rule language to express lists of positive, negative and relevant (multiword) terms appearing in the input text. The system is evaluated in several scenarios, including the popular Reuters-21578 news corpus for comparison to other approaches, and categorization using IPTC metadata, EUROVOC thesaurus and others. Results show that this approach achieves a precision that is comparable to top ranked methods, with the added value that it does not require a demanding human expert workload to train.
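The combination described in this abstract can be sketched in Python as a k-Nearest Neighbor base classifier followed by a rule pass over positive and negative terms. This is a toy illustration under stated assumptions, not the authors' implementation: the rule format (a dictionary of `positive` and `negative` term lists per category), the bag-of-words cosine similarity, and the tiny corpus are all hypothetical.

```python
import math
from collections import Counter


def bag(text):
    """Bag-of-words term counts for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def knn_label(text, corpus, k=3):
    """Majority label among the k labeled documents most similar to text."""
    query = bag(text)
    ranked = sorted(corpus, key=lambda doc: cosine(query, bag(doc[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def apply_rules(text, categories, rules):
    """Post-process base-model output with (hypothetical) term rules.

    A negative term removes a category (filters a false positive);
    otherwise a positive term adds it (recovers a false negative).
    """
    text = text.lower()
    result = set(categories)
    for category, rule in rules.items():
        if any(term in text for term in rule.get("negative", [])):
            result.discard(category)
        elif any(term in text for term in rule.get("positive", [])):
            result.add(category)
    return result


corpus = [
    ("the team won the match", "sports"),
    ("the striker scored a goal", "sports"),
    ("shares fell on the stock market", "finance"),
    ("bank raised interest rates", "finance"),
]
rules = {
    "sports": {"negative": ["stock market"]},
    "finance": {"positive": ["stock market"]},
}

text = "the team won on the stock market"
base = knn_label(text, corpus)               # base model votes "sports"
final = apply_rules(text, {base}, rules)     # rules correct it to {"finance"}
```

The point of the design is that a misbehaving category can be corrected by editing its term lists, without retraining the base model.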
Contextual hypotheses and semantics of logic programs
Logic programming has developed as a rich field, built over a logical substratum whose main constituent is a nonclassical form of negation, sometimes coexisting with classical negation. The field has seen the advent of a number of alternative semantics, with Kripke-Kleene semantics, the well-founded semantics, the stable model semantics, and the answer-set semantics standing out as the most successful. We show that all aforementioned semantics are particular cases of a generic semantics, in a framework where classical negation is the unique form of negation and where the literals in the bodies of the rules can be 'marked' to indicate that they can be the targets of hypotheses. A particular semantics then amounts to choosing a particular marking scheme and choosing a particular set of hypotheses. When a literal belongs to the chosen set of hypotheses, all marked occurrences of that literal in the body of a rule are assumed to be true, whereas the occurrences of that literal that have not been marked in the body of the rule are to be derived in order to contribute to the firing of the rule. Hence the notion of hypothetical reasoning that is presented in this framework is not based on making global assumptions, but more subtly on making local, contextual assumptions, taking effect as indicated by the chosen marking scheme on the basis of the chosen set of hypotheses. Our approach offers a unified view on the various semantics proposed in logic programming, classical in that only classical negation is used, and links the semantics of logic programs to mechanisms that endow rule-based systems with the power to harness hypothetical reasoning.
Closed-set-based Discovery of Bases of Association Rules
Balcázar, José L. | García-Saiz, Diego | Gómez-Pérez, Domingo | Tîrnăucă, Cristina
The output of an association rule miner is often huge in practice. This is why several concise lossless representations have been proposed, such as the "essential" or "representative" rules. We revisit the algorithm given by Kryszkiewicz (Int. Symp. Intelligent Data Analysis 2001, Springer-Verlag LNCS 2189, 350-359) for mining representative rules. We show that its output is sometimes incomplete, due to an oversight in its mathematical validation. We propose alternative complete generators and we extend the approach to an existing closure-aware basis similar to, and often smaller than, the representative rules, namely the basis B*.
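For readers unfamiliar with the closure-based notions this abstract relies on, the following Python sketch computes the support, confidence, and closure of itemsets over a small transaction database. It illustrates the standard definitions only; it is neither Kryszkiewicz's algorithm nor a construction of the basis B*, and the transactions are made up.

```python
def cover(itemset, transactions):
    """Transactions that contain every item of itemset."""
    items = frozenset(itemset)
    return [t for t in transactions if items <= t]


def support(itemset, transactions):
    """Absolute support: number of transactions containing itemset."""
    return len(cover(itemset, transactions))


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    joint = frozenset(antecedent) | frozenset(consequent)
    return support(joint, transactions) / support(antecedent, transactions)


def closure(itemset, transactions):
    """Largest itemset shared by all transactions that contain itemset.

    Simplifying convention (an assumption of this sketch): an itemset
    that occurs in no transaction is returned unchanged.
    """
    covering = cover(itemset, transactions)
    if not covering:
        return frozenset(itemset)
    return frozenset.intersection(*covering)


# Made-up transaction database.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "d"}),
    frozenset({"c", "d"}),
]

closed_a = closure({"a"}, transactions)            # {"a", "b"}
conf_ab = confidence({"a"}, {"b"}, transactions)   # 3 / 3
```

Here every transaction containing `a` also contains `b`, so `{"a"}` is not closed and the rule `a -> b` holds with confidence 1. Redundancy of this kind is what concise rule bases aim to factor out.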
Combining Uncertainty and Description Logic Rule-Based Reasoning in Situation-Aware Robots
Krieger, Hans-Ulrich (DFKI GmbH, German Research Center For Artificial Intelligence) | Kruijff, Geert-Jan M. (DFKI GmbH, German Research Center For Artificial Intelligence)
The paper addresses how a robot can maintain a state representation of all that it knows about the environment over time and space, given its observations and its domain knowledge. The advantage in combining domain knowledge and observations is that the robot can in this way project from the past into the future, and reason from observations to more general statements to help guide how it plans to act and interact. The difficulty lies in the fact that observations are typically uncertain and logical inference for completion against a knowledge base is computationally hard.
On the Collaborative Formalization of Agile Semantics Using Social Network Applications
Fill, Hans-Georg (Stanford University) | Tudorache, Tania (Stanford University)
In this position paper, we investigate the opportunities of using functionalities provided by social network sites for the collaborative formalization of semantics in the domain of health. In particular, we identify advantages in communication support, economic benefits, and technical opportunities. The implementation of the functionalities is illustrated by describing a use case from an ongoing project with the World Health Organization.
SBVR Business Rules Generation from Natural Language Specification
Bajwa, Imran Sarwar (University of Birmingham) | Lee, Mark G. (University of Birmingham) | Bordbar, Behzad (University of Birmingham)
In this paper, we present a novel approach to translating natural language specifications into SBVR business rules. Business rules constrain the structure or control the behaviour of a business process. In modern business modelling, writing business rules is one of the important phases. Typically, a business rule analyst has to write hundreds of business rules manually in a natural language (NL) and then manually translate the NL specification of each rule into a particular rule language, such as SBVR or OCL, as required. However, manually translating an NL rule specification into a formal representation such as an SBVR rule is not only difficult, complex and time consuming but can also result in erroneous business rules. We propose an approach that automatically translates the NL (such as English) specification of business rules into SBVR (Semantic Business Vocabulary and Rules) rules. The major challenge in NL to SBVR translation is the complex semantic analysis of English. We use a rule-based algorithm for robust semantic analysis of English and to generate SBVR rules. Automated generation of SBVR-based business rules can help constrain business aspects more effectively and efficiently in typical business modelling.