Goto

Collaborating Authors

 Country


Combining Naive Bayes and Decision Tree for Adaptive Intrusion Detection

arXiv.org Artificial Intelligence

In this paper, a new learning algorithm for adaptive network intrusion detection using naive Bayesian classifier and decision tree is presented, which performs balance detections and keeps false positives at acceptable level for different types of network attacks, and eliminates redundant attributes as well as contradictory examples from training data that make the detection model complex. The proposed algorithm also addresses some difficulties of data mining such as handling continuous attribute, dealing with missing attribute values, and reducing noise in training data. Due to the large volumes of security audit data as well as the complex and dynamic properties of intrusion behaviours, several data miningbased intrusion detection techniques have been applied to network-based traffic data and host-based data in the last decades. However, there remain various issues needed to be examined towards current intrusion detection systems (IDS). We tested the performance of our proposed algorithm with existing learning algorithms by employing on the KDD99 benchmark intrusion detection dataset. The experimental results prove that the proposed algorithm achieved high detection rates (DR) and significant reduce false positives (FP) for different types of network intrusions using limited computational resources.


An axiomatic approach to the roughness measure of rough sets

arXiv.org Artificial Intelligence

In Pawlak's rough set theory, a set is approximated by a pair of lower and upper approximations. To measure numerically the roughness of an approximation, Pawlak introduced a quantitative measure of roughness by using the ratio of the cardinalities of the lower and upper approximations. Although the roughness measure is effective, it has the drawback of not being strictly monotonic with respect to the standard ordering on partitions. Recently, some improvements have been made by taking into account the granularity of partitions. In this paper, we approach the roughness measure in an axiomatic way. After axiomatically defining roughness measure and partition measure, we provide a unified construction of roughness measure, called strong Pawlak roughness measure, and then explore the properties of this measure. We show that the improved roughness measures in the literature are special instances of our strong Pawlak roughness measure and introduce three more strong Pawlak roughness measures as well. The advantage of our axiomatic approach is that some properties of a roughness measure follow immediately as soon as the measure satisfies the relevant axiomatic definition.


Evidence Algorithm and System for Automated Deduction: A Retrospective View

arXiv.org Artificial Intelligence

A research project aimed at the development of an automated theorem proving system was started in Kiev (Ukraine) in early 1960s. The mastermind of the project, Academician V.Glushkov, baptized it "Evidence Algorithm", EA. The work on the project lasted, off and on, more than 40 years. In the framework of the project, the Russian and English versions of the System for Automated Deduction, SAD, were constructed. They may be already seen as powerful theorem-proving assistants.


Inaccuracy Minimization by Partioning Fuzzy Data Sets - Validation of Analystical Methodology

arXiv.org Artificial Intelligence

In the last two decades, a number of methods have been proposed for forecasting based on fuzzy time series. Most of the fuzzy time series methods are presented for forecasting of car road accidents. However, the forecasting accuracy rates of the existing methods are not good enough. In this paper, we compared our proposed new method of fuzzy time series forecasting with existing methods. Our method is based on means based partitioning of the historical data of car road accidents. The proposed method belongs to the kth order and time-variant methods. The proposed method can get the best forecasting accuracy rate for forecasting the car road accidents than the existing methods.


Distantly Labeling Data for Large Scale Cross-Document Coreference

arXiv.org Artificial Intelligence

Cross-document coreference, the problem of resolving entity mentions across multi-document collections, is crucial to automated knowledge base construction and data mining tasks. However, the scarcity of large labeled data sets has hindered supervised machine learning research for this task. In this paper we develop and demonstrate an approach based on ``distantly-labeling'' a data set from which we can train a discriminative cross-document coreference model. In particular we build a dataset of more than a million people mentions extracted from 3.5 years of New York Times articles, leverage Wikipedia for distant labeling with a generative model (and measure the reliability of such labeling); then we train and evaluate a conditional random field coreference model that has factors on cross-document entities as well as mention-pairs. This coreference model obtains high accuracy in resolving mentions and entities that are not present in the training data, indicating applicability to non-Wikipedia data. Given the large amount of data, our work is also an exercise demonstrating the scalability of our approach.


BnB-ADOPT: An Asynchronous Branch-and-Bound DCOP Algorithm

Journal of Artificial Intelligence Research

Distributed constraint optimization (DCOP) problems are a popular way of formulating and solving agent-coordination problems. A DCOP problem is a problem where several agents coordinate their values such that the sum of the resulting constraint costs is minimal. It is often desirable to solve DCOP problems with memory-bounded and asynchronous algorithms. We introduce Branch-and-Bound ADOPT (BnB-ADOPT), a memory-bounded asynchronous DCOP search algorithm that uses the message-passing and communication framework of ADOPT (Modi, Shen, Tambe, & Yokoo, 2005), a well known memory-bounded asynchronous DCOP search algorithm, but changes the search strategy of ADOPT from best-first search to depth-first branch-and-bound search. Our experimental results show that BnB-ADOPT finds cost-minimal solutions up to one order of magnitude faster than ADOPT for a variety of large DCOP problems and is as fast as NCBB, a memory-bounded synchronous DCOP search algorithm, for most of these DCOP problems. Additionally, it is often desirable to find bounded-error solutions for DCOP problems within a reasonable amount of time since finding cost-minimal solutions is NP-hard. The existing bounded-error approximation mechanism allows users only to specify an absolute error bound on the solution cost but a relative error bound is often more intuitive. Thus, we present two new bounded-error approximation mechanisms that allow for relative error bounds and implement them on top of BnB-ADOPT.


Change in Abstract Argumentation Frameworks: Adding an Argument

Journal of Artificial Intelligence Research

In this paper, we address the problem of change in an abstract argumentation system. We focus on a particular change: the addition of a new argument which interacts with previous arguments. We study the impact of such an addition on the outcome of the argumentation system, more particularly on the set of its extensions. Several properties for this change operation are defined by comparing the new set of extensions to the initial one, these properties are called structural when the comparisons are based on set-cardinality or set-inclusion relations. Several other properties are proposed where comparisons are based on the status of some particular arguments: the accepted arguments; these properties refer to the evolution of this status during the change, e.g., Monotony and Priority to Recency. All these properties may be more or less desirable according to specific applications. They are studied under two particular semantics: the grounded and preferred semantics.


A Soft Computing Model for Physicians' Decision Process

arXiv.org Artificial Intelligence

In this paper the author presents a kind of Soft Computing Technique, mainly an application of fuzzy set theory of Prof. Zadeh [16], on a problem of Medical Experts Systems. The choosen problem is on design of a physician's decision model which can take crisp as well as fuzzy data as input, unlike the traditional models. The author presents a mathematical model based on fuzzy set theory for physician aided evaluation of a complete representation of information emanating from the initial interview including patient past history, present symptoms, and signs observed upon physical examination and results of clinical and diagnostic tests.


Some distance bounds of branching processes and their diffusion limits

arXiv.org Machine Learning

We compute exact values respectively bounds of "distances" - in the sense of (transforms of) power divergences and relative entropy - between two discrete-time Galton-Watson branching processes with immigration GWI for which the offspring as well as the immigration is arbitrarily Poisson-distributed (leading to arbitrary type of criticality). Implications for asymptotic distinguishability behaviour in terms of contiguity and entire separation of the involved GWI are given, too. Furthermore, we determine the corresponding limit quantities for the context in which the two GWI converge to Feller-type branching diffusion processes, as the time-lags between observations tend to zero. Some applications to (static random environment like) Bayesian decision making and Neyman-Pearson testing are presented as well.


Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes

arXiv.org Artificial Intelligence

Approximate dynamic programming has been used successfully in a large variety of domains, but it relies on a small set of provided approximation features to calculate solutions reliably. Large and rich sets of features can cause existing algorithms to overfit because of a limited number of samples. We address this shortcoming using $L_1$ regularization in approximate linear programming. Because the proposed method can automatically select the appropriate richness of features, its performance does not degrade with an increasing number of features. These results rely on new and stronger sampling bounds for regularized approximate linear programs. We also propose a computationally efficient homotopy method. The empirical evaluation of the approach shows that the proposed method performs well on simple MDPs and standard benchmark problems.