Evolutionary Systems


Swarm Programming Using Moth-Flame Optimization and Whale Optimization Algorithms

arXiv.org Artificial Intelligence

Automatic programming (AP) is an important area of Machine Learning (ML) in which computer programs are generated automatically. Swarm Programming (SP), a newly emerging research area in AP, automatically generates computer programs using Swarm Intelligence (SI) algorithms. This paper presents two grammar-based SP methods named Grammatical Moth-Flame Optimizer (GMFO) and Grammatical Whale Optimizer (GWO). The Moth-Flame Optimizer and the Whale Optimization Algorithm are used as search engines or learning algorithms in GMFO and GWO, respectively. The proposed methods are tested on the Santa Fe Ant Trail, quartic symbolic regression, and 3-input multiplexer problems. The results are compared with the Grammatical Bee Colony (GBC) and the Grammatical Fireworks algorithm (GFWA). The experimental results demonstrate that the proposed SP methods can be used in automatic computer program generation.
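The abstract does not spell out how a swarm position becomes a program. A minimal sketch, assuming the usual grammatical-evolution style mapping (each integer codon picks a production rule by modulo), could look as follows. The toy grammar and the names GRAMMAR and map_genotype are illustrative assumptions, not taken from the paper; a continuous swarm position would first have to be rounded to integers.

```python
# Toy BNF-like grammar (hypothetical, for illustration only).
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["1"]],
}

def map_genotype(codons, start="<expr>", max_wraps=2):
    """Map a list of integer codons to a program string.

    The leftmost non-terminal is expanded with the production chosen by
    codon % number_of_choices. Codons are reused (wrapped) up to
    max_wraps times; None is returned if the derivation does not finish.
    """
    symbols = [start]
    used = 0
    limit = len(codons) * (max_wraps + 1)
    while any(s in GRAMMAR for s in symbols):
        if used >= limit:
            return None  # derivation did not terminate within the budget
        idx = next(i for i, s in enumerate(symbols) if s in GRAMMAR)
        choices = GRAMMAR[symbols[idx]]
        codon = codons[used % len(codons)]
        used += 1
        symbols[idx:idx + 1] = choices[codon % len(choices)]
    return " ".join(symbols)

print(map_genotype([0, 1, 0, 1, 1, 1]))  # x * 1
```

Under this assumption, the swarm algorithm only ever searches over codon vectors; the grammar guarantees that every completed derivation is a syntactically valid program.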


Variance Reduction for Better Sampling in Continuous Domains

arXiv.org Machine Learning

Design of experiments, random search, initialization of population-based methods, and sampling inside an epoch of an evolutionary algorithm all use a sample drawn according to some probability distribution to approximate the location of an optimum. Recent papers have shown that the optimal search distribution used for the sampling might be more peaked around the center of the distribution than the prior distribution modelling our uncertainty about the location of the optimum. We confirm this statement, provide explicit values for this reshaping of the search distribution depending on the population size $\lambda$ and the dimension $d$, and validate our results experimentally.
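The effect described above can be probed with a small Monte Carlo experiment. The sketch below assumes a standard Gaussian prior on the optimum and measures the squared distance from the optimum to the closest of $\lambda$ sampled points; the function name, the default values, and the choice of 0.5 as a reduced standard deviation are assumptions made for illustration and are not the explicit values derived in the paper.

```python
import random

def mean_simple_regret(sigma, dim=20, lam=10, trials=500, seed=0):
    """Average squared distance between the optimum and the best of lam samples.

    The optimum is drawn from N(0, I); the lam search points are drawn
    from N(0, sigma^2 I). A sigma below 1 gives a more peaked search
    distribution than the prior.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        optimum = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        best = float("inf")
        for _ in range(lam):
            point = [rng.gauss(0.0, sigma) for _ in range(dim)]
            dist2 = sum((p - o) ** 2 for p, o in zip(point, optimum))
            best = min(best, dist2)
        total += best
    return total / trials

for sigma in (1.0, 0.5):
    print(sigma, mean_simple_regret(sigma))
```

For a single sample the expected squared distance is $d(1 + \sigma^2)$, which already favours a small $\sigma$; the experiment lets one check how this changes as $\lambda$ grows relative to $d$.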


Symbolic Regression Driven by Training Data and Prior Knowledge

arXiv.org Artificial Intelligence

In symbolic regression, the search for analytic models is typically driven purely by the prediction error observed on the training data samples. However, when the data samples do not sufficiently cover the input space, the prediction error does not provide sufficient guidance toward desired models. Standard symbolic regression techniques then yield models that are partially incorrect, for instance, in terms of their steady-state characteristics or local behavior. If these properties were considered already during the search process, more accurate and relevant models could be produced. We propose a multi-objective symbolic regression approach that is driven by both the training data and the prior knowledge of the properties the desired model should manifest. The properties given in the form of formal constraints are internally represented by a set of discrete data samples on which candidate models are exactly checked. The proposed approach was experimentally evaluated on three test problems with results clearly demonstrating its capability to evolve realistic models that fit the training data well while complying with the prior knowledge of the desired model characteristics at the same time. It outperforms standard symbolic regression by several orders of magnitude in terms of the mean squared deviation from a reference model.
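As a rough illustration of the two-objective idea, the sketch below scores a candidate model both by its training error and by how much it violates a prior-knowledge property on a set of discrete check points. The property used here (the model should be non-decreasing) and the helper names are hypothetical; the paper's actual constraints and violation measures are not given in the abstract.

```python
def mse(model, data):
    """Mean squared error of model on (x, y) training pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def monotonicity_violation(model, grid):
    """Total amount by which model decreases between consecutive grid points.

    Returns 0.0 when the model is non-decreasing on the grid.
    """
    return sum(max(0.0, model(a) - model(b)) for a, b in zip(grid, grid[1:]))

# Two candidates that fit the sparse training data equally well.
data = [(0.0, 0.0), (1.0, 1.0)]
grid = [0.0, 1.0, 2.0, 3.0]  # constraint samples beyond the training range

linear = lambda x: x
parabola = lambda x: x * (2.0 - x)

for name, model in (("linear", linear), ("parabola", parabola)):
    print(name, mse(model, data), monotonicity_violation(model, grid))
# linear 0.0 0.0
# parabola 0.0 4.0
```

Both candidates have zero training error, so only the second objective separates them; a multi-objective search would keep the linear model on the Pareto front and discard the parabola.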


Adversarial Machine Learning in Network Intrusion Detection Systems

arXiv.org Machine Learning

It is becoming increasingly evident that machine learning algorithms achieve impressive results in domains in which it is hard to specify a set of rules for their procedures. Examples of this phenomenon include industries like finance [49, 5], transportation [37], education [42, 22], health care [23] and tasks like image recognition [41, 16, 17], machine translation [43, 7], and speech recognition [46, 24, 53, 50]. Motivated by the ease of adoption and the increased availability of affordable computational power (especially cloud computing services), machine learning algorithms are being explored in almost every commercial application and are offering great promise for the future of automation. With such vast adoption across multiple disciplines, some of their weaknesses are exposed and sometimes exploited by malicious actors. For example, a common challenge to these algorithms is "generalization" or "robustness", which is the ability of the algorithm to maintain performance when dealing with data drawn from a distribution different from the one on which it was trained. For a long time, the sole focus of machine learning researchers was improving the performance of machine learning systems (true positive rate, accuracy, etc.). Nowadays, the robustness of these systems can no longer be ignored; many of them have been shown to be highly vulnerable to intentional adversarial attacks.


Constructing Complexity-efficient Features in XCS with Tree-based Rule Conditions

arXiv.org Artificial Intelligence

A major goal of machine learning is to create techniques that abstract away irrelevant information. The generalisation property of a standard Learning Classifier System (LCS) removes such information at the feature level but not at the feature interaction level. Code Fragments (CFs), a form of tree-based programs, introduced feature manipulation to discover important interactions, but they often contain irrelevant information, which causes structural inefficiency. XOF is a recently introduced LCS that uses CFs to encode building blocks of knowledge about feature interaction. This paper aims to optimise the structural efficiency of CFs in XOF. We propose two measures to improve the construction of CFs to achieve this goal. First, a new CF-fitness update estimates the applicability of CFs while also considering their structural complexity. Second, a niche-based method is used to generate CFs. These approaches were tested on Even-parity and Hierarchical problems, which require highly complex combinations of input features to capture the data patterns. The results show that the proposed methods significantly increase the structural efficiency of CFs, which is estimated by the rule "generality rate". This results in faster learning performance in the Hierarchical Majority-on problem. Furthermore, a user-set depth limit for CF generation is not needed as the learning agent will not adopt higher-level CFs once optimal CFs are constructed.


Multi-Objective Evolutionary approach for the Performance Improvement of Learners using Ensembling Feature selection and Discretization Technique on Medical data

arXiv.org Artificial Intelligence

Biomedical data is filled with continuous real values; these values in the feature set tend to create problems such as underfitting, the curse of dimensionality, and an increased misclassification rate because of higher variance. In response, pre-processing techniques applied to the dataset minimize these side effects and have been successful in maintaining adequate accuracy. Feature selection and discretization are two necessary preprocessing steps that have been effectively employed to handle data redundancies in biomedical data. However, previous work has lacked a unified effort to integrate feature selection and discretization in solving the data redundancy problem, which has left the field disjoint and fragmented. This paper proposes a novel multi-objective dimensionality reduction framework, which incorporates both discretization and feature reduction as an ensemble model for performing feature selection and discretization. The selection of optimal features and the categorization of discretized and non-discretized features from the feature subset are governed by the multi-objective genetic algorithm NSGA-II. The two objectives, minimizing the error rate during feature selection and maximizing the information gain during discretization, are used as fitness criteria.
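NSGA-II ranks candidates by Pareto dominance rather than by a single score. The sketch below shows the dominance test and the extraction of the first non-dominated front for the two objectives named above (minimize error rate, maximize information gain). The candidate values are made up for illustration, and the full NSGA-II procedure (further fronts, crowding distance, variation operators) is omitted.

```python
def dominates(a, b):
    """True if candidate a Pareto-dominates candidate b.

    Each candidate is (error_rate, info_gain): error is minimized,
    information gain is maximized.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def first_front(candidates):
    """Candidates that no other candidate dominates."""
    return [p for p in candidates
            if not any(dominates(q, p) for q in candidates)]

# Hypothetical (error_rate, info_gain) pairs for four feature subsets.
candidates = [(0.10, 0.5), (0.20, 0.9), (0.15, 0.4), (0.30, 0.8)]
print(first_front(candidates))  # [(0.1, 0.5), (0.2, 0.9)]
```

The two surviving candidates represent different trade-offs: one has the lowest error, the other the highest information gain, and neither is better on both objectives.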


On the Combined Impact of Population Size and Sub-problem Selection in MOEA/D

arXiv.org Artificial Intelligence

This paper aims to understand and improve the working principle of decomposition-based multi-objective evolutionary algorithms. We review the design of the well-established MOEA/D framework to support the smooth integration of different strategies for sub-problem selection, while emphasizing the role of the population size and of the number of offspring created at each generation. By conducting a comprehensive empirical analysis on a wide range of multi- and many-objective combinatorial NK landscapes, we provide new insights into the combined effect of those parameters on the anytime performance of the underlying search process. In particular, we show that even a simple strategy that selects sub-problems at random outperforms existing sophisticated strategies. We also study the sensitivity of such strategies with respect to the ruggedness and the objective space dimension of the target problem.
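To make the decomposition idea concrete, the sketch below shows three building blocks under stated assumptions: evenly spread weight vectors for two objectives, the Tchebycheff scalarizing function (one common choice in MOEA/D; the abstract does not say which one the paper uses), and a random choice of which sub-problems produce offspring in a generation. All function names are hypothetical.

```python
import random

def weight_vectors(n):
    """n evenly spaced weight vectors for a two-objective problem (n >= 2)."""
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]

def tchebycheff(objectives, weights, ideal):
    """Scalar value of a solution for the sub-problem defined by weights."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))

def select_subproblems(n_subproblems, n_offspring, rng):
    """Pick, uniformly at random, the sub-problems that create offspring."""
    return rng.sample(range(n_subproblems), n_offspring)

weights = weight_vectors(5)           # population size = number of sub-problems
chosen = select_subproblems(len(weights), 2, random.Random(1))
for i in chosen:
    # Score one hypothetical solution with objective values (2.0, 3.0).
    print(i, weights[i], tchebycheff((2.0, 3.0), weights[i], (1.0, 1.0)))
```

In this framing the population size fixes the number of sub-problems, while the second argument of select_subproblems controls how many offspring are created per generation, the two parameters whose interplay the paper studies.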


A Tailored NSGA-III Instantiation for Flexible Job Shop Scheduling

arXiv.org Artificial Intelligence

A customized multi-objective evolutionary algorithm (MOEA) is proposed for the multi-objective flexible job shop scheduling problem (FJSP). It uses smart initialization approaches to enrich the first generated population, and proposes various crossover operators to create greater diversity among offspring. In particular, the MIP-EGO configurator, which can tune algorithm parameters, is adopted to automatically tune operator probabilities. Furthermore, different local search strategies are employed to explore the neighborhood for better solutions. In general, the algorithm enhancement strategy can be integrated with any standard EMO algorithm. In this paper, it has been combined with NSGA-III to solve benchmark multi-objective FJSPs, whereas an off-the-shelf implementation of NSGA-III is not capable of solving the FJSP. The experimental results show excellent performance with a smaller computing budget.


Augmentation of the Reconstruction Performance of Fuzzy C-Means with an Optimized Fuzzification Factor Vector

arXiv.org Artificial Intelligence

Information granules have been considered to be the fundamental constructs of Granular Computing (GrC). As a useful unsupervised learning technique, Fuzzy C-Means (FCM) is one of the most frequently used methods to construct information granules. The FCM-based granulation-degranulation mechanism plays a pivotal role in GrC. In this paper, to enhance the quality of the degranulation (reconstruction) process, we augment the FCM-based degranulation mechanism by introducing a vector of fuzzification factors (fuzzification factor vector) and setting up an adjustment mechanism to modify the prototypes and the partition matrix. The design is regarded as an optimization problem, which is guided by a reconstruction criterion. In the proposed scheme, the initial partition matrix and prototypes are generated by the FCM. Then a fuzzification factor vector is introduced to form an appropriate fuzzification factor for each cluster and to build up an adjustment scheme for modifying the prototypes and the partition matrix. With the supervised learning mode of the granulation-degranulation process, we construct a composite objective function of the fuzzification factor vector, the prototypes and the partition matrix. Subsequently, particle swarm optimization (PSO) is employed to optimize the fuzzification factor vector to refine the prototypes and develop the optimal partition matrix. As a result, the reconstruction performance of the FCM algorithm is enhanced. We offer a thorough analysis of the developed scheme. In particular, we show that the classical FCM algorithm forms a special case of the proposed scheme. Experiments on both synthetic and publicly available datasets show that the proposed approach outperforms the generic data reconstruction approach.
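For orientation, the classical granulation-degranulation step that the paper builds on can be sketched as follows, assuming a single scalar fuzzification factor m for all clusters. The per-cluster factor vector and the PSO-based adjustment proposed in the paper are not reproduced here, and the function names are hypothetical.

```python
def memberships(x, prototypes, m=2.0):
    """Classical FCM membership degrees of point x to each prototype."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in prototypes]
    if any(d == 0.0 for d in d2):
        # x coincides with one or more prototypes: split membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    power = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** power for d in d2]
    total = sum(inv)
    return [w / total for w in inv]

def reconstruct(x, prototypes, m=2.0):
    """Degranulation: rebuild x from its memberships and the prototypes."""
    weights = [u ** m for u in memberships(x, prototypes, m)]
    total = sum(weights)
    return [sum(w * v[k] for w, v in zip(weights, prototypes)) / total
            for k in range(len(x))]

prototypes = [(0.0, 0.0), (2.0, 0.0)]
print(memberships((1.0, 0.0), prototypes))  # [0.5, 0.5]
print(reconstruct((1.0, 0.0), prototypes))  # [1.0, 0.0]
```

The reconstruction error, summed over all data points, is the kind of criterion that the proposed optimization is meant to reduce.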


Introduction to Evolutionary Algorithms

#artificialintelligence

Evolution by natural selection is a scientific theory which aims to explain how natural systems evolved over time into more complex systems. In evolutionary algorithms, a fitness value can be used as a guide to indicate how close we are to a solution (e.g., the higher the value, the closer we are to our desired objective). By grouping closer together all the elements in a population which share a similar fitness and further apart all the dissimilar elements, we can then construct a Fitness Landscape (Figure 1). One of the main problems faced by evolutionary algorithms is the presence of local optima in the fitness landscape. Local optima can, in fact, mislead our algorithm into settling for a less optimal solution instead of reaching the desired global maximum.
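The local-optimum problem is easy to reproduce on a tiny, made-up one-dimensional fitness landscape. The sketch below uses plain hill climbing (a simpler search than an evolutionary algorithm, chosen here only to keep the example short): starting from the left it stops on the smaller peak, while a different starting point reaches the global maximum.

```python
# Hypothetical fitness values; index 2 is a local peak, index 6 the global one.
LANDSCAPE = [1, 3, 5, 4, 2, 6, 9, 7]

def hill_climb(start):
    """Move to the fitter neighbour until no neighbour improves fitness."""
    pos = start
    while True:
        neighbours = [i for i in (pos - 1, pos + 1) if 0 <= i < len(LANDSCAPE)]
        best = max(neighbours, key=lambda i: LANDSCAPE[i])
        if LANDSCAPE[best] <= LANDSCAPE[pos]:
            return pos  # no improving neighbour: a (possibly local) optimum
        pos = best

print(hill_climb(0), LANDSCAPE[hill_climb(0)])  # 2 5
print(hill_climb(4), LANDSCAPE[hill_climb(4)])  # 6 9
```

Evolutionary algorithms try to reduce this dependence on the starting point by keeping a population of candidates and applying mutation and recombination, although they do not eliminate it.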