Goto

Collaborating Authors

 Evolutionary Systems


Generating Interpretable Fuzzy Controllers using Particle Swarm Optimization and Genetic Programming

arXiv.org Artificial Intelligence

Autonomously training interpretable control strategies, called policies, using pre-existing plant trajectory data is of great interest in industrial applications. Fuzzy controllers have been used in industry for decades as interpretable and efficient system controllers. In this study, we introduce a fuzzy genetic programming (GP) approach called fuzzy GP reinforcement learning (FGPRL) that can select the relevant state features, determine the size of the required fuzzy rule set, and automatically adjust all the controller parameters simultaneously. Each GP individual's fitness is computed using model-based batch reinforcement learning (RL), which first trains a model using available system samples and subsequently performs Monte Carlo rollouts to predict each policy candidate's performance. We compare FGPRL to an extended version of a related method called fuzzy particle swarm reinforcement learning (FPSRL), which uses swarm intelligence to tune the fuzzy policy parameters. Experiments using an industrial benchmark show that FGPRL is able to autonomously learn interpretable fuzzy policies with high control performance.


QSAR Classification Modeling for Bioactivity of Molecular Structure via SPL-Logsum

arXiv.org Machine Learning

Quantitative structure-activity relationship (QSAR) modelling is effective 'bridge' to search the reliable relationship related bioactivity to molecular structure. A QSAR classification model contains a lager number of redundant, noisy and irrelevant descriptors. To address this problem, various of methods have been proposed for descriptor selection. Generally, they can be grouped into three categories: filters, wrappers, and embedded methods. Regularization method is an important embedded technology, which can be used for continuous shrinkage and automatic descriptors selection. In recent years, the interest of researchers in the application of regularization techniques is increasing in descriptors selection , such as, logistic regression(LR) with $L_1$ penalty. In this paper, we proposed a novel descriptor selection method based on self-paced learning(SPL) with Logsum penalized LR for predicting the bioactivity of molecular structure. SPL inspired by the learning process of humans and animals that gradually learns from easy samples(smaller losses) to hard samples(bigger losses) samples into training and Logsum regularization has capacity to select few meaningful and significant molecular descriptors, respectively. Experimental results on simulation and three public QSAR datasets show that our proposed SPL-Logsum method outperforms other commonly used sparse methods in terms of classification performance and model interpretation.


A Hybrid Q-Learning Sine-Cosine-based Strategy for Addressing the Combinatorial Test Suite Minimization Problem

arXiv.org Artificial Intelligence

The sine-cosine algorithm (SCA) is a new population-based meta-heuristic algorithm. In addition to exploiting sine and cosine functions to perform local and global searches (hence the name sine-cosine), the SCA introduces several random and adaptive parameters to facilitate the search process. Although it shows promising results, the search process of the SCA is vulnerable to local minima/maxima due to the adoption of a fixed switch probability and the bounded magnitude of the sine and cosine functions (from -1 to 1). In this paper, we propose a new hybrid Q-learning sine-cosine- based strategy, called the Q-learning sine-cosine algorithm (QLSCA). Within the QLSCA, we eliminate the switching probability. Instead, we rely on the Q-learning algorithm (based on the penalty and reward mechanism) to dynamically identify the best operation during runtime. Additionally, we integrate two new operations (L\'evy flight motion and crossover) into the QLSCA to facilitate jumping out of local minima/maxima and enhance the solution diversity. To assess its performance, we adopt the QLSCA for the combinatorial test suite minimization problem. Experimental results reveal that the QLSCA is statistically superior with regard to test suite size reduction compared to recent state-of-the-art strategies, including the original SCA, the particle swarm test generator (PSTG), adaptive particle swarm optimization (APSO) and the cuckoo search strategy (CS) at the 95% confidence level. However, concerning the comparison with discrete particle swarm optimization (DPSO), there is no significant difference in performance at the 95% confidence level. On a positive note, the QLSCA statistically outperforms the DPSO in certain configurations at the 90% confidence level.


Multi-objective Architecture Search for CNNs

arXiv.org Machine Learning

Architecture search aims at automatically finding neural architectures that are competitive with architectures designed by human experts. While recent approaches have come close to matching the predictive performance of manually designed architectures for image recognition, these approaches are problematic under constrained resources for two reasons: first, the architecture search itself requires vast computational resources for most proposed methods. Secondly, the found neural architectures are solely optimized for high predictive performance without penalizing excessive resource consumption. We address the first shortcoming by proposing NASH, an architecture search which considerable reduces the computational resources required for training novel architectures by applying network morphisms and aggressive learning rate schedules. On CIFAR10, NASH finds architectures with errors below 4% in only 3 days. We address the second shortcoming by proposing Pareto-NASH, a method for multi-objective architecture search that allows approximating the Pareto-front of architectures under multiple objective, such as predictive performance and number of parameters, in a single run of the method. Within 56 GPU days of architecture search, Pareto-NASH finds a model with 4M parameters and test error of 3.5%, as well as a model with less than 1M parameters and test error of 4.6%.


Social Algorithms

arXiv.org Artificial Intelligence

To find solutions to problems commonly used in science and engineering, algorithms are required. An algorithm is a step-by-step computational procedure or a set of rules to be followed by a computer. One of the oldest algorithms is the Euclidean algorithm for finding the greatest common divisor (gcd) of two integers such as 12345 and 125, and this algorithm was first given in detail in Euclid's Elements about 2300 years ago (Chabert 1999). Modern computing involves a large set of different algorithms from fast Fourier transform (FFT) to image processing techniques and from conjugate gradient methods to finite element methods. Optimization problems in particular require specialized optimization techniques, ranging from the simple Newton-Raphson's method to more sophisticated simplex methods for linear programming. Modern trends tend to use a combination of traditional techniques in combination with contemporary stochastic metaheuristic algorithms such as genetic algorithms, firefly algorithm and particle swarm optimization.


RIOT: a Stochastic-based Method for Workflow Scheduling in the Cloud

arXiv.org Artificial Intelligence

Cloud computing provides engineers or scientists a place to run complex computing tasks. Finding a workflow's deployment configuration in a cloud environment is not easy. Traditional workflow scheduling algorithms were based on some heuristics, e.g. reliability greedy, cost greedy, cost-time balancing, etc., or more recently, the meta-heuristic methods, such as genetic algorithms. These methods are very slow and not suitable for rescheduling in the dynamic cloud environment. This paper introduces RIOT (Randomized Instance Order Types), a stochastic based method for workflow scheduling. RIOT groups the tasks in the workflow into virtual machines via a probability model and then uses an effective surrogate-based method to assess a large amount of potential scheduling. Experiments in dozens of study cases showed that RIOT executes tens of times faster than traditional methods while generating comparable results to other methods.


Swarm robotics in wireless distributed protocol design for coordinating robots involved in cooperative tasks

arXiv.org Artificial Intelligence

The mine detection in an unexplored area is an optimization problem where multiple mines, randomly distributed throughout an area, need to be discovered and disarmed in a minimum amount of time. We propose a strategy to explore an unknown area, using a stigmergy approach based on ants behavior, and a novel swarm based protocol to recruit and coordinate robots for disarming the mines cooperatively. Simulation tests are presented to show the effectiveness of our proposed Ant-based Task Robot Coordination (ATRC) with only the exploration task and with both exploration and recruiting strategies. Multiple minimization objectives have been considered: the robots' recruiting time and the overall area exploration time. We discuss, through simulation, different cases under different network and field conditions, performed by the robots. The results have shown that the proposed decentralized approaches enable the swarm of robots to perform cooperative tasks intelligently without any central control.


Key Algorithms and Statistical Models for Aspiring Data Scientists

@machinelearnbot

As a data scientist who has been in the profession for several years now, I am often approached for career advice or guidance in course selection related to machine learning by students and career switchers on LinkedIn and Quora. Some questions revolve around educational paths and program selection, but many questions focus on what sort of algorithms or models are common in data science today. With a glut of algorithms from which to choose, it's hard to know where to start. Courses may include algorithms that aren't typically used in industry today, and courses may exclude very useful methods that aren't trending at the moment. Software-based programs may exclude important statistical concepts, and mathematically-based programs may skip over some of the key topics in algorithm design. I've put together a short guide for aspiring data scientists, particularly focused on statistical models and machine learning models (supervised and unsupervised); many of these topics are covered in textbooks, graduate-level statistics courses, data science bootcamps, and other training resources (some of which are included in the reference section of the article).


Global Convergence Analysis of the Flower Pollination Algorithm: A Discrete-Time Markov Chain Approach

arXiv.org Artificial Intelligence

Flower pollination algorithm is a recent metaheuristic algorithm for solving nonlinear global optimization problems. The algorithm has also been extended to solve multiobjective optimization with promising results. In this work, we analyze this algorithm mathematically and prove its convergence properties by using Markov chain theory. By constructing the appropriate transition probability for a population of flower pollen and using the homogeneity property, it can be shown that the constructed stochastic sequences can converge to the optimal set. Under the two proper conditions for convergence, it is proved that the simplified flower pollination algorithm can indeed satisfy these convergence conditions and thus the global convergence of this algorithm can be guaranteed. Numerical experiments are used to demonstrate that the flower pollination algorithm can converge quickly in practice and can thus achieve global optimality efficiently.


SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary

Journal of Artificial Intelligence Research

The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is considered "de facto" standard in the framework of learning from imbalanced data. This is due to its simplicity in the design of the procedure, as well as its robustness when applied to different type of problems. Since its publication in 2002, SMOTE has proven successful in a variety of applications from several different domains. SMOTE has also inspired several approaches to counter the issue of class imbalance, and has also significantly contributed to new supervised learning paradigms, including multilabel classification, incremental learning, semi-supervised learning, multi-instance learning, among others. It is standard benchmark for learning from imbalanced data. It is also featured in a number of different software packages -- from open source to commercial. In this paper, marking the fifteen year anniversary of SMOTE, we reflect on the SMOTE journey, discuss the current state of affairs with SMOTE, its applications, and also identify the next set of challenges to extend SMOTE for Big Data problems.