Evolutionary Systems
Large-scale multi-objective influence maximisation with network downscaling
Cunegatti, Elia, Iacca, Giovanni, Bucur, Doina
Finding the most influential nodes in a network is a computationally hard problem with applications in many kinds of network-based problems. While several methods have been proposed for tackling the influence maximisation (IM) problem, their runtime typically scales poorly as the network size increases. Here, we propose an original method, based on network downscaling, that allows a multi-objective evolutionary algorithm (MOEA) to solve the IM problem on a reduced-scale network, while preserving the relevant properties of the original network. The downscaled solution is then upscaled to the original network, using a mechanism based on centrality metrics such as PageRank. Our results on eight large networks (including two with $\sim$50k nodes) demonstrate the effectiveness of the proposed method, with a more than 10-fold runtime gain compared to the time needed on the original network, and a time reduction of up to $82\%$ compared to CELF.
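As an illustration of the kind of centrality metric mentioned above, the following is a minimal sketch of PageRank by power iteration. It is not the authors' upscaling mechanism: the function name `pagerank`, the adjacency-dictionary format, the damping factor of 0.85, and the toy graph are all assumptions made for this example.

```python
def pagerank(adjacency, damping=0.85, n_iters=100, tol=1e-10):
    """Power-iteration PageRank on a dict {node: [successor, ...]}.

    Every successor must also appear as a key of `adjacency`.
    """
    nodes = list(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(n_iters):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not adjacency[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for v in nodes:
            successors = adjacency[v]
            if successors:
                share = damping * rank[v] / len(successors)
                for u in successors:
                    new_rank[u] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: node "c" receives links from "a", "b" and "d".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
top_nodes = sorted(scores, key=scores.get, reverse=True)
```

Ranking nodes by such a score is one plausible way to pick candidate seed nodes in the original network; how the paper maps downscaled solutions back to original nodes is not specified in the abstract.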
Fast Computation of Highly G-optimal Exact Designs via Particle Swarm Optimization
Walsh, Stephen J., Borkowski, John J.
Computing exact $G$-optimal designs for response surface models is a difficult computation that has received incremental improvements via algorithm development over the last two decades. These optimal designs have not been widely considered in applications, in part due to the difficulty and cost involved in computing them. Three primary algorithms for constructing exact $G$-optimal designs are presented in the literature: the coordinate exchange (CEXCH), a genetic algorithm (GA), and the relatively new $G$-optimal via $I_\lambda$-optimality algorithm ($G(I_\lambda)$-CEXCH), which was developed in part to address the large computational cost. Particle swarm optimization (PSO) has achieved widespread use in many applications but, its broad-scale success notwithstanding, has to date seen relatively few applications in optimal design problems. In this paper we develop an extension of PSO to adapt it to the optimal design problem. We then employ PSO to generate optimal designs for several scenarios covering $K = 1, 2, 3, 4, 5$ design factors, which are common experimental sizes in industrial experiments. We compare these results to all $G$-optimal designs published in the last two decades of literature. Published $G$-optimal designs generated by GA for $K = 1, 2, 3$ factors have stood unchallenged for 14 years. We demonstrate that PSO has found improved $G$-optimal designs for these scenarios, and that it does so with computational cost comparable to the state-of-the-art algorithm $G(I_\lambda)$-CEXCH. Further, we show that PSO is able to produce equal or better $G$-optimal designs for $K = 4, 5$ factors than those currently known. These results suggest that PSO is superior to existing approaches for efficiently generating highly $G$-optimal designs.
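For readers unfamiliar with PSO, a minimal sketch of the standard global-best variant is given below. It is not the extension developed in the paper, and it minimises a simple sphere function rather than a $G$-optimality criterion. The function name `pso_minimize`, the parameter values (inertia 0.7, acceleration coefficients 1.5), and the test objective are assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Standard global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_best_val.__getitem__)
    global_best = personal_best[best_idx][:]
    global_best_val = personal_best_val[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull towards the particle's own
                # best, and pull towards the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + c_social * r2 * (global_best[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Clamp the new position to the search box.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            value = objective(positions[i])
            if value < personal_best_val[i]:
                personal_best[i] = positions[i][:]
                personal_best_val[i] = value
                if value < global_best_val:
                    global_best = positions[i][:]
                    global_best_val = value
    return global_best, global_best_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_val = pso_minimize(sphere, [(-5.0, 5.0)] * 3)
```

In an optimal design setting, a particle would presumably encode a whole candidate design matrix and the objective would be the design's $G$-criterion, but that encoding is the paper's contribution and is not reproduced here.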
SubStrat: A Subset-Based Strategy for Faster AutoML
Lazebnik, Teddy, Somech, Amit, Weinberg, Abraham Itzhak
Automated machine learning (AutoML) frameworks have become important tools in the data scientist's arsenal, as they dramatically reduce the manual work devoted to the construction of ML pipelines. Such frameworks intelligently search among millions of possible ML pipelines - typically containing feature engineering, model selection and hyperparameter tuning steps - and finally output an optimal pipeline in terms of predictive accuracy. However, when the dataset is large, each individual configuration takes longer to execute, and the overall AutoML running time therefore becomes increasingly high. To address this, we present SubStrat, an AutoML optimization strategy that tackles the data size rather than the configuration space. It wraps existing AutoML tools, and instead of executing them directly on the entire dataset, SubStrat uses a genetic-based algorithm to find a small yet representative data subset which preserves a particular characteristic of the full data. It then employs the AutoML tool on the small subset, and finally refines the resulting pipeline by executing a restricted, much shorter AutoML process on the large dataset. Our experimental results, performed on two popular AutoML frameworks, Auto-Sklearn and TPOT, show that SubStrat reduces their running times by 79% (on average), with less than 2% average loss in the accuracy of the resulting ML pipeline.
Analysis, Characterization, Prediction and Attribution of Extreme Atmospheric Events with Machine Learning: a Review
Salcedo-Sanz, Sancho, Pérez-Aracil, Jorge, Ascenso, Guido, Del Ser, Javier, Casillas-Pérez, David, Kadow, Christopher, Fister, Dusan, Barriopedro, David, García-Herrera, Ricardo, Restelli, Marcello, Giuliani, Mateo, Castelletti, Andrea
Atmospheric Extreme Events (EEs) cause severe damage to human societies and ecosystems. The frequency and intensity of EEs and other associated events are increasing under current climate change and global warming. The accurate prediction, characterization, and attribution of atmospheric EEs is therefore a key research field, in which many groups are currently working by applying different methodologies and computational tools. Machine Learning (ML) methods have emerged in recent years as powerful techniques to tackle many of the problems related to atmospheric EEs. This paper reviews the ML algorithms applied to the analysis, characterization, prediction, and attribution of the most important atmospheric EEs. A summary of the most used ML techniques in this area, and a comprehensive critical review of the literature related to ML in EEs, are provided. A number of examples are discussed, and perspectives and outlooks on the field are drawn.
Results of Deep Funding -- Round 1
This marks a new phase in the SingularityNET ecosystem, where we will foster the growth of the platform by supporting projects with AGIX tokens, knowledge and experience. We are very happy to present the projects that have been selected by our engaged community to be awarded their requested amounts. While the portal was open, a total of 47 proposals were submitted for the $1 million worth of AGIX token treasury funds, which made this round a fair success! After the proposals were reviewed for formal compliance with the Deep Funding rules, only 28 made it to the voting round. All of these 28 had more than the required 1% of cast votes, but only a minority of 12 proposals received an average grade of 6.5 or higher.
Implementing Particle Swarm Optimization in Tensorflow
Originally published on Towards AI.
Evolutionary scheduling of university activities based on consumption forecasts to minimise electricity costs
Ruddick, Julian, Genov, Evgenii, Camargo, Luis Ramirez, Coosemans, Thierry, Messagie, Maarten
This paper presents a solution to a predict-then-optimise problem whose goal is to reduce the electricity cost of a university campus. The proposed methodology combines a multi-dimensional time series forecast and a novel approach to large-scale optimization. A gradient-boosting method is applied to forecast both the generation and consumption time series of the Monash University campus for the month of November 2020. For the consumption forecasts we employ a log transformation to model trend and stabilize variance. Additional seasonality and trend features are added to the model inputs when applicable. The forecasts obtained are used as the base load for the schedule optimisation of university activities and battery usage. The goal of the optimisation is to minimize the electricity cost, which consists of the price of electricity and the peak electricity tariff, both altered by the load from class activities and battery use, as well as the penalty for not scheduling some optional activities. The schedule of the class activities is obtained through evolutionary optimisation using the covariance matrix adaptation evolution strategy and the genetic algorithm. This schedule is then improved through local search by testing possible times for each activity one by one. The battery schedule is formulated as a mixed-integer programming problem and solved by the Gurobi solver. This method obtains the second-lowest cost when evaluated against 6 other methods presented at an IEEE competition, all of which used mixed-integer programming and the Gurobi solver to schedule both the activities and the battery use. The code and data used for the paper are publicly available.
Massive Twinning to Enhance Emergent Intelligence
Yuan, Siyu, Han, Bin, Krummacker, Dennis, Schotten, Hans D.
As a complement to conventional AI solutions, emergent intelligence (EI) is competitive in 6G IIoT scenarios thanks to its outstanding features, including robustness, privacy protection, and scalability. However, despite its low computational complexity, EI is challenged by its high demand for data traffic in massive deployments. We propose to leverage massive twinning, which 6G is envisaged to support, to reduce the data traffic in EI and thereby enhance its performance.
On the Prediction of Evaporation in Arid Climate Using Machine Learning Model
Evaporation calculations are important for the proper management of hydrological resources, such as reservoirs, lakes, and rivers. Data-driven approaches, such as adaptive neuro fuzzy inference, are becoming popular in many hydrological fields. This paper investigates the effective implementation of artificial intelligence for the prediction of evaporation in an agricultural area. In particular, it presents the adaptive neuro fuzzy inference system (ANFIS) and the hybridization of ANFIS with three optimizers: the genetic algorithm (GA), the firefly algorithm (FFA), and the particle swarm optimizer (PSO). Six measured weather variables are used in the proposed modelling approach: the maximum, minimum, and average air temperature, sunshine hours, wind speed, and relative humidity of a given location. Models are separately calibrated with a total of 86 data points over an eight-year period, from 2010 to 2017, at the specified station, located in Arizona, United States of America. Farming lands and humid climates are the reason for choosing this location. Ten statistical indices are calculated to find the best-fit model. Comparisons show that ANFIS and ANFIS–PSO are slightly better than ANFIS–FFA and ANFIS–GA. Though the hybrid ANFIS–PSO ($R^2$ = 0.99, VAF = 98.85, RMSE = 9.73, SI = 0.05) is very close to the ANFIS ($R^2$ = 0.99, VAF = 99.04, RMSE = 8.92, SI = 0.05) model, preference can be given to ANFIS, due to its simplicity and easy operation.
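The abstract does not define the reported indices, so the sketch below uses commonly cited definitions as an assumption: $R^2$ as the coefficient of determination, VAF (variance accounted for) as $(1 - \mathrm{Var}(y - \hat{y}) / \mathrm{Var}(y)) \times 100$, and SI (scatter index) as RMSE divided by the mean observation. The paper may define them differently (for example, $R^2$ as a squared correlation). The function names and sample values are hypothetical.

```python
import math
from statistics import mean, pvariance


def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def vaf(observed, predicted):
    """Variance accounted for, in percent (assumed definition)."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    return (1.0 - pvariance(residuals) / pvariance(observed)) * 100.0


def scatter_index(observed, predicted):
    """RMSE normalised by the mean observation (assumed definition)."""
    return rmse(observed, predicted) / mean(observed)


# Hypothetical observed and predicted values, not data from the paper.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

metrics = {
    "RMSE": rmse(observed, predicted),
    "R2": r_squared(observed, predicted),
    "VAF": vaf(observed, predicted),
    "SI": scatter_index(observed, predicted),
}
```

Under these definitions RMSE and SI are error measures (lower is better), while $R^2$ and VAF measure explained variation (higher is better), which matches the direction of the comparison reported above.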
Mono-surrogate vs Multi-surrogate in Multi-objective Bayesian Optimisation
Bayesian optimisation (BO) has been widely used to solve problems with expensive function evaluations. In multi-objective optimisation problems, BO aims to find a set of approximated Pareto optimal solutions. There are typically two ways to build surrogates in multi-objective BO: one surrogate built by aggregating the objective functions with a scalarising function (the mono-surrogate approach), or multiple surrogates, one for each objective function (the multi-surrogate approach). In both approaches, an acquisition function (AF) is used to guide the search process. The mono-surrogate approach has the advantage that only one model is used; however, it has two major limitations. Firstly, the fitness landscape of the scalarising function and the objective functions may not be similar. Secondly, the approach assumes that the scalarising function distribution is Gaussian, and thus a closed-form expression of the AF can be used. In this work, we overcome these limitations by building a surrogate model for each objective function, and we show that the scalarising function distribution is not Gaussian. We approximate the distribution using a generalised extreme value distribution. The results and comparison with existing approaches on standard benchmark and real-world optimisation problems show the potential of the multi-surrogate approach.
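To make the terms above concrete, here is a minimal sketch of Pareto dominance and of one common scalarising function, the weighted Tchebycheff function, for minimisation problems. The abstract does not state which scalarising function the paper uses, so this choice, the function names, the weights, and the sample points are all assumptions for illustration.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


def tchebycheff(objectives, weights, ideal):
    """Weighted Tchebycheff scalarisation: max_i w_i * |f_i - z_i|."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


# Hypothetical bi-objective values for five candidate solutions.
points = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = pareto_front(points)

# Aggregating the objectives yields a single value per solution, which is
# what a mono-surrogate model would be fitted to.
weights = (0.5, 0.5)
ideal = (1.0, 1.0)
best = min(front, key=lambda p: tchebycheff(p, weights, ideal))
```

Because this scalarisation takes a maximum over the objectives, its value under Gaussian surrogate predictions of the individual objectives is in general not Gaussian, which is consistent with the limitation the abstract describes.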