auto-sklearn


Dynamic Design of Machine Learning Pipelines via Metalearning

Alcobaça, Edesio, de Carvalho, André C. P. L. F.

arXiv.org Artificial Intelligence

Automated Machine Learning (AutoML) has become an essential tool for democratizing machine learning (ML) by automating key aspects of model selection, hyperparameter tuning, and feature engineering [1, 2]. However, the efficiency of AutoML frameworks remains a significant challenge, as the search for optimal configurations is often computationally expensive [3-5]. Traditional search strategies, such as Random Search (RS) and Bayesian Optimization (BO), indiscriminately explore large search spaces, resulting in high resource consumption [3, 6, 7]. To address this challenge, we propose a metalearning approach that dynamically designs search spaces for an AutoML solution, reducing computational costs while maintaining competitive predictive performance. The proposed method leverages historical metaknowledge to identify and prioritize promising regions of the search space, enabling more efficient optimization. By predicting the performance of preprocessor-classifier combinations, a meta-model, induced using metalearning, can provide a warm-start advantage, accelerating the AutoML search process. This study evaluates the effectiveness of the proposed approach through an extensive set of experiments, analyzing both computational efficiency and predictive performance. According to the experimental results, the dynamically generated search spaces significantly reduce runtime, while maintaining high-quality solutions. In particular, the RS-mtl-95 configuration achieved an 89% reduction in runtime without compromising predictive performance.
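The warm-start idea described in this abstract can be sketched in a few lines: a meta-model scores preprocessor-classifier pairs from historical metaknowledge, and only the most promising fraction of the space is searched. All names and scores below are illustrative, not taken from the paper:

```python
# Hypothetical sketch of metalearning-based search-space pruning:
# rank candidate preprocessor-classifier pairs by a meta-model's
# predicted score and keep only the top fraction.

def prune_search_space(candidates, predicted_score, keep_fraction=0.05):
    """Keep the top `keep_fraction` of candidates by predicted score."""
    ranked = sorted(candidates, key=predicted_score, reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]

# Toy meta-model output: scores learned from past runs on similar datasets.
meta_scores = {
    ("pca", "random_forest"): 0.91,
    ("none", "svm"): 0.88,
    ("scaler", "knn"): 0.74,
    ("pca", "naive_bayes"): 0.69,
}

pruned = prune_search_space(list(meta_scores), meta_scores.get, keep_fraction=0.5)
print(pruned)  # the two highest-ranked preprocessor-classifier pairs
```

In the paper's RS-mtl-95 setting, an analogous pruning step is what lets random search run over a much smaller, higher-quality region of the space.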


Grammar-based evolutionary approach for automated workflow composition with domain-specific operators and ensemble diversity

Barbudo, Rafael, Ramírez, Aurora, Romero, José Raúl

arXiv.org Artificial Intelligence

The process of extracting valuable and novel insights from raw data involves a series of complex steps. In the realm of Automated Machine Learning (AutoML), a significant research focus is on automating aspects of this process, specifically tasks like selecting algorithms and optimising their hyper-parameters. A particularly challenging task in AutoML is automatic workflow composition (AWC). AWC aims to identify the most effective sequence of data preprocessing and ML algorithms, coupled with their best hyper-parameters, for a specific dataset. However, existing AWC methods are limited in how many and in what ways they can combine algorithms within a workflow. Addressing this gap, this paper introduces EvoFlow, a grammar-based evolutionary approach for AWC. EvoFlow enhances the flexibility in designing workflow structures, empowering practitioners to select algorithms that best fit their specific requirements. EvoFlow stands out by integrating two innovative features. First, it employs a suite of genetic operators, designed specifically for AWC, to optimise both the structure of workflows and their hyper-parameters. Second, it implements a novel updating mechanism that enriches the variety of predictions made by different workflows. Promoting this diversity helps prevent the algorithm from overfitting. With this aim, EvoFlow builds an ensemble whose workflows differ in their misclassified instances. To evaluate EvoFlow's effectiveness, we carried out empirical validation using a set of classification benchmarks. We begin with an ablation study to demonstrate the enhanced performance attributable to EvoFlow's unique components. Then, we compare EvoFlow with other AWC approaches, encompassing both evolutionary and non-evolutionary techniques. Our findings show that EvoFlow's specialised genetic operators and updating mechanism substantially outperform current leading methods[..]
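To make the "genetic operators designed specifically for AWC" concrete, here is a minimal sketch of crossover and mutation over workflows represented as a list of preprocessing steps followed by one classifier. These are illustrative operators, not EvoFlow's actual grammar-based ones:

```python
import random

# Illustrative AWC operators (not EvoFlow's exact implementation):
# a workflow is a sequence of preprocessing steps ending in a classifier.

PREPROCESSORS = ["impute", "scale", "pca", "select_kbest"]
CLASSIFIERS = ["svm", "random_forest", "knn"]

def crossover(parent_a, parent_b, rng):
    """One-point crossover over the preprocessing parts; the child
    keeps parent_a's classifier (the last element)."""
    cut = rng.randint(0, min(len(parent_a), len(parent_b)) - 1)
    return parent_a[:cut] + parent_b[cut:-1] + [parent_a[-1]]

def mutate(workflow, rng):
    """Replace one preprocessing step with a random alternative."""
    wf = list(workflow)
    if len(wf) > 1:
        i = rng.randrange(len(wf) - 1)  # never touch the classifier
        wf[i] = rng.choice(PREPROCESSORS)
    return wf

rng = random.Random(0)
a = ["impute", "scale", "svm"]
b = ["pca", "select_kbest", "knn"]
child = crossover(a, b, rng)
print(mutate(child, rng))
```

A grammar, as used in EvoFlow, constrains such operators so that every offspring is a syntactically valid workflow.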


Fix Fairness, Don't Ruin Accuracy: Performance Aware Fairness Repair using AutoML

Nguyen, Giang, Biswas, Sumon, Rajan, Hridesh

arXiv.org Artificial Intelligence

Machine learning (ML) is increasingly being used in critical decision-making software, but incidents have raised questions about the fairness of ML predictions. To address this issue, new tools and methods are needed to mitigate bias in ML-based software. Previous studies have proposed bias mitigation algorithms that only work in specific situations and often result in a loss of accuracy. Our proposed solution is a novel approach that utilizes automated machine learning (AutoML) techniques to mitigate bias. Our approach includes two key innovations: a novel optimization function and a fairness-aware search space. By improving the default optimization function of AutoML and incorporating fairness objectives, we are able to mitigate bias with little to no loss of accuracy. Additionally, we propose a fairness-aware search space pruning method for AutoML to reduce computational cost and repair time. Our approach, built on the state-of-the-art Auto-Sklearn tool, is designed to reduce bias in real-world scenarios. To demonstrate its effectiveness, we evaluated our approach on four fairness problems and 16 different ML models, and the results show a significant improvement over the baseline and existing bias mitigation techniques. Our approach, Fair-AutoML, successfully repaired 60 out of 64 buggy cases, while existing bias mitigation techniques repaired at most 44 out of 64 cases.
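The general shape of a fairness-aware optimization function, as described above, can be sketched by penalizing accuracy with a group-fairness metric. This is an illustrative combination, not Fair-AutoML's exact objective; the statistical parity difference used here is one standard bias measure:

```python
# Illustrative fairness-aware objective: accuracy minus a weighted
# fairness penalty, so the search prefers accurate AND unbiased models.

def statistical_parity_difference(preds, group):
    """Absolute gap in positive-prediction rates between groups 0 and 1."""
    def rate(g):
        members = [p for p, grp in zip(preds, group) if grp == g]
        return sum(members) / max(1, len(members))
    return abs(rate(0) - rate(1))

def fairness_aware_score(accuracy, preds, group, lam=1.0):
    """`lam` trades predictive performance against fairness."""
    return accuracy - lam * statistical_parity_difference(preds, group)

# Toy example: two models with equal accuracy but different bias.
preds_biased = [1, 1, 1, 0, 0, 0]
preds_fair   = [1, 1, 0, 1, 0, 0]
group        = [0, 0, 0, 1, 1, 1]
print(fairness_aware_score(0.8, preds_biased, group))
print(fairness_aware_score(0.8, preds_fair, group))  # the fairer model scores higher
```

Plugging such a score into the AutoML search loop steers model selection toward fair configurations without hard-coding a single mitigation algorithm.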


Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML

Purucker, Lennart, Schneider, Lennart, Anastacio, Marie, Beel, Joeran, Bischl, Bernd, Hoos, Holger

arXiv.org Artificial Intelligence

Automated machine learning (AutoML) systems commonly ensemble models post hoc to improve predictive performance, typically via greedy ensemble selection (GES). However, we believe that GES may not always be optimal, as it performs a simple deterministic greedy search. In this work, we introduce two novel population-based ensemble selection methods, QO-ES and QDO-ES, and compare them to GES. While QO-ES optimises solely for predictive performance, QDO-ES also considers the diversity of ensembles within the population, maintaining a diverse set of well-performing ensembles during optimisation based on ideas of quality diversity optimisation. The methods are evaluated using 71 classification datasets from the AutoML benchmark, demonstrating that QO-ES and QDO-ES often outrank GES, although the difference is statistically significant only on validation data. Our results further suggest that diversity can be beneficial for post hoc ensembling but also increases the risk of overfitting.
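For context, the GES baseline the paper compares against can be sketched in a few lines: repeatedly add, with replacement, the model whose inclusion most improves the ensemble's validation score. The data here is a toy stand-in:

```python
# Minimal sketch of greedy ensemble selection (GES): deterministic,
# greedy, with replacement, as in Caruana-style ensemble selection.

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def ensemble_predict(member_preds):
    """Majority vote over the (possibly repeated) ensemble members."""
    return [max(set(col), key=col.count) for col in zip(*member_preds)]

def greedy_ensemble_selection(model_preds, labels, rounds=5):
    ensemble = []
    for _ in range(rounds):
        best = max(model_preds,
                   key=lambda p: accuracy(ensemble_predict(ensemble + [p]), labels))
        ensemble.append(best)
    return ensemble

# Three models' predictions on a 4-example validation set.
labels = [1, 0, 1, 1]
models = [[1, 0, 0, 1], [1, 1, 1, 1], [0, 0, 1, 1]]
chosen = greedy_ensemble_selection(models, labels, rounds=3)
print(accuracy(ensemble_predict(chosen), labels))
```

QO-ES and QDO-ES replace this single greedy trajectory with a population of candidate ensembles, which is where diversity can be maintained.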


Auto-Sklearn: How To Boost Performance and Efficiency Through Automated Machine Learning

#artificialintelligence

Many of us are familiar with the challenge of selecting a suitable machine learning model for a specific prediction task, given the vast number of models to choose from. On top of that, we also need to find optimal hyperparameters in order to maximize our model's performance. These challenges can largely be overcome through automated machine learning, or AutoML. I say largely because, despite its name, the process is not fully automated and still requires some manual tweaking and decision-making by the user. Essentially, AutoML frees the user from the daunting and time-consuming tasks of data preprocessing, model selection, hyperparameter optimization, and ensemble building.
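The model selection and hyperparameter search that AutoML takes off the user's hands can be illustrated with a toy exhaustive search; the scoring function below is a stand-in for actually training and cross-validating each candidate (the search space and scores are made up):

```python
from itertools import product

# Toy illustration of the search AutoML automates: try every
# (model, hyperparameter) combination and keep the best validation score.

search_space = {
    "knn": {"n_neighbors": [3, 5, 7]},
    "svm": {"C": [0.1, 1.0, 10.0]},
}

def toy_validation_score(model, params):
    # Stand-in for fit + cross-validate; svm with C=1.0 is the optimum here.
    optima = {"knn": ("n_neighbors", 5, 0.82), "svm": ("C", 1.0, 0.90)}
    key, best_value, peak = optima[model]
    return peak if params[key] == best_value else peak - 0.05

def search(space):
    best_cfg, best_score = None, float("-inf")
    for model, grid in space.items():
        for values in product(*grid.values()):
            params = dict(zip(grid, values))
            score = toy_validation_score(model, params)
            if score > best_score:
                best_cfg, best_score = (model, params), score
    return best_cfg, best_score

print(search(search_space))  # svm with C=1.0 wins
```

Auto-Sklearn performs a far more sophisticated version of this loop (Bayesian optimization over a structured space), but the user-facing effect is the same: the best configuration comes back without manual trial and error.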


AutoEn: An AutoML method based on ensembles of predefined Machine Learning pipelines for supervised Traffic Forecasting

Angarita-Zapata, Juan S., Masegosa, Antonio D., Triguero, Isaac

arXiv.org Artificial Intelligence

Intelligent Transportation Systems produce large volumes of traffic data that are hard to manage, which motivates the use of Machine Learning (ML) for data-driven applications, such as Traffic Forecasting (TF). TF is gaining relevance due to its ability to mitigate traffic congestion by forecasting future traffic states. However, TF poses a major challenge to the ML paradigm, known as the Model Selection Problem (MSP): deciding the most suitable combination of data preprocessing techniques and ML method for traffic data collected under different transportation circumstances. In this context, Automated Machine Learning (AutoML), the automation of the ML workflow from data preprocessing to model validation, arises as a promising strategy to deal with the MSP in problem domains wherein expert ML knowledge is not always an available or affordable asset, such as TF. Various AutoML frameworks have been used to approach the MSP in TF. Most are based on online optimisation processes to search for the best-performing pipeline on a given dataset. This online optimisation could be complemented with meta-learning to warm-start the search phase and/or the construction of ensembles using pipelines derived from the optimisation process. However, given the complexity of the search space and the high computational cost of tuning and evaluating the generated pipelines, online optimisation is only beneficial when there is a long time to obtain the final model. Thus, we introduce AutoEn, a simple and efficient method for automatically generating multi-classifier ensembles from a predefined set of ML pipelines. We compare AutoEn against Auto-WEKA and Auto-sklearn, two AutoML methods commonly used in TF. Experimental results demonstrate that AutoEn can lead to better or more competitive results in the general-purpose domain and in TF.
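The core idea of building an ensemble from a predefined pipeline pool, rather than from an online search, can be sketched as follows. This is an illustrative simplification, not AutoEn's exact selection rule; pipeline names and scores are invented:

```python
# Illustrative sketch: keep the k best pipelines from a predefined pool
# (by validation accuracy) and combine their predictions by majority vote.

def top_k_ensemble(pipeline_scores, k):
    """pipeline_scores: {pipeline_name: validation_accuracy}."""
    ranked = sorted(pipeline_scores, key=pipeline_scores.get, reverse=True)
    return ranked[:k]

def majority_vote(predictions):
    """predictions: list of per-pipeline prediction lists."""
    return [max(set(col), key=col.count) for col in zip(*predictions)]

scores = {"pipe_a": 0.81, "pipe_b": 0.86, "pipe_c": 0.79, "pipe_d": 0.84}
members = top_k_ensemble(scores, k=3)
print(members)  # ['pipe_b', 'pipe_d', 'pipe_a']
```

Because the pool is fixed in advance, the only per-dataset cost is evaluating the predefined pipelines, which is what makes this approach attractive when there is no time budget for a full online search.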


Auto-sklearn: Efficient and Robust Automated Machine Learning

#artificialintelligence

The success of machine learning in a broad range of applications has led to an ever-growing demand for machine learning systems that can be used off the shelf by non-experts. To be effective in practice, such systems need to automatically choose a good algorithm and feature preprocessing steps for a new dataset at hand, and also set their respective hyperparameters. Recent work has started to tackle this automated machine learning (AutoML) problem with the help of efficient Bayesian optimization methods. Building on this, we introduce a robust new AutoML system based on the Python machine learning package scikit-learn (using 15 classifiers, 14 feature preprocessing methods, and 4 data preprocessing methods, giving rise to a structured hypothesis space with 110 hyperparameters). This system, which we dub Auto-sklearn, improves on existing AutoML methods by automatically taking into account past performance on similar datasets, and by constructing ensembles from the models evaluated during the optimization.
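The "past performance on similar datasets" mechanism can be sketched as a nearest-neighbour lookup over dataset meta-features: find the historical datasets closest to the new one and seed the search with configurations that worked well on them. This is a drastic simplification (Auto-sklearn uses many more meta-features and a full configuration database); all values below are illustrative:

```python
# Simplified sketch of meta-learning warm-starting: L1 distance over a
# few dataset meta-features, then seed with the neighbours' best configs.

def nearest_datasets(new_mf, history, k=2):
    """history: {dataset_name: metafeature_vector}."""
    def dist(mf):
        return sum(abs(a - b) for a, b in zip(new_mf, mf))
    return sorted(history, key=lambda d: dist(history[d]))[:k]

history = {
    "iris":   [150, 4, 3],     # n_samples, n_features, n_classes
    "digits": [1797, 64, 10],
    "wine":   [178, 13, 3],
}
best_configs = {"iris": "lda", "digits": "svm_rbf", "wine": "rf_entropy"}

new_dataset = [160, 5, 3]
seeds = [best_configs[d] for d in nearest_datasets(new_dataset, history)]
print(seeds)  # configurations the optimizer evaluates first
```

Evaluating these seeds first gives Bayesian optimization strong incumbents from the start, which is what the warm start buys.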


A Scalable AutoML Approach Based on Graph Neural Networks

Helali, Mossad, Mansour, Essam, Abdelaziz, Ibrahim, Dolby, Julian, Srinivas, Kavitha

arXiv.org Artificial Intelligence

AutoML systems build machine learning models automatically by performing a search over valid data transformations and learners, along with hyper-parameter optimization for each learner. Many AutoML systems use meta-learning to guide the search for optimal pipelines. In this work, we present a novel meta-learning system called KGpip which (1) builds a database of datasets and corresponding pipelines by mining thousands of scripts with program analysis, (2) uses dataset embeddings to find similar datasets in the database based on dataset content rather than metadata-based features, and (3) models AutoML pipeline creation as a graph generation problem, to succinctly characterize the diverse pipelines seen for a single dataset. KGpip's meta-learning is designed as a sub-component for AutoML systems. We demonstrate this by integrating KGpip with two AutoML systems. Our comprehensive evaluation using 126 datasets, including those used by the state-of-the-art systems, shows that KGpip significantly outperforms these systems.
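KGpip's retrieval step, content-embedding similarity instead of hand-crafted meta-features, can be sketched with a cosine-similarity lookup. The embedding vectors below are made up purely for illustration:

```python
import math

# Illustrative sketch of embedding-based dataset retrieval: rank stored
# datasets by cosine similarity of their content embeddings to a query.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda x: math.sqrt(sum(a * a for a in x))
    return dot / (norm(u) * norm(v))

embeddings = {
    "sales_2020": [0.9, 0.1, 0.2],
    "reviews":    [0.1, 0.8, 0.5],
    "sales_2021": [0.6, 0.3, 0.3],
}

query = [0.85, 0.15, 0.15]  # embedding of the new dataset
most_similar = max(embeddings, key=lambda d: cosine(embeddings[d], query))
print(most_similar)  # pipelines mined for this dataset seed the search
```

In KGpip, the pipelines mined for the retrieved dataset then condition the graph-generation model that proposes pipelines for the new one.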


10 Best AutoML Tools Used in Data Science Projects for 2022

#artificialintelligence

Automatic Machine Learning (AutoML), also known as AutoML services or tools, allows data scientists, machine learning engineers, and non-technical users to create scalable machine-learning models. Here's a list of the top 10 AutoML tools used in data science projects in 2022. AutoML tools automate this process by automatically analyzing the data and selecting candidate models based on insights gained from that analysis. These models are built, tested, and refined using a subset of the available data. Finally, the best-performing models are presented to the user. AutoML tools let users trade off model complexity against performance.


Auto-Sklearn: Accelerate your machine learning models with AutoML

#artificialintelligence

AutoML is a relatively new and fast-growing subfield of machine learning. The main approach in AutoML is to limit the involvement of data scientists and let the tool handle the time-consuming steps of machine learning, such as data preprocessing, algorithm selection, and hyperparameter tuning, thereby saving the time needed to set up ML models and speeding up their deployment. Several AutoML tools are available on the market these days. In one of my previous blogathon articles, I shared a comprehensive guide to AutoML with an easy AutoGluon example; that guide included a list of several AutoML tools currently available on the market.