Vanschoren, Joaquin
Online AutoML: An adaptive AutoML framework for online learning
Celik, Bilge, Singh, Prabhant, Vanschoren, Joaquin
Automated Machine Learning (AutoML) has been used successfully in settings where the learning task is assumed to be static. In many real-world scenarios, however, the data distribution will evolve over time, and it is yet to be shown whether AutoML techniques can effectively design online pipelines in dynamic environments. This study aims to automate pipeline design for online learning while continuously adapting to data drift. For this purpose, we design an adaptive Online Automated Machine Learning (OAML) system, searching the complete pipeline configuration space of online learners, including preprocessing algorithms and ensembling techniques. This system combines the inherent adaptation capabilities of online learners with the fast automated pipeline (re)optimization capabilities of AutoML. Focusing on optimization techniques that can adapt to evolving objectives, we evaluate asynchronous genetic programming and asynchronous successive halving to optimize these pipelines continually. We experiment on real and artificial data streams with varying types of concept drift to test the performance and adaptation capabilities of the proposed system. The results confirm the utility of OAML over popular online learning algorithms and underscore the benefits of continuous pipeline redesign in the presence of data drift.
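To make the idea concrete, here is a minimal sketch of drift-triggered pipeline (re)optimization on a data stream. This is not the OAML implementation itself (which searches a full pipeline space with asynchronous genetic programming and asynchronous successive halving); it is a toy loop, assuming a simple accuracy-drop drift signal and a tiny stand-in configuration space of incremental learners.

```python
# Toy sketch of drift-triggered online (re)optimization, NOT the OAML system:
# an incremental learner is used test-then-train; when accuracy drops sharply,
# a small configuration space is re-searched on a recent window of the stream.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def stream(n=3000, drift_at=1500):
    """Toy binary stream whose concept flips at `drift_at` (sudden drift)."""
    for t in range(n):
        x = rng.normal(size=2)
        y = int(x[0] + x[1] > 0) if t < drift_at else int(x[0] - x[1] > 0)
        yield x, y

def candidates():
    """A tiny stand-in for the pipeline search space: a few configurations."""
    return [SGDClassifier(loss="log_loss", alpha=a, random_state=0)
            for a in (1e-5, 1e-3, 1e-1)]

classes = np.array([0, 1])
model = candidates()[0]
model.partial_fit(np.zeros((1, 2)), [0], classes=classes)  # prime the learner
window_X, window_y, correct = [], [], []

for x, y in stream():
    correct.append(int(model.predict([x])[0] == y))        # test-then-train
    model.partial_fit([x], [y])
    window_X.append(x); window_y.append(y)
    window_X, window_y = window_X[-300:], window_y[-300:]
    # crude drift signal: recent accuracy drops sharply vs. the previous window
    if (len(correct) >= 200
            and np.mean(correct[-100:]) < np.mean(correct[-200:-100]) - 0.2):
        best, best_acc = model, -1.0
        for cand in candidates():            # re-search the configuration space
            cand.partial_fit(window_X[:150], window_y[:150], classes=classes)
            acc = cand.score(window_X[150:], window_y[150:])
            if acc > best_acc:
                best, best_acc = cand, acc
        model, correct = best, []
```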
Automated Reinforcement Learning: An Overview
Afshar, Reza Refaei, Zhang, Yingqian, Vanschoren, Joaquin, Kaymak, Uzay
Reinforcement Learning (RL) and, more recently, Deep Reinforcement Learning are popular methods for solving sequential decision-making problems modeled as Markov Decision Processes (MDPs). Modeling a problem for RL and selecting algorithms and hyper-parameters require careful consideration, as different configurations can yield completely different performance. These considerations are mainly the task of RL experts; however, RL is progressively becoming popular in other fields where researchers and system designers are not RL experts. Moreover, many modeling decisions, such as defining the state and action spaces, the batch size and batch update frequency, and the number of timesteps, are typically made manually. For these reasons, automating the different components of the RL framework is of great importance, and it has attracted much attention in recent years. Automated RL provides a framework in which the different components of RL, including MDP modeling, algorithm selection, and hyper-parameter optimization, are modeled and defined automatically. In this article, we explore the literature and present recent work that can be used in automated RL. Moreover, we discuss the challenges, open questions, and research directions in AutoRL.
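As a small, self-contained illustration of one AutoRL component (hyper-parameter optimization), the sketch below runs random search over the learning rate and exploration rate of epsilon-greedy value learning on a toy 3-armed bandit. This example is illustrative only and not taken from the paper; full AutoRL additionally automates MDP modeling and algorithm selection.

```python
# Illustrative sketch: random search over RL hyper-parameters on a toy bandit.
import random

ARM_MEANS = [0.2, 0.5, 0.8]          # true reward probabilities of each arm

def run_bandit(lr, eps, steps=2000, seed=0):
    """Evaluate one (lr, eps) configuration by its average reward."""
    rng = random.Random(seed)
    q = [0.0, 0.0, 0.0]
    total = 0.0
    for _ in range(steps):
        a = rng.randrange(3) if rng.random() < eps else max(range(3), key=q.__getitem__)
        r = 1.0 if rng.random() < ARM_MEANS[a] else 0.0
        q[a] += lr * (r - q[a])      # incremental action-value update
        total += r
    return total / steps

rng = random.Random(42)
best = None
for _ in range(20):                  # random search over the config space
    cfg = {"lr": 10 ** rng.uniform(-3, 0), "eps": rng.uniform(0.01, 0.5)}
    score = run_bandit(**cfg)
    if best is None or score > best[0]:
        best = (score, cfg)
print("best avg reward %.3f with %s" % best)
```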
From Strings to Data Science: a Practical Framework for Automated String Handling
van Lith, John W., Vanschoren, Joaquin
Many machine learning libraries require that string features be converted to a numerical representation for the models to work as intended. Categorical string features can represent a wide variety of data (e.g., zip codes, names, marital status), and are notoriously difficult to preprocess automatically. In this paper, we propose a framework to do so based on best practices, domain knowledge, and novel techniques. It automatically identifies different types of string features, processes them accordingly, and encodes them into numerical representations. We also provide an open source Python implementation to automatically preprocess categorical string data in tabular datasets and demonstrate promising results on a wide range of datasets.
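The following sketch conveys the core idea: infer a "type" for each string column and choose an encoding accordingly. It is a hypothetical simplification, not the paper's open-source implementation; the `infer_string_type` and `encode` helpers and their heuristics (5-digit zip codes, a cardinality threshold of 10) are assumptions made for illustration.

```python
# Minimal sketch of type-aware string preprocessing (illustrative only).
import re
import pandas as pd

def infer_string_type(series: pd.Series) -> str:
    vals = series.dropna().astype(str)
    if vals.str.fullmatch(r"\d{5}").all():
        return "zipcode"                 # numeric-looking code
    if vals.nunique() <= 10:
        return "low_cardinality"         # e.g., marital status
    return "high_cardinality"            # e.g., names

def encode(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    for col in out.select_dtypes(include="object"):
        kind = infer_string_type(out[col])
        if kind == "zipcode":
            out[col] = pd.to_numeric(out[col], errors="coerce")
        elif kind == "low_cardinality":
            out = pd.get_dummies(out, columns=[col])       # one-hot encode
        else:
            # frequency encoding as a simple high-cardinality fallback
            out[col] = out[col].map(out[col].value_counts(normalize=True))
    return out

df = pd.DataFrame({"zip": ["90210", "10001"], "status": ["married", "single"]})
print(encode(df))
```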
Meta-Learning for Symbolic Hyperparameter Defaults
Gijsbers, Pieter, Pfisterer, Florian, van Rijn, Jan N., Bischl, Bernd, Vanschoren, Joaquin
Hyperparameter optimization in machine learning (ML) deals with the problem of empirically learning an optimal algorithm configuration from data, usually formulated as a black-box optimization problem. In this work, we propose a zero-shot method to meta-learn symbolic default hyperparameter configurations that are expressed in terms of the properties of the dataset. This enables a much faster, but still data-dependent, configuration of the ML algorithm, compared to standard hyperparameter optimization approaches. In the past, symbolic and static default values have usually been obtained as hand-crafted heuristics. We propose an approach of learning such symbolic configurations as formulas of dataset properties from a large set of prior evaluations on multiple datasets by optimizing over a grammar of expressions using an evolutionary algorithm. We evaluate our method on surrogate empirical performance models as well as on real data across 6 ML algorithms on more than 100 datasets and demonstrate that our method indeed finds viable symbolic defaults.
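To see what a symbolic default looks like in practice, the sketch below compares a few candidate formulas for RandomForest's max_features, each expressed as a function of the number of features p. This only evaluates fixed candidates on a handful of datasets; the paper's method instead evolves such formulas over a grammar of expressions using prior evaluations on many datasets.

```python
# Toy comparison of candidate symbolic defaults for max_features = f(p).
import math
from sklearn.datasets import load_iris, load_wine, load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# a tiny "grammar" of candidate symbolic defaults
CANDIDATES = {
    "sqrt(p)": lambda p: max(1, int(math.sqrt(p))),
    "log2(p)": lambda p: max(1, int(math.log2(p))),
    "p // 2":  lambda p: max(1, p // 2),
}

datasets = [load_iris(), load_wine(), load_breast_cancer()]
for name, formula in CANDIDATES.items():
    scores = []
    for data in datasets:
        p = data.data.shape[1]
        clf = RandomForestClassifier(
            n_estimators=100, max_features=formula(p), random_state=0)
        scores.append(cross_val_score(clf, data.data, data.target, cv=3).mean())
    print(f"{name:8s} mean accuracy: {sum(scores) / len(scores):.3f}")
```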
Hyperboost: Hyperparameter Optimization by Gradient Boosting surrogate models
van Hoof, Jeroen, Vanschoren, Joaquin
Bayesian Optimization is a popular tool for tuning algorithms in automatic machine learning (AutoML) systems. Current state-of-the-art methods leverage Random Forests or Gaussian processes to build a surrogate model that predicts algorithm performance given a certain set of hyperparameter settings. In this paper, we propose a new surrogate model based on gradient boosting, where we use quantile regression to provide optimistic estimates of the performance of an unobserved hyperparameter setting, and combine this with a distance metric between unobserved and observed hyperparameter settings to help regulate exploration. We demonstrate empirically that the new method is able to outperform some state-of-the-art techniques across a reasonably sized set of classification problems.
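A simplified sketch of the two ingredients, using only standard scikit-learn: a gradient-boosting surrogate trained with a quantile loss as the optimistic estimate, plus a distance-to-observations term for exploration. The toy 1-d objective and the 0.5 exploration weight are assumptions for illustration, not values from the paper.

```python
# Sketch: quantile gradient-boosting surrogate + distance-based exploration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def objective(x):                        # toy 1-d "validation loss"
    return np.sin(3 * x) + 0.1 * rng.normal()

X = rng.uniform(0, 3, size=(8, 1))       # observed hyperparameter settings
y = np.array([objective(x[0]) for x in X])

# optimistic surrogate: a low quantile of the predicted loss
surrogate = GradientBoostingRegressor(loss="quantile", alpha=0.1, n_estimators=100)
surrogate.fit(X, y)

cands = np.linspace(0, 3, 200).reshape(-1, 1)
optimistic = surrogate.predict(cands)
dist = np.min(np.abs(cands - X.T), axis=1)   # distance to nearest observation
score = optimistic - 0.5 * dist              # lower is better; reward distance
print("next configuration to evaluate:", cands[np.argmin(score)])
```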
Theory-based Habit Modeling for Enhancing Behavior Prediction
Zhang, Chao, Vanschoren, Joaquin, van Wissen, Arlette, Lakens, Daniel, de Ruyter, Boris, IJsselsteijn, Wijnand A.
Psychological theories of habit posit that when a strong habit is formed through behavioral repetition, it can trigger behavior automatically in the same environment. Given the reciprocal relationship between habit and behavior, changing lifestyle behaviors (e.g., toothbrushing) is largely a task of breaking old habits and creating new and healthy ones. Thus, representing users' habit strengths can be very useful for behavior change support systems (BCSS), for example, to predict behavior or to decide when an intervention reaches its intended effect. However, habit strength is not directly observable and existing self-report measures are taxing for users. In this paper, building on recent computational models of habit formation, we propose a method to enable intelligent systems to compute habit strength based on observable behavior. The hypothesized advantage of using computed habit strength for behavior prediction was tested using data from two intervention studies, where we trained participants to brush their teeth twice a day for three weeks and monitored their behaviors using accelerometers. Through hierarchical cross-validation, we found that for the task of predicting future brushing behavior, computed habit strength clearly outperformed self-reported habit strength (in both studies) and was also superior to models based on past behavior frequency (in the larger second study). Our findings provide initial support for our theory-based approach of modeling user habits and encourage the use of habit computation to deliver personalized and adaptive interventions.
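A minimal sketch of the general idea, assuming a simple exponential habit-formation model (the paper builds on more detailed computational models, and the gain and decay rates here are arbitrary): habit strength accumulates with each repetition and decays when the behavior is skipped, and the computed value can then serve as a feature for predicting the next behavior.

```python
# Illustrative exponential habit model (assumed form, not the paper's exact model).
def update_habit(h, performed, gain=0.1, decay=0.05):
    """One observation's update of habit strength h in [0, 1]."""
    return h + gain * (1 - h) if performed else h * (1 - decay)

# three weeks of (toy) twice-daily brushing observations: 1 = brushed
observations = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1] * 3

h = 0.0
trajectory = []
for b in observations:
    h = update_habit(h, b)
    trajectory.append(h)   # habit strength over time, usable as a predictor

print(f"final habit strength: {h:.2f}")
```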
Importance of Tuning Hyperparameters of Machine Learning Algorithms
Weerts, Hilde J. P., Mueller, Andreas C., Vanschoren, Joaquin
The performance of many machine learning algorithms depends on their hyperparameter settings. The goal of this study is to determine whether it is important to tune a hyperparameter or whether it can be safely set to a default value. We present a methodology to determine the importance of tuning a hyperparameter based on a non-inferiority test and tuning risk: the performance loss that is incurred when a hyperparameter is not tuned, but set to a default value. Because our methods require the notion of a default parameter, we present a simple procedure that can be used to determine reasonable default parameters. We apply our methods in a benchmark study using 59 datasets from OpenML. Our results show that leaving particular hyperparameters at their default value is non-inferior to tuning these hyperparameters. In some cases, leaving the hyperparameter at its default value even outperforms tuning it using a search procedure with a limited number of iterations.
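The sketch below conveys the tuning-risk idea in miniature: estimate the performance lost by fixing one hyperparameter to a default instead of tuning it, per dataset. It is a deliberate simplification of the paper's methodology (no non-inferiority test, and the tuned score comes from the inner search's cross-validation, which is slightly optimistic); the SVC C parameter and grid are illustrative choices.

```python
# Sketch: estimate the tuning risk of one hyperparameter per dataset.
from sklearn.datasets import load_iris, load_wine
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

param_grid = {"C": [0.01, 0.1, 1, 10, 100]}   # hyperparameter under study

for data in (load_iris(), load_wine()):
    # performance with the default value (C=1.0), no tuning
    default = cross_val_score(SVC(C=1.0), data.data, data.target, cv=3).mean()
    # performance after tuning C with a small grid search
    search = GridSearchCV(SVC(), param_grid, cv=3)
    search.fit(data.data, data.target)
    tuned = search.best_score_
    # tuning risk: the loss incurred by not tuning C
    print(f"tuning risk for C: {tuned - default:.4f}")
```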
GAMA: a General Automated Machine learning Assistant
Gijsbers, Pieter, Vanschoren, Joaquin
The General Automated Machine learning Assistant (GAMA) is a modular AutoML system developed to empower users to track and control how AutoML algorithms search for optimal machine learning pipelines, and facilitate AutoML research itself. In contrast to current, often black-box systems, GAMA allows users to plug in different AutoML and post-processing techniques, logs and visualizes the search process, and supports easy benchmarking. It currently features three AutoML search algorithms, two model post-processing steps, and is designed to allow for more components to be added.
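GAMA exposes a scikit-learn-style interface; a basic usage sketch follows, based on the project's documented API (argument names such as store may evolve between versions, so check the current documentation).

```python
# Basic GAMA usage via its scikit-learn-style interface.
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from gama import GamaClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# search for a pipeline for up to 3 minutes of wall-clock time
automl = GamaClassifier(max_total_time=180, store="nothing")
automl.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, automl.predict(X_test)))
```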
Adaptation Strategies for Automated Machine Learning on Evolving Data
Celik, Bilge, Vanschoren, Joaquin
Automated Machine Learning (AutoML) systems have been shown to efficiently build good models for new datasets. However, it is often not clear how well they can adapt when the data evolves over time. The main goal of this study is to understand the effect of data stream challenges such as concept drift on the performance of AutoML methods, and which adaptation strategies can be employed to make them more robust. To that end, we propose six concept drift adaptation strategies and evaluate their effectiveness on a variety of AutoML approaches for building machine learning pipelines, including those that leverage Bayesian optimization, genetic programming, and random search with automated stacking. These are evaluated empirically on real-world and synthetic data streams with different types of concept drift. Based on this analysis, we propose ways to develop more sophisticated and robust AutoML techniques.
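For intuition, the sketch below contrasts three adaptation strategies in spirit on a toy batched stream with a sudden concept flip: never retrain, retrain on every batch, and retrain only when a drift signal fires. This is a hypothetical simplification (a decision tree stands in for an AutoML-found model, and the 0.7 accuracy threshold is an assumed drift trigger), not the paper's six strategies or code.

```python
# Toy comparison of adaptation strategies on a drifting batched stream.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

def make_batch(t, n=200):
    """Toy batches whose concept flips halfway through the stream."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] > 0) if t < 5 else (X[:, 1] > 0)
    return X, y.astype(int)

def run(strategy):
    model, accs = None, []
    for t in range(10):
        X, y = make_batch(t)
        if model is None:
            retrain = True
        else:
            acc = model.score(X, y)            # prequential: test, then adapt
            accs.append(acc)
            retrain = (strategy == "always"
                       or (strategy == "on_drift" and acc < 0.7))
        if retrain:
            model = DecisionTreeClassifier(random_state=0).fit(X, y)
    return np.mean(accs)

for s in ("never", "always", "on_drift"):
    print(f"{s:8s} mean accuracy: {run(s):.3f}")
```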
An Open Source AutoML Benchmark
Gijsbers, Pieter, LeDell, Erin, Thomas, Janek, Poirier, Sébastien, Bischl, Bernd, Vanschoren, Joaquin
In recent years, an active field of research has developed around automated machine learning (AutoML). Unfortunately, comparing different AutoML systems is hard and often done incorrectly. We introduce an open, ongoing, and extensible benchmark framework which follows best practices and avoids common mistakes. The framework is open-source, uses public datasets and has a website with up-to-date results. We use the framework to conduct a thorough comparison of 4 AutoML systems across 39 datasets and analyze the results.
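A rough sketch of the benchmarking protocol only, not the framework's actual interface (the open-source framework, openml/automlbenchmark on GitHub, runs each AutoML system under fixed folds and time budgets in isolated environments); here two simple scikit-learn baselines stand in for AutoML systems on OpenML datasets.

```python
# Illustrative benchmark loop: several systems x several OpenML datasets.
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

systems = {
    "constant_predictor": DummyClassifier(strategy="most_frequent"),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
datasets = ["diabetes", "phoneme"]        # OpenML dataset names (numeric features)

rows = []
for name in datasets:
    data = fetch_openml(name, version=1, as_frame=False)
    # (the real benchmark uses fixed train/test folds and per-run time budgets)
    for sys_name, model in systems.items():
        score = cross_val_score(model, data.data, data.target, cv=3).mean()
        rows.append({"dataset": name, "system": sys_name, "accuracy": score})

print(pd.DataFrame(rows).pivot(index="dataset", columns="system", values="accuracy"))
```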