AITopics | tidymodel

fastml: Guarded Resampling Workflows for Safer Automated Machine Learning in R

Korkmaz, Selcuk; Goksuluk, Dincer; Karaismailoglu, Eda

arXiv.org Machine Learning | Apr-14-2026

Preprocessing leakage arises when scaling, imputation, or other data-dependent transformations are estimated before resampling, inflating apparent performance while remaining hard to detect. We present fastml, an R package that provides a single-call interface for leakage-aware machine learning through guarded resampling, where preprocessing is re-estimated inside each resample and applied to the corresponding assessment data. The package supports grouped and time-ordered resampling, blocks high-risk configurations, audits recipes for external dependencies, and includes sandboxed execution and integrated model explanation. We evaluate fastml with a Monte Carlo simulation contrasting global and fold-local normalization, a usability comparison with tidymodels under matched specifications, and survival benchmarks across datasets of different sizes. The simulation demonstrates that global preprocessing substantially inflates apparent performance relative to guarded resampling. fastml matched held-out performance obtained with tidymodels while reducing workflow orchestration, and it supported consistent benchmarking of multiple survival model classes through a unified interface.
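The contrast between global and fold-local normalization described in the abstract can be sketched in a few lines. This is a minimal illustration in plain Python, not fastml's implementation or API; the function names (`k_fold_indices`, `fit_scaler`, `apply_scaler`, `guarded_folds`) and the toy data are assumptions made for the example.

```python
import statistics

def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds; yield (train_idx, test_idx)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def fit_scaler(values):
    """Estimate centering/scaling parameters from the given values only."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values) or 1.0  # guard against zero variance
    return mean, sd

def apply_scaler(values, params):
    """Standardize values with previously estimated parameters."""
    mean, sd = params
    return [(v - mean) / sd for v in values]

def guarded_folds(x, k):
    """Re-estimate the scaler inside each fold (fold-local preprocessing)."""
    out = []
    for train_idx, test_idx in k_fold_indices(len(x), k):
        # Parameters come from the analysis (training) rows of this fold only.
        params = fit_scaler([x[i] for i in train_idx])
        train_scaled = apply_scaler([x[i] for i in train_idx], params)
        # The assessment rows are transformed with the training parameters.
        test_scaled = apply_scaler([x[i] for i in test_idx], params)
        out.append((params, train_scaled, test_scaled))
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]   # toy data, assumed for illustration
global_params = fit_scaler(x)           # leaky variant: sees every row
folds = guarded_folds(x, k=3)           # guarded variant: one scaler per fold
```

In the leaky variant the held-out rows contribute to the mean and standard deviation used to scale them; in the guarded variant each fold's parameters differ because they are estimated without that fold's assessment rows.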

artificial intelligence, fastml, machine learning, (19 more...)

2604.05225

Country:

Europe > Netherlands > South Holland > Rotterdam (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)

EZtune: A Package for Automated Hyperparameter Tuning in R

Lundell, Jill

arXiv.org Artificial Intelligence | Mar-2-2023

Statistical learning models have been growing in popularity in recent years. Many of these models have hyperparameters that must be tuned for models to perform well. Tuning these parameters is not trivial. EZtune is an R package with a simple user interface that can tune support vector machines, adaboost, gradient boosting machines, and elastic net. We first provide a brief summary of the models that EZtune can tune, including a discussion of each of their hyperparameters. We then compare the ease of using EZtune, caret, and tidymodels. This is followed with a comparison of the accuracy and computation times for models tuned with EZtune and tidymodels. We conclude with a demonstration of how EZtune can be used to help select a final model with optimal predictive power. Our comparison shows that EZtune can tune support vector machines and gradient boosting machines. EZtune also provides a user interface that is easy to use for a novice to statistical learning models or R.

artificial intelligence, eztune, machine learning, (17 more...)

2303.12177

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Utah (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Optimizing Machine Learning Workflows with Tidymodels

#artificialintelligence | Jan-5-2023, 17:25:25 GMT

Tidymodels is a package that is designed to streamline machine learning workflows in R. It consists of a suite of packages that can be used to pre-process data, build and tune machine learning models, and evaluate their performance. Tidymodels is particularly useful for those who are new to machine learning, as it provides an easy-to-use interface for building and evaluating models, and it can be used to quickly iterate through different model architectures and parameters. To get started with tidymodels, you will need to install the package and its dependencies. Once you have installed tidymodels, you can start using it in your machine learning workflows. Before building a machine learning model, it is often necessary to pre-process the data to ensure that it is in a suitable format.

artificial intelligence, machine learning, tidymodel, (4 more...)

Genre: Workflow (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Introducing random forests in R

#artificialintelligence | Jul-25-2022, 11:21:15 GMT

In this post, I will present how to use random forests in classification, a prediction technique that consists of generating a set of trees (hence, a forest) while bootstrapping the features used in each tree. We do this to obtain trees that do not necessarily use the strongest predictors at the beginning. I will test this technique on a LoanDefaults dataset to predict which customers will default on a loan payment in a specific month. This dataset has two interesting features: the number of positive cases is much smaller than the number of negatives, and the existing features require some preprocessing. I will be using the ranger (RANdom forest GEneRator) package, skimr to get a summary of data, rpart and rpart.plot to generate an alternative decision tree model, BAdatasets to access the dataset, tidymodels for prediction workflow facilities and forcats for the variable importance plot.
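The sampling idea in this summary can be sketched as follows. This is a simplified illustration in plain Python, not the ranger API: it draws one bootstrap sample of rows and one random feature subset per tree, following the post's description, whereas ranger and most random forest implementations draw the candidate features at each split. The function name `draw_tree_inputs`, the `mtry` argument name as used here, and the column names are assumptions made for the example.

```python
import random

def draw_tree_inputs(n_rows, features, mtry, n_trees, seed=0):
    """For each tree, draw a bootstrap sample of rows and a random feature subset."""
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    plans = []
    for _ in range(n_trees):
        # Rows are sampled with replacement (bootstrap).
        rows = [rng.randrange(n_rows) for _ in range(n_rows)]
        # Features are sampled without replacement; mtry of them per tree.
        feats = rng.sample(features, mtry)
        plans.append({"rows": rows, "features": feats})
    return plans

features = ["income", "age", "balance", "tenure"]  # hypothetical column names
plans = draw_tree_inputs(n_rows=10, features=features, mtry=2, n_trees=5)
```

Because each tree sees only a subset of the features, a single dominant predictor cannot appear at the top of every tree, which is the motivation the summary gives.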

decision tree, prediction, random forest, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)

Tidy Time Series Forecasting in R with Spark

#artificialintelligence | Oct-20-2021, 15:24:54 GMT

I'm SUPER EXCITED to show fellow time-series enthusiasts a new way that we can scale time series analysis using an amazing technology called Spark! Without Spark, large-scale forecasting projects of 10,000 time series can take days to run because of long-running for-loops and the need to test many models on each time series. Spark has been widely accepted as a "big data" solution, and we'll use it to scale-out (distribute) our time series analysis to Spark Clusters, and run our analysis in parallel. Spark is an amazing technology for processing large-scale data science workloads. Modeltime is a state-of-the-art forecasting library that I personally developed for "Tidy Forecasting" in R. Modeltime now integrates a Spark Backend with capability of forecasting 10,000 time series using distributed Spark Clusters.

forecasting, modeltime, time series, (10 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.58)

Introducing Modeltime: Tidy Time Series Forecasting using Tidymodels

#artificialintelligence | Jun-17-2021, 12:55:45 GMT

I'm beyond excited to introduce modeltime, a new time series forecasting package designed to speed up model evaluation, selection, and forecasting. Follow the updated modeltime article to get started with modeltime. If you like what you see, I have an Advanced Time Series Course where you will become the time-series expert for your organization by learning modeltime and timetk. This article is part of a series of software announcements on the Modeltime Forecasting Ecosystem. Register to stay in the know on new cutting-edge R software like modeltime.

forecasting, modeltime, time series forecasting, (12 more...)

Country: North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.11)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

How to Use Catboost with Tidymodels

#artificialintelligence | Aug-30-2020, 07:41:05 GMT

So you want to compete in a Kaggle competition with R and you want to use tidymodels. In this how-to I show how you can use CatBoost with tidymodels. I give very terse descriptions of what the steps do, because I believe you read this post for implementation, not background on how the elements work. This tutorial is extremely similar to my previous post about using lightGBM with tidymodels. Tidymodels is a unified machine learning framework that uses sane defaults, keeps model definitions and implementation separate, and allows you to easily swap models or change parts of the processing.

artificial intelligence, machine learning, tidymodel, (15 more...)

Country: North America > United States > Iowa (0.16)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Tidymodels

#artificialintelligence | Apr-25-2020, 07:27:54 GMT

The tidymodels framework is a collection of packages for modeling and machine learning using tidyverse principles.

tidymodel

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.74)

Tidymodels: tidy machine learning in R

#artificialintelligence | Apr-14-2020, 23:04:50 GMT

Over the past few years, tidymodels has been gradually emerging as the tidyverse's machine learning toolkit. Well, it turns out that R has a consistency problem. Since everything was made by different people and using different principles, everything has a slightly different interface, and trying to keep everything in line can be frustrating. Several years ago, Max Kuhn (formerly at Pfizer, now at RStudio) developed the caret R package (see my caret tutorial) aimed at creating a uniform interface for the massive variety of machine learning models that exist in R. Caret was great in a lot of ways, but also limited in others. In my own use, I found it to be quite slow whenever I tried to use it on problems of any kind of modest size.

recipe, tidymodel, workflow, (14 more...)

Country: North America > United States > Arizona (0.04)

Industry: Health & Medicine (0.52)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Customer Churn Modeling using Machine Learning with parsnip

#artificialintelligence | Dec-5-2019, 19:09:30 GMT

This article comes from Diego Usai, a student at Business Science University. Diego has completed both the 101 (Data Science Foundations) and 201 (Advanced Machine Learning & Business Consulting) courses. Diego shows off his progress in this Customer Churn Tutorial using Machine Learning with parsnip. Diego originally posted the article on his personal website, diegousai.io. Recently I have completed the online course Business Analysis With R, focused on applied data and business science with R, which introduced me to a couple of new modelling concepts and approaches.

customer, machine learning, parsnip, (13 more...)

Genre: Instructional Material (0.47)

Industry:

Education > Educational Setting > Online (0.75)
Education > Educational Technology > Educational Software > Computer Based Training (0.35)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)
