AITopics | caret

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learn tidymodels with my supervised machine learning course

#artificialintelligence Apr-10-2023, 02:00:32 GMT

Today I am happy to announce that a new tidymodels-centric version of my free, online, interactive course, Supervised Machine Learning: Case Studies in R, has been published! This is at least the third version of this course I've built, but I believe it to be the best, both in how it communicates machine learning concepts and in how useful the demonstrated code will be for your real-world problems. As with the last time I launched this course, it provides four case studies using real-world data for you to practice your predictive modeling skills. One question we sometimes field from R users is about choosing between tidymodels and caret. The original version of my course mostly used caret, and caret is a stable and broadly used framework for modeling and machine learning in R.

learn tidymodel, original version, supervised machine, (2 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.59)

Industry: Education > Educational Setting > Online (0.95)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

TSML (Time Series Machine Learning)

Palmes, Paulito, Ploennigs, Joern, Brady, Niall

arXiv.org Machine Learning May-27-2020

Over the past years, the industrial sector has seen many innovations brought about by automation. Inherent in this automation is the installation of sensor networks for status monitoring and data collection. One of the major challenges in these data-rich environments is how to extract and exploit information from these large volumes of data to detect anomalies, discover patterns that reduce downtime and manufacturing errors, reduce energy usage, predict faults and failures, plan effective maintenance schedules, etc. To address these issues, we developed TSML. Its technology is based on using a pipeline of lightweight filters as building blocks to process huge amounts of industrial time series data in parallel.
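The "pipeline of lightweight filters" idea in the abstract can be sketched in a few lines. This is only an illustration in Python, not TSML's actual API: the filter names `fill_missing` and `clip`, the `pipeline` helper, and the sample readings are all hypothetical, and the sketch runs sequentially rather than in parallel.

```python
from functools import reduce


def fill_missing(series):
    """Replace None with the previous valid reading (first value must be present)."""
    filled = []
    for value in series:
        filled.append(filled[-1] if value is None else value)
    return filled


def clip(low, high):
    """Build a filter that limits readings to the range [low, high]."""
    return lambda series: [min(max(v, low), high) for v in series]


def pipeline(*filters):
    """Compose filters left to right into a single filter."""
    return lambda series: reduce(lambda data, f: f(data), filters, series)


# Made-up sensor readings: one gap (None) and one implausible spike.
readings = [20.0, None, 21.5, 500.0, 22.0]
clean = pipeline(fill_missing, clip(0.0, 100.0))(readings)
```

Because every filter takes a series and returns a series, filters can be reordered or swapped without touching the others, which is the property that makes them usable as building blocks.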

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

2005.13191

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.66)

Automatic Machine Learning Derived from Scholarly Big Data

Greenstein-Messica, Asnat, Vainshtein, Roman, Katz, Gilad, Shapira, Bracha, Rokach, Lior

arXiv.org Machine Learning Mar-6-2020

One of the challenging aspects of applying machine learning is the need to identify the algorithms that will perform best for a given dataset. This process can be difficult and time-consuming, and it often requires a great deal of domain knowledge. We present Sommelier, an expert system for recommending the machine learning algorithms that should be applied to a previously unseen dataset. Sommelier is based on word embedding representations of the domain knowledge extracted from a large corpus of academic publications. When presented with a new dataset and its problem description, Sommelier leverages a recommendation model trained on the word embedding representation to provide a ranked list of the most relevant algorithms to be used on the dataset. We demonstrate Sommelier's effectiveness by conducting an extensive evaluation on 121 publicly available datasets and 53 classification algorithms. The top algorithms recommended for each dataset by Sommelier were able to achieve on average 97.7% of the optimal accuracy of all surveyed algorithms.

algorithm, dataset, weka, (15 more...)

arXiv.org Machine Learning

2003.0347

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.67)
(2 more...)

Walkthrough of the dummyVars function from the {caret} package: Machine Learning with R

#artificialintelligence Dec-7-2019, 07:07:45 GMT

Walkthrough of the dummyVars function from the {caret} package: Machine Learning with R. More: sign up for my newsletter and more at http://www.viralml.com. My books on Amazon: "The Little Book of Fundamental Indicators: Hands-On Market Analysis with Python: Find Your Market Bearings with Python, Jupyter Notebooks, and Freely Available Data" (https://amzn.to/2DERG3d) and "Create Income Streams with Online Classes: Design Classes That Generate Long-Term Revenue" (https://amzn.to/2VToEHK).

dummyvar function, machine learning, walkthrough, (5 more...)

#artificialintelligence

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.72)
Information Technology > Communications > Social Media (0.50)

Caret Package - A Practical Guide to Machine Learning in R

#artificialintelligence Mar-13-2018, 23:15:40 GMT

The caret package is a comprehensive framework for building machine learning models in R. In this tutorial, I explain nearly all the core features of the caret package and walk you through the step-by-step process of building predictive models. Be it a decision tree or xgboost, caret helps to find the optimal model in the shortest possible time. Caret integrates all the activities associated with model development into a streamlined workflow for nearly every major ML algorithm available in R. We will not stop with the caret package, either: we will go a step further and see how to ensemble predictions from multiple best models, and possibly produce an even better prediction, using caretEnsemble. Caret is short for Classification And REgression Training. With R having so many implementations of machine learning algorithms spread across packages, it can be challenging to keep track of which algorithm resides in which package. The syntax and the way to implement an algorithm sometimes differ across packages; add preprocessing and looking up the help page for the hyperparameters (parameters that define how the algorithm learns), and building predictive models can become an involved task. Thanks to caret, no matter which package the algorithm resides in, caret will remember that for you and may just prompt you to run install.packages. Later in this tutorial I will show how to see all the available ML algorithms supported by caret (it's a long list!) and what hyperparameters can be tuned.

algorithm, artificial intelligence, machine learning, (17 more...)

#artificialintelligence

Genre: Workflow (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)

Machine Learning with R: An Irresponsibly Fast Tutorial

#artificialintelligence Jun-20-2017, 14:21:24 GMT

As I said in Becoming a data hacker, R is an awesome programming language for data analysts, especially for people just getting started. In this post, I will give you a super quick, very practical, theory-free, hands-on intro to writing a simple classification model in R, using the caret package. If you want to skip the tutorial, you can find the R code here. Quick note: if the code examples look weird to you on mobile, try them on a desktop (you can't do the tutorial on your phone, anyway!). One of the biggest barriers to learning for budding data scientists is that there are so many different R packages for machine learning.

artificial intelligence, machine learning, prediction, (14 more...)

#artificialintelligence

Genre: Instructional Material (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Encoding categorical variables: one-hot and beyond

#artificialintelligence Apr-15-2017, 19:30:21 GMT

R has "one-hot" encoding hidden in most of its modeling paths. Asking an R user where one-hot encoding is used is like asking a fish where there is water; they can't point to it as it is everywhere. Much of the encoding in R is essentially based on "contrasts" implemented in stats::model.matrix(). Note: do not use base::data.matrix(). The above mal-coding can be a critical flaw when you are building a model and then later using the model on new data (be it cross-validation data, test data, or future application data). Many R users are not familiar with the above issue as encoding is hidden in model training, and how to encode new data is stored as part of the model.
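The point about storing "how to encode new data" alongside the model can be illustrated with a small sketch. It is written in Python rather than R, and it is not how stats::model.matrix() works internally (that function builds contrast-coded columns). The class name `OneHotEncoder` and the toy color values are hypothetical.

```python
class OneHotEncoder:
    """Learn category levels from training data and reuse them on new data."""

    def fit(self, values):
        # Sort the levels so the column order is deterministic.
        self.levels = sorted(set(values))
        return self

    def transform(self, values):
        # Columns always follow the stored training levels; a level never
        # seen in training becomes an all-zero row instead of shifting columns.
        return [[1 if v == level else 0 for level in self.levels] for v in values]


train = ["red", "green", "blue", "green"]
new = ["blue", "purple"]  # "purple" never appeared in training

encoder = OneHotEncoder().fit(train)
train_rows = encoder.transform(train)
new_rows = encoder.transform(new)
```

Re-deriving the levels from the new data alone would give columns that no longer line up with the ones the model was trained on, which is the kind of flaw the passage warns about.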

artificial intelligence, encoding categorical variable, machine learning, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Encoding categorical variables: one-hot and beyond

#artificialintelligence Apr-15-2017, 17:45:33 GMT

artificial intelligence, machine learning, matrix, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

9 nifty Gboard for Android tricks you need to try

PCWorld Apr-5-2017, 10:43:39 GMT

The only problem with Google's Gboard keyboard for Android is that I'm totally hooked on its best features. Read on for nine of the niftiest Gboard features, from dedicated number rows and an on-demand numeric keypad to "neural" translations and a long-press shortcut for oft-used symbols. Note: Yes, there's also a version of Gboard for iOS, but most of my favorite Gboard tricks only work on the Android version. Tapping a virtual keypad with a single thumb can be something of a stretch if your phone has a massive screen. Luckily, Gboard has a clever feature that makes it easier to tap with just one hand.

artificial intelligence, gboard, natural language, (19 more...)

PCWorld

Technology:

Information Technology > Communications > Mobile (0.84)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.51)

Propensity score prediction for electronic healthcare databases using Super Learner and High-dimensional Propensity Score Methods

Ju, Cheng, Combs, Mary, Lendle, Samuel D, Franklin, Jessica M, Wyss, Richard, Schneeweiss, Sebastian, van der Laan, Mark J.

arXiv.org Machine Learning Mar-14-2017

The optimal learner for prediction modeling varies depending on the underlying data-generating distribution. Super Learner (SL) is a generic ensemble learning algorithm that uses cross-validation to select among a "library" of candidate prediction models. The SL is not restricted to a single prediction model, but uses the strengths of a variety of learning algorithms to adapt to different databases. While the SL has been shown to perform well in a number of settings, it has not been thoroughly evaluated in large electronic healthcare databases that are common in pharmacoepidemiology and comparative effectiveness research. In this study, we applied and evaluated the performance of the SL in its ability to predict treatment assignment using three electronic healthcare databases. We considered a library of algorithms that consisted of both nonparametric and parametric models. We also considered a novel strategy for prediction modeling that combines the SL with the high-dimensional propensity score (hdPS) variable selection algorithm. Predictive performance was assessed using three metrics: the negative log-likelihood, area under the curve (AUC), and time complexity. Results showed that the best individual algorithm, in terms of predictive performance, varied across datasets. The SL was able to adapt to the given dataset and optimize predictive performance relative to any individual learner. Combining the SL with the hdPS was the most consistent prediction method and may be promising for PS estimation and prediction modeling in electronic healthcare databases.
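The core idea of using cross-validation to select among a "library" of candidate models can be sketched as below. This is a simplified, selection-only illustration and not the authors' implementation: the two candidate learners, the squared-error risk, the interleaved fold scheme, and the toy data are all assumptions, whereas the study itself predicts treatment assignment and reports negative log-likelihood and AUC.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Candidate 1: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2: one-variable least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cv_risk(fit, xs, ys, k=5):
    """Mean squared error over k interleaved validation folds."""
    errors = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        valid = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors.extend((model(xs[i]) - ys[i]) ** 2 for i in valid)
    return mean(errors)


def select_learner(library, xs, ys, k=5):
    """Return the best candidate's name and every candidate's cross-validated risk."""
    risks = {name: cv_risk(fit, xs, ys, k) for name, fit in library.items()}
    return min(risks, key=risks.get), risks


# Toy data (made up for illustration): y is roughly 2x + 1.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]

library = {"mean": fit_mean, "line": fit_line}
best, risks = select_learner(library, xs, ys)
```

Each candidate is scored only on data it was not trained on, so the comparison between learners is not biased toward the more flexible model simply because it fits the training data more closely.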

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

1703.02236

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (0.94)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
