
Ray Tune


Elastic Deep Learning With Horovod On Ray - AI Summary

#artificialintelligence

Since its inception, the Ray ecosystem has grown to include a variety of features and tools useful for training ML models on the cloud, including Ray Tune for distributed hyperparameter tuning, the Ray Cluster Launcher for cluster provisioning, and load-based autoscaling. Because Ray is a general distributed compute platform, users of Ray are free to choose among a growing number of distributed data processing frameworks, including Spark, running on the same resources provisioned by Ray for the deep learning workflow. Now, in the upcoming Ludwig 0.4 release, we're integrating Dask on Ray for distributed out-of-memory data preprocessing, Horovod on Ray for distributed training, and Ray Tune for hyperparameter optimization. Ludwig running in local mode (pre v0.4): all data needs to fit in memory on a single machine. Ludwig running on a Ray cluster (post v0.4): Ray scales out preprocessing and distributed training to process large datasets without needing to write any infrastructure code in Ludwig. By leveraging Dask, Ludwig's existing Pandas preprocessing can be scaled to handle large datasets with minimal code changes, and by leveraging Ray, we can combine the preprocessing, distributed training, and hyperparameter search all within a single job running a single training script.
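As a rough illustration of what driving all of this from "a single training script" looks like on the user's side, here is a hypothetical Ludwig-style YAML config sketch. The feature names (`review`, `sentiment`) are invented, and exact section and field names vary across Ludwig versions, so treat this as a shape, not a reference:

```yaml
# Hypothetical sketch of a Ludwig config combining the three pieces
# described above; field names are illustrative, not version-exact.
input_features:
  - name: review
    type: text
output_features:
  - name: sentiment
    type: category
backend:
  type: ray            # Dask-on-Ray preprocessing, Horovod-on-Ray training
hyperopt:              # Ray Tune-backed hyperparameter search
  parameters:
    training.learning_rate:
      space: loguniform
      lower: 0.0001
      upper: 0.1
```

The point of the design is visible in the shape alone: swapping `backend` from local to `ray` changes where the work runs, not what the user writes.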


Fast AutoML with FLAML + Ray Tune - KDnuggets

#artificialintelligence

FLAML is a lightweight Python library from Microsoft Research that finds accurate machine learning models in an efficient and economical way, using cutting-edge algorithms designed to be resource-efficient and easily parallelizable. FLAML can also utilize Ray Tune for distributed hyperparameter tuning to scale up these AutoML methods across a cluster. AutoML is known to be a resource- and time-consuming operation, as it involves trial and error to find a hyperparameter configuration with good performance. Since the space of possible configuration values is often very large, there is a need for an economical AutoML method that can search it more effectively. To address both of these factors, Microsoft researchers have developed FLAML (Fast Lightweight AutoML).
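The "economical" part of FLAML is the key idea: spend little compute on early trials and grow the training budget only when it pays off. The stdlib-only toy below illustrates that low-cost-first strategy; `train_cost` and `frugal_search` are invented names for illustration, not FLAML's API, and FLAML's actual algorithms (such as CFO and BlendSearch) are considerably more sophisticated:

```python
def train_cost(n_estimators):
    # Stand-in for one real training run: cost grows linearly with
    # model size, score improves with diminishing returns (toy values).
    cost = n_estimators
    score = 1 - 1 / (1 + n_estimators / 50)
    return cost, score

def frugal_search(budget=500):
    """Low-cost-first search: try cheap configurations before
    expensive ones, stopping when the compute budget runs out."""
    spent, best_n, best_score = 0, None, -1.0
    n = 4  # start with the cheapest configuration
    while spent + n <= budget:
        cost, score = train_cost(n)
        spent += cost
        if score > best_score:
            best_n, best_score = n, score
        n *= 2  # grow the per-trial budget geometrically
    return best_n, best_score, spent

best_n, best_score, spent = frugal_search()
```

Note the contrast with plain random search, which would happily spend most of the budget on a few expensive configurations before learning anything.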



Amazing Low-Code Machine Learning Capabilities with New Ludwig Update - KDnuggets

#artificialintelligence

I recently started a new newsletter focused on AI education that already has over 50,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) AI-focused newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. If you follow this blog, you know I am a fan of the Ludwig open source project. Initially incubated by Uber and now part of the Linux AI Foundation, Ludwig provides one of the best low-code machine learning (ML) stacks on the current market.


Bayesian Hyperparameter Optimization with tune-sklearn in PyCaret - KDnuggets

#artificialintelligence

Here's a situation every PyCaret user is familiar with: after selecting a promising model or two from compare_models(), it's time to tune their hyperparameters with tune_model() to squeeze out the models' full potential. By default, tune_model() uses the tried-and-tested RandomizedSearchCV from scikit-learn. However, not everyone knows about the various advanced options tune_model() provides. In this post, I will show you how easy it is to use other state-of-the-art algorithms with PyCaret thanks to tune-sklearn, a drop-in replacement for scikit-learn's model selection module with cutting-edge hyperparameter tuning techniques. I'll also report results from a series of benchmarks showing how tune-sklearn can easily improve classification model performance.
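In PyCaret these options are exposed through arguments along the lines of `search_library` and `search_algorithm` on tune_model() (exact names and accepted values depend on the PyCaret version). The reason a Bayesian-style search beats plain random sampling is that it reuses what earlier trials revealed. The stdlib toy below contrasts the two on a 1-D objective; `guided_search` is an invented stand-in for the idea behind model-based tuning, not a real Bayesian optimizer:

```python
import random

def objective(x):
    # Toy 1-D "validation score" with a single peak at x = 0.3.
    return -((x - 0.3) ** 2)

def random_search(n, rng):
    # Baseline: sample hyperparameter values uniformly at random.
    best = max((rng.random() for _ in range(n)), key=objective)
    return objective(best)

def guided_search(n, rng):
    # Crude guided search: sample near the incumbent best point,
    # shrinking the neighbourhood over time -- a cheap stand-in for
    # the exploit-what-you-learned behaviour of Bayesian optimization.
    best = rng.random()
    for i in range(1, n):
        width = 1.0 / (1 + i)
        cand = min(1.0, max(0.0, best + rng.uniform(-width, width)))
        if objective(cand) > objective(best):
            best = cand
    return objective(best)
```

Random search wastes every trial that lands far from the peak; the guided variant concentrates later trials where earlier ones scored well, which is the same intuition tune-sklearn's algorithms apply to real model training.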


How the Integrations Between Ray & MLflow Aid Distributed ML Production

#artificialintelligence

In this blog post, we're announcing two new integrations between Ray and MLflow: Ray Tune MLflow Tracking and Ray Serve MLflow Models, which together make it much easier to build machine learning (ML) models and take them to production. These integrations are available in the latest Ray wheels. You can follow the instructions here to pip install the nightly version of Ray and take a look at the documentation to get started. They will also be in the next Ray release, version 1.2. Our goal is to leverage the strengths of the two projects: Ray's distributed libraries for scaling training and serving, and MLflow's end-to-end model lifecycle management. Let's first take a brief look at what these libraries can do before diving into the new integrations.
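The tracking half of the integration reduces to a simple pattern: each Ray Tune trial reports metrics every iteration, and a callback records them per trial (in the real integration, as MLflow runs). The stdlib sketch below shows only that callback pattern; `TrackingCallback` and `run_trial` are illustrative inventions, not Ray's or MLflow's API:

```python
class TrackingCallback:
    """Records reported metrics per trial, standing in for an
    experiment tracker such as the MLflow integration."""
    def __init__(self):
        self.runs = {}

    def on_result(self, trial_id, metrics):
        # Append a snapshot of this iteration's metrics to the trial's run.
        self.runs.setdefault(trial_id, []).append(dict(metrics))

def run_trial(trial_id, lr, callback, steps=3):
    # Pretend training loop: each step shrinks the loss and reports it.
    loss = 1.0
    for step in range(steps):
        loss *= (1 - lr)
        callback.on_result(trial_id, {"step": step, "loss": loss})

cb = TrackingCallback()
for i, lr in enumerate([0.1, 0.5]):
    run_trial(f"trial_{i}", lr, cb)
```

The value of the real integration is that this bookkeeping (plus parameters, artifacts, and model versions) happens automatically for every Tune trial, instead of being hand-rolled per project.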


A Novice's Guide to Hyperparameter Optimization at Scale

#artificialintelligence

Despite the tremendous success of machine learning (ML), modern algorithms still depend on a variety of free, non-trainable hyperparameters. Ultimately, our ability to select quality hyperparameters governs the performance of a given model. In the past, and sometimes even today, hyperparameters were hand-selected through trial and error. An entire field is dedicated to improving this selection process; it is referred to as hyperparameter optimization (HPO). Inherently, HPO requires testing many different hyperparameter configurations, and as a result it can benefit tremendously from massively parallel resources like the Perlmutter system we are building at the National Energy Research Scientific Computing Center (NERSC).
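HPO parallelizes so well because independent configurations can be evaluated concurrently. A minimal stdlib sketch of parallel random search follows; `evaluate` and its toy score are invented for illustration, and a real HPO setup would farm out full training jobs across cluster nodes rather than local threads:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def evaluate(config):
    """Stand-in for training a model with one hyperparameter config;
    the toy score peaks near lr = 0.1, depth = 6 (illustrative only)."""
    lr, depth = config
    return -((lr - 0.1) ** 2 + 0.01 * (depth - 6) ** 2)

def parallel_random_search(n_trials=32, workers=8, seed=0):
    rng = random.Random(seed)
    configs = [(rng.uniform(0.001, 1.0), rng.randint(1, 12))
               for _ in range(n_trials)]
    # Each worker evaluates a different configuration concurrently;
    # on a machine like Perlmutter these would be distributed jobs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(evaluate, configs))
    return max(zip(scores, configs))  # (best_score, best_config)

best_score, best_config = parallel_random_search()
```

With trials fully independent, wall-clock time shrinks roughly linearly with worker count, which is exactly why HPO is a natural fit for massively parallel hardware.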


Ray's New Library Targets High Speed Reinforcement Learning

#artificialintelligence

Data scientists looking to push the ball forward in the field of reinforcement learning may want to check out RLlib, a new library released as open source last month by researchers affiliated with RISELab. According to the researchers, the goal of RLlib is to enable users to break down the various components that go into a reinforcement learning system, thereby making them more scalable, easier to integrate, and easier to reuse. Reinforcement learning is a type of machine learning that's gaining popularity as a way to quickly train programs to perform tasks optimally in a world awash in less-than-optimal training data. Instead of training a model with pristine data, which is ideal in supervised learning, the reinforcement learning model learns from the data environment as it naturally exists, and uses a simple feedback mechanism (the reinforcement signal) to nudge the model toward the ideal solution. The practical advantage of the reinforcement approach is that it seeks a balance between being able to interpret uncharted data (where unsupervised learning algorithms flourish) and exploiting existing knowledge (where supervised learning typically excels).
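The feedback loop described above can be boiled down to a tiny example: an epsilon-greedy agent on a multi-armed bandit, where the reward is the reinforcement signal and value estimates are nudged toward the best action. This is a stdlib illustration of the exploration/exploitation balance, not RLlib code:

```python
import random

def bandit_learn(probs, steps=5000, eps=0.1, seed=0):
    """Epsilon-greedy learning on a multi-armed bandit: no labels,
    only rewards (the reinforcement signal) shape the estimates."""
    rng = random.Random(seed)
    values = [0.0] * len(probs)   # running estimate of each arm's payoff
    counts = [0] * len(probs)
    for _ in range(steps):
        if rng.random() < eps:
            arm = rng.randrange(len(probs))                       # explore
        else:
            arm = max(range(len(probs)), key=values.__getitem__)  # exploit
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # running mean
    return values

vals = bandit_learn([0.2, 0.8, 0.5])
```

The agent converges on the best arm purely from trial-and-error feedback; full RL adds states and long-horizon credit assignment on top of this loop, which is where libraries like RLlib earn their keep.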