On Education Decision Trees, Random Forests, AdaBoost & XGBoost in Python - all courses

#artificialintelligence

Get a solid understanding of decision trees. Understand the business scenarios where a decision tree is applicable. Tune a machine learning model's hyperparameters and evaluate its performance. Use Pandas DataFrames to manipulate data and make statistical computations. Use decision trees to make predictions. Learn the advantages and disadvantages of the different algorithms. Students will need to install Python and Anaconda, but a separate lecture walks through the installation. You're looking for a complete decision tree course that teaches you everything you need to create a Decision Tree / Random Forest / XGBoost model in Python, right? You've found the right course on decision trees and advanced tree-based techniques! After completing this course you will be able to: identify the business problems that can be solved using the Decision Tree / Random Forest / XGBoost techniques of machine learning.


datas-frame – Modern Pandas (Part 8): Scaling

#artificialintelligence

We can answer questions like "Which employer's employees donated the most?" or "What is the average amount donated per occupation?" Since Dask is lazy, we haven't actually computed anything.


Ultimate guide to handle Big Datasets for Machine Learning using Dask (in Python)

#artificialintelligence

We will now have a look at some simple cases for creating arrays using Dask. As you can see here, I had 11 values in the array and I used the chunk size as 5. This distributed my array into three chunks, where the first and second blocks have 5 values each and the third one has 1 value. Dask arrays support most of the numpy functions. For instance, you can use .sum()
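The chunking described above can be reproduced with a short sketch. It assumes the 11 values are simply the integers 0 through 10, since the excerpt does not show the original array.

```python
import numpy as np
import dask.array as da

# Assumed input: 11 values (0..10); the original array is not shown in the excerpt.
x = da.from_array(np.arange(11), chunks=5)

# Chunk sizes along the single axis: two blocks of 5 and one block of 1.
chunk_sizes = x.chunks

# Like other Dask operations, .sum() is lazy until .compute() is called.
total = x.sum().compute()
```

With these assumptions `chunk_sizes` is `((5, 5, 1),)`, matching the three blocks described in the text.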


Fast GeoSpatial Analysis in Python

#artificialintelligence

This work is supported by Anaconda Inc., the Data Driven Discovery Initiative from the Moore Foundation, and NASA SBIR NNX16CG43P. This work is a collaboration with Joris Van den Bossche. This blogpost builds on Joris's EuroSciPy talk (slides) on the same topic. You can also see Joris's blogpost on this same topic. Python's Geospatial stack is slow. Dask gives an additional 3-4x on a multi-core laptop.


datas-frame – Scalable Machine Learning (Part 1)

#artificialintelligence

This work is supported by Anaconda Inc. and the Data Driven Discovery Initiative from the Moore Foundation. Anaconda is interested in scaling the scientific Python ecosystem. My current focus is on out-of-core, parallel, and distributed machine learning. This series of posts will introduce those concepts, explore what we have available today, and track the community's efforts to push the boundaries. I am (or was, anyway) an economist, and economists like to think in terms of constraints.