dataschool
How do I encode categorical features using scikit-learn?
To include categorical features in your machine learning model, you have to encode them numerically using "dummy" or "one-hot" encoding. But how do you do this correctly using scikit-learn? In this video, you'll learn how to use OneHotEncoder and ColumnTransformer to encode your categorical features and prepare your feature matrix in a single step. You'll also learn how to include this step within a Pipeline so that you can cross-validate your model and preprocessing steps simultaneously. Finally, you'll learn why you should use scikit-learn (rather than pandas) for preprocessing your dataset.
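As a rough sketch of the workflow this video describes, the snippet below one-hot encodes two categorical columns with OneHotEncoder inside a ColumnTransformer, chains that with a model in a Pipeline, and cross-validates the whole thing. The toy DataFrame, its column names, and the choice of LogisticRegression are assumptions made for illustration and are not taken from the video.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: two categorical columns and one numeric column.
df = pd.DataFrame({
    "embarked": ["S", "C", "S", "Q", "S", "C", "Q", "S", "C", "S", "Q", "C"],
    "sex": ["male", "female", "female", "male", "male", "female",
            "female", "male", "male", "female", "male", "female"],
    "fare": [7.3, 71.3, 7.9, 8.5, 8.1, 30.1, 7.8, 21.1, 11.1, 16.7, 7.2, 26.6],
    "survived": [0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1],
})
X = df[["embarked", "sex", "fare"]]
y = df["survived"]

# One-hot encode the categorical columns; pass the numeric column through unchanged.
# handle_unknown="ignore" keeps a category unseen during fitting from raising an error.
column_transformer = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["embarked", "sex"]),
    remainder="passthrough",
)

# Preprocessing and model in one Pipeline, so both are refit on every training fold.
pipe = make_pipeline(column_transformer, LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, X, y, cv=3, scoring="accuracy")
print(scores.mean())
```

Because the encoder lives inside the Pipeline, each cross-validation split fits it on the training fold only and then applies the same encoding to the held-out fold.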
Getting started with machine learning in Python (webcast)
Have you heard about machine learning, but you don't really understand what it's good for? Or you understand the basic idea, but you're struggling to apply it using Python? In this video, I'll explain the essential ideas behind machine learning. Then, we'll build our first machine learning model in just a few lines of code using Python's scikit-learn library. This is a recording of a webcast hosted by Trey Hunner of Weekly Python Chat: http://www.weeklypython.chat
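To give a sense of what "a few lines of code" can look like, here is a minimal sketch of training a first model and making a prediction with scikit-learn. The iris dataset, the KNeighborsClassifier model, and the sample measurements are assumptions chosen for illustration; the webcast may use different data or a different model.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset: 150 flowers, 4 numeric features each (assumed example data).
X, y = load_iris(return_X_y=True)

# Instantiate the model, then fit it to the features and known labels.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X, y)

# Predict the class of one new, made-up observation (4 feature values).
prediction = model.predict([[3, 5, 4, 2]])
print(prediction)
```

The same instantiate, fit, predict pattern applies to other scikit-learn estimators.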
Machine Learning with Text in scikit-learn (PyCon 2016)
Although numeric data is easy to work with in Python, most knowledge created by humans is actually raw, unstructured text. By learning how to transform text into data that is usable by machine learning models, you drastically increase the amount of data that your models can learn from. In this tutorial, we'll build and evaluate predictive models from real-world text using scikit-learn.
Subscribe to the Data School newsletter: http://www.dataschool.io/subscribe/
OTHER RESOURCES
My scikit-learn video series: https://www.youtube.com/playlist?list...
My pandas video series: https://www.youtube.com/playlist?list...
JOIN THE DATA SCHOOL COMMUNITY
Blog: http://www.dataschool.io
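As an illustration of the text-to-features idea in the PyCon tutorial description, the sketch below turns a handful of short messages into a document-term matrix and fits a classifier on it. CountVectorizer and MultinomialNB are one common pairing and are assumed here, not confirmed by the description; the messages and labels are made up.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical training messages with made-up labels.
train_texts = [
    "free prize call now",
    "win cash now",
    "are we meeting for lunch",
    "see you at lunch tomorrow",
]
train_labels = ["spam", "spam", "ham", "ham"]

# Learn the vocabulary from the training text and build a document-term matrix.
vect = CountVectorizer()
X_train_dtm = vect.fit_transform(train_texts)

# Fit a Naive Bayes classifier on the token counts.
model = MultinomialNB()
model.fit(X_train_dtm, train_labels)

# New text is transformed with the already-fitted vocabulary (transform, not fit_transform).
new_texts = ["win a free prize"]
X_new_dtm = vect.transform(new_texts)
pred = model.predict(X_new_dtm)
print(pred)
```

Calling transform rather than fit_transform on new text keeps its columns aligned with the vocabulary learned from the training data.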