Building Machine Learning Models to Solve Practical Problems - Simple Talk


Machine learning has been reshaping our lives for quite a while now. Whether it is something as small as unlocking your phone with face recognition or interacting with Siri, Alexa, Cortana, or Google through speech recognition, machine learning is everywhere! In this article, I am going to provide a brief overview of machine learning and data science. With a basic understanding of these concepts, you can dive deeper into the details of linear regression and how to build a machine learning model that will help you solve many practical problems. The article focuses on building a linear regression model for movie budget data using various modules in Python.
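As a rough illustration of the linear regression idea mentioned above, here is a minimal sketch of a one-variable least-squares fit in plain Python. It is not the article's code: the function names are hypothetical, and the budget and gross figures are made-up numbers chosen so the fit is easy to check by hand.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Made-up illustrative numbers (millions of dollars), NOT real movie data.
budgets = [10.0, 20.0, 30.0, 40.0, 50.0]
grosses = [25.0, 45.0, 65.0, 85.0, 105.0]  # constructed as 2 * budget + 5

slope, intercept = fit_simple_linear_regression(budgets, grosses)
print(slope, intercept)
print(predict(slope, intercept, 60.0))
```

Because the toy data lie exactly on a line, the fit recovers that line; real data would leave residual error, and a library such as scikit-learn would normally do the fitting.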



Run a Docker container with Tribuo (an ML library written in Java) under either a traditional Java 11 JDK (from OpenJDK or another source) or GraalVM. Run the Classification or Regression example, as illustrated in the tutorial notebooks, as a Java app from the CLI. See the detailed Classification output and Regression output. You may see some warnings and then get a binary to use (instead of the jar); see the native-image build output and the native-image overall run output. You should see the same Classification output and Regression output as before.

Advanced data exploration and modeling with Spark


This walkthrough uses HDInsight Spark to do data exploration and train binary classification and regression models using cross-validation and hyperparameter optimization on a sample of the NYC taxi trip and fare 2013 dataset. It walks you through the steps of the Data Science Process, end-to-end, using an HDInsight Spark cluster for processing and Azure blobs to store the data and the models. The process explores and visualizes data brought in from an Azure Storage Blob and then prepares the data to build predictive models. Python is used to code the solution and to show the relevant plots. The models are built with the Spark MLlib toolkit for binary classification and regression modeling tasks.
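To make the idea of cross-validation combined with hyperparameter optimization concrete, here is a small sketch in plain Python. It does not use Spark MLlib or the taxi data: the model is a swapped-in one-dimensional ridge regression without intercept, every function name is hypothetical, and the data are synthetic.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge weight for the no-intercept model y = w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mse(xs, ys, w):
    """Mean squared error of the model y = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validate(xs, ys, lam, k):
    """Average held-out MSE over k folds for one value of lam."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        w = fit_ridge_1d(train_x, train_y, lam)
        scores.append(mse([xs[i] for i in held_out], [ys[i] for i in held_out], w))
    return sum(scores) / k


def grid_search(xs, ys, lam_grid, k=3):
    """Return the lam in lam_grid with the lowest cross-validated MSE."""
    return min(lam_grid, key=lambda lam: cross_validate(xs, ys, lam, k))


# Synthetic, noise-free data (y = 3x), for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.0 * x for x in xs]
best_lam = grid_search(xs, ys, lam_grid=[0.0, 1.0, 10.0], k=3)
print(best_lam)
```

The same pattern (a parameter grid, a fold splitter, and an evaluation metric averaged over folds) is what a distributed toolkit automates at scale; on noise-free data the search naturally prefers no regularization.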



The project is written in Python 2.7 and PyTorch 1.0.1. If CUDA is available, it will be used automatically. The models can also run on CPU, as they are not excessively large. We carried out homoscedastic and heteroscedastic regression experiments on toy datasets, generated with (Gaussian Process ground truth), as well as on real data (six UCI datasets). The heteroscedastic notebooks contain both toy and UCI dataset experiments for a given (ModelName).
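The distinction between homoscedastic and heteroscedastic regression can be sketched with the Gaussian negative log-likelihood. The snippet below is an assumption about the standard formulation, written in plain Python rather than PyTorch; it is not taken from the project, and the function names are hypothetical.

```python
import math


def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under a normal distribution N(mu, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2 * sigma ** 2)


def homoscedastic_nll(ys, mus, sigma):
    """Mean NLL when a single noise level sigma is shared by all points."""
    return sum(gaussian_nll(y, m, sigma) for y, m in zip(ys, mus)) / len(ys)


def heteroscedastic_nll(ys, mus, sigmas):
    """Mean NLL when the model predicts a separate noise level per point."""
    return sum(gaussian_nll(y, m, s) for y, m, s in zip(ys, mus, sigmas)) / len(ys)


# Toy values for illustration: the second prediction is far off its target.
targets = [0.0, 0.0]
means = [0.1, 2.0]

shared = homoscedastic_nll(targets, means, sigma=1.0)
per_point = heteroscedastic_nll(targets, means, sigmas=[0.5, 2.0])
print(shared, per_point)
```

In the heteroscedastic case a model can assign a larger sigma where its mean prediction is poor, which reduces the squared-error term at the price of the log-variance penalty; with identical sigmas the two losses coincide.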

Machine Learning: Regression of 911 Calls


Welcome to my first Machine Learning project! This post focuses on solving a real problem. We will try to predict the location and daily quantity of 911 calls in Montgomery County, Pennsylvania (US). Such a project could enable help and rescue teams, such as police, fire departments, and emergency medical services, to prepare for upcoming events and better plan their work. The intro story is totally made up.