Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
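To make the quoted definition concrete, here is a minimal sketch of a decision tree written as a plain Boolean function. The attribute names (`patrons`, `hungry`) and the branching order are illustrative assumptions, not the tree from the book; the point is only that an object described by a set of properties goes in and a yes/no decision comes out.

```python
from typing import Any, Dict


def will_wait(example: Dict[str, Any]) -> bool:
    """Toy decision tree: a sequence of branching tests ending in a yes/no leaf.

    The attributes and thresholds here are hypothetical, chosen for illustration.
    """
    patrons = example["patrons"]  # assumed values: "none", "some", "full"
    if patrons == "none":
        return False              # leaf: no
    if patrons == "some":
        return True               # leaf: yes
    # patrons == "full": fall through to a second test on another property
    return bool(example["hungry"])


if __name__ == "__main__":
    print(will_wait({"patrons": "full", "hungry": True}))
```

Each `if` corresponds to an internal node of the tree and each `return` to a leaf, so the function as a whole is a Boolean function of the input properties.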
In this tutorial, I will demonstrate Orange, a tool for machine learning. Orange is an extremely easy-to-use, lightweight, drag-and-drop tool. More importantly, it is open source! If you are an Anaconda user, then you can find it in the console as shown in the following image: a pure, fresh orange wearing sunglasses with a smile. Orange is a platform built for creating machine learning pipelines through a GUI workflow.
This post is 'not' intended to teach people how to use popular predictive modelling APIs for free, although, to your surprise, this isn't a far-fetched possibility. A trained machine learning model is basically a function that maps feature vectors to the output variable. When queried with a test instance, the model predicts an outcome, assigning probability scores to all the possible classes. Google, Amazon, and others provide public-facing APIs to train predictive models on the subscriber's data; the trained model can then be used for prediction.
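The description above of a trained model as a function from feature vectors to class probability scores can be sketched in a few lines. Everything in this example is hypothetical: the two features, the thresholds, and the per-leaf class counts are made up for illustration and do not come from any real dataset or from any provider's API.

```python
from typing import Dict, Tuple

# Hypothetical feature vector: (transaction amount, number of prior chargebacks).
FeatureVector = Tuple[float, int]


def predict_proba(features: FeatureVector) -> Dict[str, float]:
    """Map a feature vector to a probability score for every class.

    Stands in for a trained model; the leaf counts below are invented.
    """
    amount, prior_chargebacks = features
    if amount < 100:
        counts = {"legit": 90, "fraud": 10}
    elif prior_chargebacks == 0:
        counts = {"legit": 60, "fraud": 40}
    else:
        counts = {"legit": 5, "fraud": 45}
    total = sum(counts.values())
    # Normalise the counts so the scores sum to 1.
    return {label: n / total for label, n in counts.items()}


def predict(features: FeatureVector) -> str:
    """Return the class with the highest probability score."""
    scores = predict_proba(features)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    query = (250.0, 2)
    print(predict_proba(query), predict(query))
```

Querying the function with a test instance yields one score per class; the predicted outcome is simply the class with the largest score.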
Explainable AI or XAI is a sub-category of AI where the decisions made by the model can be interpreted by humans, as opposed to "black box" models. As AI moves from correcting our spelling and targeting ads to driving our cars and diagnosing patients, the need to verify and justify the conclusions being reached is beginning to be prioritised.
I did a series of blog posts on different machine learning techniques recently, which sparked a lot of interest. You can see part 1, part 2, and part 3 if you want to learn about classification, clustering, regression, and so on. In that series I was careful to differentiate between a general technique and a specific algorithm like decision trees. Classification, for example, is a general technique used to identify members of a known class like fraudulent transactions, bananas, or high value customers. Read this machine learning post if you need a refresher or are wondering quite what bananas have to do with machine learning.
Decision tree model building is one of the most widely applied techniques in the analytics vertical. The decision tree model is quick to develop and easy to understand, and the technique is simple to learn. A number of business scenarios in lending, telecom, automobile, and other industries require decision tree model building. How long should the course take?