not-spam email
Type I and Type II Errors: What's the Difference? - KDnuggets
Let's illustrate Type I and Type II errors using a binary classification machine learning spam filter. We will assume that we have a labelled dataset of N = 315 emails, 244 of which are labelled as spam and 71 as not-spam. Suppose that we've built a machine learning classification algorithm that learns from this data. Now we would like to evaluate the performance of the model: how good was it at correctly separating spam from not-spam emails? We will assume that whenever the model predicts an email to be spam, the email is removed from the inbox and saved in the spam folder.
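To make the two error types concrete, here is a minimal Python sketch that counts them from true labels and model predictions. It assumes spam is the positive class, so a Type I error (false positive) is a not-spam email sent to the spam folder, and a Type II error (false negative) is a spam email left in the inbox. The function name `confusion_counts` and the six-email toy data are hypothetical; they are not the 315-email dataset or its results.

```python
from collections import Counter

def confusion_counts(y_true, y_pred, positive="spam"):
    """Count TP, FP (Type I), FN (Type II) and TN for a binary classifier."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["TP"] += 1   # spam correctly sent to the spam folder
        elif pred == positive and truth != positive:
            counts["FP"] += 1   # Type I error: good email flagged as spam
        elif pred != positive and truth == positive:
            counts["FN"] += 1   # Type II error: spam left in the inbox
        else:
            counts["TN"] += 1   # good email correctly kept in the inbox
    return counts

# Hypothetical toy labels and predictions, for illustration only.
y_true = ["spam", "spam", "not-spam", "not-spam", "spam", "not-spam"]
y_pred = ["spam", "not-spam", "spam", "not-spam", "spam", "not-spam"]

counts = confusion_counts(y_true, y_pred)
print("Type I errors (false positives):", counts["FP"])
print("Type II errors (false negatives):", counts["FN"])
```

With the real dataset, the four counts would sum to N = 315, with TP + FN = 244 and FP + TN = 71.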
Intro to Machine Learning & NLP with Python and Weka - Codementor
In this tutorial, you'll be briefly introduced to machine learning with Python (2.x) and Weka, a data processing and machine learning tool. The activity is to build a simple spam filter for emails and learn machine learning concepts along the way. This article is written by the Codementor team and is based on a Codementor Office Hour by Codementor Benjamin Cohen, a Data Scientist with a focus on Natural Language Processing. In a nutshell, machine learning is learning from data. Before data and computing power were plentiful, people tried to hand-write rules to solve a lot of problems, e.g., if you see {{some word}}, it's probably spam. That worked all right, but as problems get more complicated, the combinations of rules grow out of hand, both to write and to process. The techniques that address this by learning from data instead all fall under the umbrella of machine learning.
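The hand-written rule approach described above can be sketched in a few lines. This is only an illustration of the "if you see some word, it's probably spam" idea, not the tutorial's own code: the word list `SPAM_WORDS` and the function `looks_like_spam` are hypothetical, and the sketch is written to run on Python 3 even though the tutorial targets Python 2.x.

```python
# Hypothetical trigger words; a real rule set would be much larger.
SPAM_WORDS = {"lottery", "winner", "prize"}

def looks_like_spam(text, spam_words=SPAM_WORDS):
    """Return True if any trigger word appears in the email text."""
    # Lowercase each whitespace-separated token and strip common punctuation.
    words = {w.strip(".,!?:;\"'()").lower() for w in text.split()}
    return not words.isdisjoint(spam_words)

print(looks_like_spam("You are a WINNER! Claim your prize now."))
print(looks_like_spam("Meeting moved to 3pm tomorrow."))
```

Every new spam pattern needs another hand-written rule here, which is the scaling problem that motivates learning the rules from labelled data instead.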