Spam Email Detection Using Machine Learning
There are 4,825 ham and 747 spam messages. This indicates the data is imbalanced which needs to be fixed. The top ham message is "Sorry, I'll call later", whereas the top spam message is "Please call our customer service…" which occurred 30 and 4 times, respectively. First, let's create a separate dataframe for ham and spam messages and convert it to NumPy array and then to a list to generate WordCloud later. Since it is a text data, there are many unnecessary stopwords like articles, prepositions etc., which needs to be removed from the data.
Apr-21-2021, 20:17:08 GMT
- Technology: