 ham message


Machine Learning Driven Smishing Detection Framework for Mobile Security

arXiv.org Artificial Intelligence

The increasing reliance on smartphones for communication, financial transactions, and personal data management has made them prime targets for cyberattacks, particularly smishing, a sophisticated variant of phishing conducted via SMS. Despite the growing threat, traditional detection methods often struggle with the informal and evolving nature of SMS language, which includes abbreviations, slang, and short forms. This paper presents an enhanced content-based smishing detection framework that leverages advanced text normalization techniques to improve detection accuracy. By converting nonstandard text into its standardized form, the proposed model enhances the efficacy of machine learning classifiers, particularly the Naive Bayesian classifier, in distinguishing smishing messages from legitimate ones. Our experimental results, validated on a publicly available dataset, demonstrate a detection accuracy of 96.2%, with a low False Positive Rate of 3.87% and False Negative Rate of 2.85%. This approach significantly outperforms existing methodologies, providing a robust solution to the increasingly sophisticated threat of smishing in the mobile environment.
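The pipeline described in this abstract (normalize nonstandard SMS text, then classify with Naive Bayes) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the `NORMALIZATION_MAP` entries, the four toy training messages, the `smish`/`ham` label names, and the use of add-one smoothing are not taken from the paper, whose normalization dictionary and classifier settings are not given in this excerpt.

```python
import math
import re
from collections import Counter

# Hypothetical lookup table; the paper's actual normalization rules are not shown here.
NORMALIZATION_MAP = {
    "u": "you",
    "ur": "your",
    "pls": "please",
    "plz": "please",
    "txt": "text",
    "acct": "account",
    "2day": "today",
    "b4": "before",
}


def normalize(message):
    """Lowercase, tokenize, and map nonstandard tokens to a standard form."""
    tokens = re.findall(r"[a-z0-9']+", message.lower())
    return [NORMALIZATION_MAP.get(tok, tok) for tok in tokens]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for message, label in zip(messages, labels):
            self.word_counts[label].update(normalize(message))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, message):
        tokens = normalize(message)
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch.
train = [
    ("pls verify ur acct now", "smish"),
    ("urgent: confirm your account to avoid suspension", "smish"),
    ("are u coming 2day", "ham"),
    ("sorry, i'll call later", "ham"),
]
model = NaiveBayes().fit([m for m, _ in train], [y for _, y in train])
```

Because normalization runs at both training and prediction time, `model.predict("please verify your account")` can match the abbreviated training message "pls verify ur acct now" token for token, which is the effect the abstract attributes to text normalization.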


Spam Email Detection Using Machine Learning

#artificialintelligence

There are 4,825 ham and 747 spam messages, so the data is imbalanced, which needs to be addressed. The most frequent ham message is "Sorry, I'll call later", which occurs 30 times, whereas the most frequent spam message is "Please call our customer service…", which occurs 4 times. First, let's create separate dataframes for the ham and spam messages, then convert each to a NumPy array and then to a list so that we can generate a WordCloud later. Since this is text data, it contains many unnecessary stopwords, such as articles and prepositions, which need to be removed.
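The exploration steps in this snippet can be sketched with the standard library alone. This is not the article's code, which works with dataframes, NumPy, and WordCloud; the five sample messages, the `STOPWORDS` subset, and the choice of downsampling as the way to balance the classes are all assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical miniature sample; the real dataset has 4,825 ham and 747 spam messages.
messages = [
    ("ham", "Sorry, I'll call later"),
    ("ham", "Sorry, I'll call later"),
    ("ham", "Are you coming to the party"),
    ("spam", "Please call our customer service"),
    ("spam", "Win a free prize now"),
]

# Illustrative stopword subset, not a complete list.
STOPWORDS = {"a", "an", "the", "to", "of", "in", "on", "our", "you", "are", "i'll"}

# Class balance: how many messages carry each label.
label_counts = Counter(label for label, _ in messages)

# Separate the two classes into plain lists of message texts.
ham_texts = [text for label, text in messages if label == "ham"]
spam_texts = [text for label, text in messages if label == "spam"]

# Most frequent message in each class, as a (text, count) pair.
top_ham = Counter(ham_texts).most_common(1)[0]
top_spam = Counter(spam_texts).most_common(1)[0]


def remove_stopwords(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    words = text.lower().replace(",", " ").split()
    return [w for w in words if w not in STOPWORDS]


# Word lists per class, ready to be joined into text for a word cloud.
ham_words = [w for t in ham_texts for w in remove_stopwords(t)]
spam_words = [w for t in spam_texts for w in remove_stopwords(t)]

# One possible fix for the imbalance (an assumption): downsample ham to the spam count.
rng = random.Random(0)
balanced_ham = rng.sample(ham_texts, k=len(spam_texts))
```

On the full dataset the same counting would give roughly 6.5 ham messages for every spam message (4,825 / 747), which is why some form of rebalancing is needed before training.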