AITopics | spam detection

9fd98f856d3ca2086168f264a117ed7c-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 07:45:21 GMT

non-uniform perturbation, perturbation, robustness, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Santa Clara (0.04)
North America > United States > California > Los Angeles County > Santa Monica (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Social Media (0.93)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

9fd98f856d3ca2086168f264a117ed7c-Paper.pdf

Neural Information Processing Systems | Aug-16-2025, 11:26:18 GMT

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
North America > United States > California > Los Angeles County > Santa Monica (0.04)
(2 more...)

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Social Media (0.93)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Comprehensive Analysis of Adversarial Attacks against Spam Filters

Hotoğlu, Esra, Sen, Sevil, Can, Burcu

arXiv.org Artificial Intelligence | May-8-2025

Deep learning has revolutionized email filtering, which is critical for protecting users from cyber threats such as spam, malware, and phishing. However, the increasing sophistication of adversarial attacks poses a significant challenge to the effectiveness of these filters. This study investigates the impact of adversarial attacks on deep learning-based spam detection systems using real-world datasets. Six prominent deep learning models are evaluated on these datasets, analyzing attacks at the word, character, sentence, and AI-generated paragraph levels. Novel scoring functions, including spam weights and attention weights, are introduced to improve attack effectiveness. This comprehensive analysis sheds light on the vulnerabilities of spam filters and contributes to efforts to improve their security against evolving adversarial threats. Introduction Deep learning has seen significant advancements in the field of natural language processing (NLP), particularly in tasks such as ...
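To make the idea of a score-guided character-level attack concrete, here is a minimal Python sketch. It is not the authors' method: the weight table `SPAM_WEIGHTS`, the helper names, and the swap rule are all hypothetical, chosen only to show how a scoring function can decide which words to perturb first.

```python
from typing import Callable, Dict, List

# Hypothetical per-word spam weights (assumption for illustration only).
# A real attack would derive scores from the target model or its training data.
SPAM_WEIGHTS: Dict[str, float] = {"free": 0.9, "winner": 0.8, "prize": 0.7, "click": 0.6}


def spam_weight(word: str) -> float:
    """Score a word by looking it up in the illustrative weight table."""
    return SPAM_WEIGHTS.get(word.lower(), 0.0)


def swap_middle_chars(word: str) -> str:
    """Character-level perturbation: swap the second and third characters."""
    if len(word) < 4:
        return word  # too short to perturb while staying readable
    return word[0] + word[2] + word[1] + word[3:]


def perturb(text: str, score: Callable[[str], float], budget: int) -> str:
    """Perturb up to `budget` of the highest-scoring words in `text`."""
    words: List[str] = text.split()
    # Rank word positions by score, highest first; skip zero-score words.
    ranked = sorted(
        (i for i, w in enumerate(words) if score(w) > 0.0),
        key=lambda i: score(words[i]),
        reverse=True,
    )
    for i in ranked[:budget]:
        words[i] = swap_middle_chars(words[i])
    return " ".join(words)


print(perturb("Click now to claim your free prize", spam_weight, budget=2))
```

Swapping in a different `score` callable (for example, one backed by attention weights) changes which words are targeted without touching the perturbation step.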

machine learning, natural language, spam filtering, (18 more...)

arXiv.org Artificial Intelligence

2505.03831

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology:

Information Technology > Security & Privacy > Spam Filtering (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Advancing Email Spam Detection: Leveraging Zero-Shot Learning and Large Language Models

Shirvani, Ghazaleh, Ghasemshirazi, Saeid

arXiv.org Artificial Intelligence | May-6-2025

Email spam detection is a critical task in modern communication systems, essential for maintaining productivity, security, and user experience. Traditional machine learning and deep learning approaches, while effective in static settings, face significant limitations in adapting to evolving spam tactics, addressing class imbalance, and managing data scarcity. These challenges necessitate innovative approaches that reduce dependency on extensive labeled datasets and frequent retraining. This study investigates the effectiveness of Zero-Shot Learning using FLAN-T5, combined with advanced Natural Language Processing (NLP) techniques such as BERT, for email spam detection. By employing BERT to preprocess and extract critical information from email content, and FLAN-T5 to classify emails in a Zero-Shot framework, the proposed approach aims to address the limitations of traditional spam detection systems. The integration of FLAN-T5 and BERT enables robust spam detection without relying on extensive labeled datasets or frequent retraining, making it highly adaptable to unseen spam patterns and adversarial environments. This research highlights the potential of leveraging zero-shot learning and NLP techniques for scalable and efficient spam detection, providing insights into their capability to address the dynamic and challenging nature of spam detection tasks.

large language model, machine learning, spam filtering, (19 more...)

arXiv.org Artificial Intelligence

2505.02362

Country: North America > United States (0.14)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy > Spam Filtering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Conditional Semi-Supervised Data Augmentation for Spam Message Detection with Low Resource Data

Nuha, Ulin, Lin, Chih-Hsueh

arXiv.org Artificial Intelligence | Jul-6-2024

Several machine learning schemes have attempted to detect spam messages. However, those schemes mostly require a huge amount of labeled data. The existing techniques addressing the lack of data availability have issues with effectiveness and robustness. Therefore, this paper proposes conditional semi-supervised data augmentation (CSSDA) for spam detection models that lack sufficient data. The main architecture of CSSDA comprises feature extraction and an enhanced generative network. Here, we exploit unlabeled data for data augmentation to extend the training data. The enhanced generative network in our proposed scheme produces latent variables as fake samples from unlabeled data through a conditional scheme. Latent variables can come from labeled and unlabeled data as the input for the final classifier in our spam detection model. The experimental results indicate that our proposed CSSDA achieves excellent results compared to several related methods, both those that exploit unlabeled data and those that do not. In the experiment stage with various amounts of unlabeled data, CSSDA is the only robust model that obtains a balanced accuracy of about 85% when the availability of labeled data is large. We also conduct several ablation studies to investigate our proposed scheme in detail. The results of these ablation studies support our proposed innovations. These experiments indicate that unlabeled data contributes significantly to data augmentation using the conditional semi-supervised scheme for spam detection.

data augmentation, discriminator, unlabeled data, (15 more...)

arXiv.org Artificial Intelligence

2407.0499

Country:

Asia > Taiwan > Takao Province > Kaohsiung (0.04)
Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)
Asia > Indonesia (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Online detection and infographic explanation of spam reviews with data drift adaptation

de Arriba-Pérez, Francisco, García-Méndez, Silvia, Leal, Fátima, Malheiro, Benedita, Burguillo, J. C.

arXiv.org Artificial Intelligence | Jun-21-2024

Spam reviews are a pervasive problem on online platforms due to their significant impact on reputation. However, research into spam detection in data streams is scarce. Another concern is the need for transparency. Consequently, this paper addresses those problems by proposing an online solution for identifying and explaining spam reviews, incorporating data drift adaptation. It integrates (i) incremental profiling, (ii) data drift detection and adaptation, and (iii) identification of spam reviews employing Machine Learning. The explainable mechanism displays a visual and textual prediction explanation in a dashboard. The best results reached up to 87% spam F-measure. Keywords: Data drift, interpretability and explainability, Natural Language Processing, online Machine Learning, spam detection.
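As a rough illustration of what data drift detection on a stream can look like, the Python sketch below flags drift when the error rate over a recent window rises well above the long-run error rate. This is a generic window-based heuristic, not the detector used in the paper; the class name `WindowDriftDetector` and its `window` and `threshold` parameters are assumptions made for the example.

```python
from collections import deque
from typing import Deque


class WindowDriftDetector:
    """Signals drift when the recent error rate exceeds the long-run rate by `threshold`."""

    def __init__(self, window: int = 50, threshold: float = 0.2) -> None:
        self.window = window
        self.threshold = threshold
        self.recent: Deque[int] = deque(maxlen=window)  # last `window` outcomes
        self.total_errors = 0
        self.total_seen = 0

    def update(self, error: bool) -> bool:
        """Record one prediction outcome; return True if drift is signalled."""
        self.recent.append(int(error))
        self.total_errors += int(error)
        self.total_seen += 1
        if len(self.recent) < self.window:
            return False  # not enough recent evidence yet
        recent_rate = sum(self.recent) / len(self.recent)
        overall_rate = self.total_errors / self.total_seen
        return recent_rate - overall_rate > self.threshold


detector = WindowDriftDetector(window=10, threshold=0.2)
for _ in range(100):
    detector.update(False)  # a long run of correct predictions
print([detector.update(True) for _ in range(5)])  # a sudden run of errors
```

In an online pipeline, a drift signal would typically trigger some form of adaptation, such as retraining or resetting the classifier. That step is left out here.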

classification, detection, spam detection, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.15388/24-INFOR562

2406.15038

Country:

Europe > Spain (0.05)
Europe > Portugal > Porto > Porto (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre:

Overview (0.93)
Research Report > Experimental Study (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.68)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Security & Privacy > Spam Filtering (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(4 more...)

Evaluating the Performance of ChatGPT for Spam Email Detection

Si, Shijing, Wu, Yuwei, Tang, Le, Zhang, Yugui, Wosik, Jedrek

arXiv.org Artificial Intelligence | Jun-19-2024

Email continues to be a pivotal and extensively utilized communication medium within professional and commercial domains. Nonetheless, the prevalence of spam emails poses a significant challenge for users, disrupting their daily routines and diminishing productivity. Consequently, accurately identifying and filtering spam based on content has become crucial for cybersecurity. Recent advancements in natural language processing, particularly with large language models like ChatGPT, have shown remarkable performance in tasks such as question answering and text generation. However, ChatGPT's potential in spam identification remains underexplored. To fill this gap, this study evaluates ChatGPT's capabilities for spam identification on both English and Chinese email datasets. We employ ChatGPT for spam email detection using in-context learning, which requires a prompt instruction and a few demonstrations. We also investigate how the number of demonstrations in the prompt affects the performance of ChatGPT. For comparison, we also implement five popular benchmark methods, including naive Bayes, support vector machines (SVM), logistic regression (LR), feedforward dense neural networks (DNN), and BERT classifiers. Extensive experiments show that ChatGPT performs significantly worse than deep supervised learning methods on the large English dataset, while it presents superior performance on the low-resourced Chinese dataset.
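The in-context learning setup described in this abstract (a prompt instruction plus a few demonstrations) can be sketched as plain prompt assembly. The Python example below is illustrative only: the prompt wording, the `Email:`/`Label:` layout, and the function name `build_prompt` are assumptions, not the format used in the paper, and no model call is made.

```python
from typing import List, Tuple

Demo = Tuple[str, str]  # (email text, label), where label is "spam" or "ham"


def build_prompt(instruction: str, demos: List[Demo], email: str, k: int) -> str:
    """Assemble an in-context prompt from an instruction and the first k demonstrations."""
    parts = [instruction.strip()]
    for text, label in demos[:k]:
        parts.append(f"Email: {text}\nLabel: {label}")
    # The query email goes last, with an empty label for the model to complete.
    parts.append(f"Email: {email}\nLabel:")
    return "\n\n".join(parts)


demos = [("Win a free prize now", "spam"), ("Meeting moved to 3pm", "ham")]
print(build_prompt("Classify each email as spam or ham.", demos, "Lunch tomorrow?", k=2))
```

Varying `k` is one simple way to study how the number of demonstrations affects the result. With `k=0` the same function produces a zero-shot prompt.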

chatgpt, dataset, detection, (13 more...)

arXiv.org Artificial Intelligence

2402.15537

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > New Jersey (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > North Carolina (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Sports > Hockey (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Zero-Shot Spam Email Classification Using Pre-trained Large Language Models

Rojas-Galeano, Sergio

arXiv.org Artificial Intelligence | May-24-2024

This paper investigates the application of pre-trained large language models (LLMs) for spam email classification using zero-shot prompting. We evaluate the performance of both open-source (Flan-T5) and proprietary LLMs (ChatGPT, GPT-4) on the well-known SpamAssassin dataset. Two classification approaches are explored: (1) truncated raw content from email subject and body, and (2) classification based on summaries generated by ChatGPT. Our empirical analysis, leveraging the entire dataset for evaluation without further training, reveals promising results. Flan-T5 achieves a 90% F1-score on the truncated content approach, while GPT-4 reaches a 95% F1-score using summaries. While these initial findings on a single dataset suggest the potential for classification pipelines of LLM-based subtasks (e.g., summarisation and classification), further validation on diverse datasets is necessary. The high operational costs of proprietary models, coupled with the general inference costs of LLMs, could significantly hinder real-world deployment for spam filtering.
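For readers unfamiliar with the F1-score quoted above, it is the harmonic mean of precision and recall for the positive (spam) class. The short Python sketch below computes it from label lists. The function name, the 0/1 label encoding, and the toy data are assumptions for illustration and have no connection to the SpamAssassin results.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Compute F1 for the `positive` class from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: 1 = spam, 0 = ham.
print(f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
```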

classification, detection, email, (12 more...)

arXiv.org Artificial Intelligence

2405.15936

Country: South America > Colombia > Bogotá D.C. > Bogotá (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Novel Interpretable and Robust Web-based AI Platform for Phishing Email Detection

Al-Subaiey, Abdulla, Al-Thani, Mohammed, Alam, Naser Abdullah, Antora, Kaniz Fatema, Khandakar, Amith, Zaman, SM Ashfaq Uz

arXiv.org Artificial Intelligence | May-19-2024

Phishing emails continue to pose a significant threat, causing financial losses and security breaches. This study addresses limitations in existing research, such as reliance on proprietary datasets and lack of real-world application, by proposing a high-performance machine learning model for email classification. Utilizing a comprehensive dataset, the largest publicly available, the model achieves an F1 score of 0.99 and is designed for deployment within relevant applications. Additionally, Explainable AI (XAI) is integrated to enhance user trust. This research offers a practical and highly accurate solution, contributing to the fight against phishing by empowering users with a real-time web-based application for phishing email detection.

accuracy, dataset, email, (12 more...)

arXiv.org Artificial Intelligence

2405.11619

Country:

Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Asia > Bangladesh (0.04)
South America > Brazil (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

ExplainableDetector: Exploring Transformer-based Language Modeling Approach for SMS Spam Detection with Explainability Analysis

Uddin, Mohammad Amaz, Islam, Muhammad Nazrul, Maglaras, Leandros, Janicke, Helge, Sarker, Iqbal H.

arXiv.org Artificial Intelligence | May-12-2024

SMS, or short messaging service, is a widely used and cost-effective communication medium that has sadly turned into a haven for unwanted messages, commonly known as SMS spam. With the rapid adoption of smartphones and Internet connectivity, SMS spam has emerged as a prevalent threat. Spammers have taken notice of the significance of SMS for mobile phone users. Consequently, with the emergence of new cybersecurity threats, the volume of SMS spam has grown significantly in recent years. The unstructured format of SMS data creates significant challenges for SMS spam detection, making it more difficult to successfully fight spam attacks in the cybersecurity domain. In this work, we employ optimized and fine-tuned transformer-based Large Language Models (LLMs) to solve the problem of spam message detection. We use a benchmark SMS spam dataset for this spam detection task, apply several preprocessing techniques to obtain clean and noise-free data, and address the class imbalance problem using text augmentation. The overall experiment showed that our optimized fine-tuned BERT (Bidirectional Encoder Representations from Transformers) variant model RoBERTa achieved a high accuracy of 99.84%. We also work with Explainable Artificial Intelligence (XAI) techniques to calculate the positive and negative coefficient scores, which explore and explain the transparency of the fine-tuned model in this text-based spam SMS detection task. In addition, traditional Machine Learning (ML) models were also examined to compare their performance with the transformer-based models. This analysis describes how LLMs can make a good impact on complex text-based spam data in the cybersecurity field.

dataset, explainability analysis, prediction, (14 more...)

arXiv.org Artificial Intelligence

2405.08026

Country:

Oceania > Australia > Western Australia > Perth (0.04)
Europe > Italy > Sicily (0.04)
Asia > China (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
