Clickbait


An Interpretable Benchmark for Clickbait Detection and Tactic Attribution

Nofar, Lihi, Portal, Tomer, Elbaz, Aviv, Apartsin, Alexander, Aperstein, Yehudit

arXiv.org Artificial Intelligence

The proliferation of clickbait headlines poses significant challenges to the credibility of information and user trust in digital media. While recent advances in machine learning have improved the detection of manipulative content, the lack of explainability limits their practical adoption. This paper presents a model for explainable clickbait detection that not only identifies clickbait titles but also attributes them to specific linguistic manipulation strategies. We introduce a synthetic dataset generated by systematically augmenting real news headlines using a predefined catalogue of clickbait strategies. This dataset enables controlled experimentation and detailed analysis of model behaviour. We present a two-stage framework for automatic clickbait analysis comprising detection and tactic attribution. In the first stage, we compare a fine-tuned BERT classifier with large language models (LLMs), specifically GPT-4.0 and Gemini 2.4 Flash, under both zero-shot prompting and few-shot prompting enriched with illustrative clickbait headlines and their associated persuasive tactics. In the second stage, a dedicated BERT-based classifier predicts the specific clickbait strategies present in each headline. We share the dataset with the research community at https://github.com/LLM-HITCS25S/ClickbaitTacticsDetection

The widespread use of clickbait headlines in digital media has become a pervasive challenge, undermining the credibility of information and exploiting user attention through manipulative linguistic techniques. While automated systems for detecting clickbait have improved in recent years, their focus has remained mainly on binary classification, simply labelling content as clickbait or not. However, effective mitigation of such content requires going beyond detection to understanding how and why certain headlines manipulate readers. Specifically, it is crucial to evaluate whether current AI models can accurately recognize and distinguish the diverse linguistic styles and persuasive strategies commonly employed in clickbait.
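
The two-stage structure described above (binary detection, then tactic attribution) can be sketched in miniature. The paper's fine-tuned BERT models are not reproduced here; this sketch substitutes a TF-IDF/logistic-regression stand-in, and the headlines and tactic names are illustrative, not taken from the paper's catalogue:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy training data; labels and tactic names are hypothetical examples.
headlines = [
    "You won't believe what this doctor found",
    "10 secrets celebrities don't want you to know",
    "This one trick will change your life forever",
    "Government publishes annual budget report",
    "Local council approves new road maintenance plan",
    "Central bank holds interest rates steady",
]
is_clickbait = [1, 1, 1, 0, 0, 0]
tactics = [["curiosity_gap"], ["listicle", "curiosity_gap"], ["exaggeration"], [], [], []]

# Stage 1: binary clickbait detection.
detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(headlines, is_clickbait)

# Stage 2: multi-label tactic attribution, trained only on the clickbait examples.
mlb = MultiLabelBinarizer()
tactic_y = mlb.fit_transform([t for t, y in zip(tactics, is_clickbait) if y])
clickbait_only = [h for h, y in zip(headlines, is_clickbait) if y]
attributor = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LogisticRegression()))
attributor.fit(clickbait_only, tactic_y)

def analyze(headline):
    """Return the predicted tactic labels, or an empty list if not clickbait."""
    if detector.predict([headline])[0] == 0:
        return []
    return list(mlb.inverse_transform(attributor.predict([headline]))[0])
```

Training the attributor only on positives mirrors the pipeline's division of labour: the second-stage classifier never has to model non-clickbait headlines.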


What Makes You CLIC: Detection of Croatian Clickbait Headlines

Anđelić, Marija, Šipek, Dominik, Majer, Laura, Šnajder, Jan

arXiv.org Artificial Intelligence

Online news outlets operate predominantly on an advertising-based revenue model, compelling journalists to create headlines that are often scandalous, intriguing, and provocative -- commonly referred to as clickbait. Automatic detection of clickbait headlines is essential for preserving information quality and reader trust in digital media and requires both contextual understanding and world knowledge. For this task, particularly in less-resourced languages, it remains unclear whether fine-tuned methods or in-context learning (ICL) yield better results. In this paper, we compile CLIC, a novel dataset for clickbait detection of Croatian news headlines spanning a 20-year period and encompassing mainstream and fringe outlets. We fine-tune the BERTić model on this task and compare its performance to LLM-based ICL methods with prompts both in Croatian and English. Finally, we analyze the linguistic properties of clickbait. We find that nearly half of the analyzed headlines contain clickbait, and that fine-tuned models deliver better results than general LLMs.


Te Ahorré Un Click: A Revised Definition of Clickbait and Detection in Spanish News

Mordecki, Gabriel, Moncecchi, Guillermo, Couto, Javier

arXiv.org Artificial Intelligence

We revise the definition of clickbait, which lacks current consensus, and argue that the creation of a curiosity gap is the key concept that distinguishes clickbait from related phenomena such as sensationalism and headlines that do not deliver what they promise or diverge from the article. We therefore propose a new definition: clickbait is a technique for generating headlines and teasers that deliberately omit part of the information with the goal of raising readers' curiosity, capturing their attention and enticing them to click. We introduce a new approach to creating clickbait detection datasets, refining the concept's limits and annotation criteria to minimize subjectivity in the decision as much as possible. Following this approach, we create and release TA1C (for Te Ahorré Un Click, Spanish for Saved You A Click), the first open-source dataset for clickbait detection in Spanish. It consists of 3,500 tweets from 18 well-known media sources, manually annotated with a Fleiss' κ inter-annotator agreement of 0.825. We implement strong baselines that achieve an F1-score of 0.84.
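
The reported 0.825 Fleiss' κ is a chance-corrected agreement statistic computed from a matrix of per-item category counts. A small self-contained implementation of the standard Fleiss' κ formula (not the authors' code) shows how it is derived:

```python
import numpy as np

def fleiss_kappa(ratings):
    """Fleiss' kappa for an (n_items, n_categories) matrix of rating counts.

    Each row holds, for one item, how many raters assigned each category;
    every item is assumed to be rated by the same number of raters.
    """
    ratings = np.asarray(ratings, dtype=float)
    n = ratings.sum(axis=1)[0]                        # raters per item
    p_j = ratings.sum(axis=0) / ratings.sum()         # overall category proportions
    p_i = (np.square(ratings).sum(axis=1) - n) / (n * (n - 1))  # per-item agreement
    p_bar = p_i.mean()                                # observed agreement
    p_e = np.square(p_j).sum()                        # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)
```

For example, three items each rated identically by three annotators (`[[3, 0], [0, 3], [3, 0]]`) yield κ = 1.0, while maximally split ratings drive κ below zero.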


Baitradar: A Multi-Model Clickbait Detection Algorithm Using Deep Learning

Gamage, Bhanuka, Labib, Adnan, Joomun, Aisha, Lim, Chern Hong, Wong, KokSheik

arXiv.org Artificial Intelligence

Following the rising popularity of YouTube, an emerging problem on the platform is clickbait, which provokes users into clicking on videos through attractive titles and thumbnails. As a result, users end up watching videos whose content does not match what the title advertises. This study addresses the issue by proposing an algorithm called BaitRadar, a deep learning approach in which six inference models are jointly consulted to make the final classification decision. These models focus on different attributes of the video, including the title, comments, thumbnail, tags, video statistics and audio transcript. The final classification is obtained by averaging the outputs of the individual models, providing a robust and accurate result even in situations where data is missing. The proposed method is tested on 1,400 YouTube videos. On average, a test accuracy of 98% is achieved with an inference time of less than 2 s.
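
The fusion step described above, averaging six per-modality models while tolerating missing inputs, can be sketched as follows. This is a hedged illustration, assuming each model emits a probability in [0, 1]; the function and modality names are hypothetical, not BaitRadar's actual code:

```python
def fuse_modality_scores(model_scores, threshold=0.5):
    """Average the available per-modality clickbait probabilities.

    model_scores: dict mapping modality name -> probability in [0, 1],
    or None when that input is unavailable (e.g. comments disabled).
    Returns the average score and the resulting clickbait verdict.
    """
    available = [p for p in model_scores.values() if p is not None]
    if not available:
        raise ValueError("no modality produced a score")
    avg = sum(available) / len(available)
    return avg, avg >= threshold
```

Skipping `None` entries rather than imputing them is what lets the ensemble stay robust when a video lacks comments or tags: the decision simply rests on the modalities that are present.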


Multimodal Clickbait Detection by De-confounding Biases Using Causal Representation Inference

Yu, Jianxing, Wang, Shiqi, Yin, Han, Sun, Zhenlong, Xie, Ruobing, Zhang, Bo, Rao, Yanghui

arXiv.org Artificial Intelligence

This paper focuses on detecting clickbait posts on the Web. These posts often use eye-catching disinformation across mixed modalities to mislead users into clicking for profit. This harms the user experience, so such posts are typically blocked by content providers. To escape detection, malicious creators add irrelevant non-bait content to bait posts, dressing them up as legitimate to fool the detector. This content often has biased correlations with non-bait labels, yet traditional detectors tend to make predictions based on simple co-occurrence rather than grasping the inherent factors that drive malicious behavior. Such spurious bias easily causes misjudgments. To address this problem, we propose a new debiased method based on causal inference. We first employ a set of features in multiple modalities to characterize the posts. Since these features are often mixed up with unknown biases, we then disentangle three kinds of latent factors from them: the invariant factor that indicates intrinsic bait intention, the causal factor that reflects deceptive patterns in a given scenario, and non-causal noise. By eliminating the noise that causes bias, we can use the invariant and causal factors to build a robust model with good generalization ability. Experiments on three popular datasets show the effectiveness of our approach.


What Drives Online Popularity: Author, Content or Sharers? Estimating Spread Dynamics with Bayesian Mixture Hawkes

Calderon, Pio, Rizoiu, Marian-Andrei

arXiv.org Artificial Intelligence

The spread of content on social media is shaped by intertwining factors on three levels: the source, the content itself, and the pathways of content spread. At the lowest level, the popularity of the sharing user determines the content's eventual reach. However, higher-level factors such as the nature of the online item and the credibility of its source also play crucial roles in determining how widely and rapidly the online item spreads. In this work, we propose the Bayesian Mixture Hawkes (BMH) model to jointly learn the influence of source, content and spread. We formulate the BMH model as a hierarchical mixture model of separable Hawkes processes, accommodating different classes of Hawkes dynamics and the influence of feature sets on these classes. We test the BMH model on two learning tasks, cold-start popularity prediction and temporal profile generalization performance, applying it to two real-world retweet cascade datasets referencing articles from controversial and traditional media publishers. The BMH model outperforms the state-of-the-art models and predictive baselines on both datasets and utilizes cascade- and item-level information better than the alternatives. Lastly, we perform a counter-factual analysis in which we apply the trained publisher-level BMH models to a set of article headlines and show that the effectiveness of headline writing style (neutral, clickbait, inflammatory) varies across publishers. The BMH model unveils differences in style effectiveness between controversial and reputable publishers: we find clickbait to be notably more effective for reputable publishers than for controversial ones, which links to the latter's overuse of clickbait.


Generating clickbait spoilers with an ensemble of large language models

Woźny, Mateusz, Lango, Mateusz

arXiv.org Artificial Intelligence

Clickbait posts are a widespread problem on the web. The generation of spoilers, i.e. short texts that neutralize clickbait by providing the information that satisfies the curiosity it induces, is one proposed solution to the problem. Current state-of-the-art methods are based on passage retrieval or question answering approaches and are limited to generating spoilers only in the form of a phrase or a passage. In this work, we propose an ensemble of fine-tuned large language models for clickbait spoiler generation. Our approach is not limited to phrase or passage spoilers, but can also generate multipart spoilers that refer to several non-consecutive parts of the text. Experimental evaluation demonstrates that the proposed ensemble outperforms the baselines in terms of BLEU, METEOR and BERTScore metrics.


Mitigating Clickbait: An Approach to Spoiler Generation Using Multitask Learning

Pal, Sayantan, Das, Souvik, Srihari, Rohini K.

arXiv.org Artificial Intelligence

This study introduces 'clickbait spoiling', a novel technique designed to detect, categorize, and generate spoilers as succinct text responses, countering the curiosity induced by clickbait content. By leveraging a multi-task learning framework, our model's generalization capabilities are significantly enhanced, effectively addressing the pervasive issue of clickbait. The crux of our research lies in generating appropriate spoilers, be it a phrase, an extended passage, or multiple spoilers, depending on the spoiler type required. Our methodology integrates two crucial techniques: a refined spoiler categorization method and a modified version of the Question Answering (QA) mechanism, incorporated within a multi-task learning paradigm for optimized spoiler extraction from context. Notably, we include fine-tuning methods for models capable of handling longer sequences, to accommodate the generation of extended spoilers. This research highlights the potential of sophisticated text processing techniques in tackling the omnipresent issue of clickbait, promising an enhanced user experience in the digital realm.


Maintaining Journalistic Integrity in the Digital Age: A Comprehensive NLP Framework for Evaluating Online News Content

Bojic, Ljubisa, Prodanovic, Nikola, Samala, Agariadne Dwinggo

arXiv.org Artificial Intelligence

The rapid growth of online news platforms has led to an increased need for reliable methods to evaluate the quality and credibility of news articles. This paper proposes a comprehensive framework to analyze online news texts using natural language processing (NLP) techniques, particularly a language model specifically trained for this purpose, alongside other well-established NLP methods. The framework incorporates ten journalism standards (objectivity; balance and fairness; readability and clarity; sensationalism and clickbait; ethical considerations; public interest and value; source credibility; relevance and timeliness; factual accuracy; and attribution and transparency) to assess the quality of news articles. By establishing these standards, researchers, media organizations, and readers can better evaluate and understand the content they consume and produce. The proposed method has some limitations, such as potential difficulty in detecting subtle biases and the need for continuous updating of the language model to keep pace with evolving language patterns.


Not all Fake News is Written: A Dataset and Analysis of Misleading Video Headlines

Sung, Yoo Yeon, Boyd-Graber, Jordan, Hassan, Naeemul

arXiv.org Artificial Intelligence

Polarization and the marketplace for impressions have conspired to make navigating information online difficult for users, and while there has been a significant effort to detect false or misleading text, multimodal datasets have received considerably less attention. To complement existing resources, we present multimodal Video Misleading Headline (VMH), a dataset that consists of videos and whether annotators believe the headline is representative of the video's contents. After collecting and annotating this dataset, we analyze multimodal baselines for detecting misleading headlines. Our annotation process also focuses on why annotators view a video as misleading, allowing us to better understand the interplay of annotators' background and the content of the videos.