The slopaganda era: 10 AI images posted by the White House - and what they teach us
May the 4th be with you: the White House celebrates Star Wars Day. Under Donald Trump, the White House has filled its social media with memes, wishcasting, nostalgia and deepfakes. Here's what you need to know to navigate the trolling. It started with an image of Trump as a king mocked up on a fake Time magazine cover. Since then it's developed into a full-blown phenomenon, one academics are calling "slopaganda" - an unholy alliance of easily available AI tools and political messaging.
- North America > Greenland (0.06)
- North America > United States > New York (0.06)
- Europe > Ukraine (0.05)
- (3 more...)
Why Everyone Is Suddenly in a 'Very Chinese Time' in Their Lives
It's a symbol of what Americans believe their own country has lost. In case you didn't get the memo, everyone is feeling very Chinese these days. Across social media, people are proclaiming that "You met me at a very Chinese time of my life," while performing stereotypically Chinese-coded activities like eating dim sum or wearing the viral Adidas Chinese jacket. The trend blew up so much in recent weeks that celebrities like comedian Jimmy O Yang and influencer Hasan Piker even got in on it. It has now evolved into variations like "Chinamaxxing" (acting increasingly more Chinese) and "u will turn Chinese tomorrow" (a kind of affirmation or blessing).
- Asia > China (0.81)
- South America > Venezuela > Capital District > Caracas (0.04)
- North America > United States > California (0.04)
- (3 more...)
- Information Technology (0.95)
- Government > Regional Government (0.70)
CAuSE: Decoding Multimodal Classifiers using Faithful Natural Language Explanation
Bandyopadhyay, Dibyanayan, Bhattacharjee, Soham, Hasanuzzaman, Mohammed, Ekbal, Asif
Multimodal classifiers function as opaque black box models. While several techniques exist to interpret their predictions, very few of them are as intuitive and accessible as natural language explanations (NLEs). To build trust, such explanations must faithfully capture the classifier's internal decision-making behavior, a property known as faithfulness. In this paper, we propose CAuSE (Causal Abstraction under Simulated Explanations), a novel framework to generate faithful NLEs for any pretrained multimodal classifier. We demonstrate that CAuSE generalizes across datasets and models through extensive empirical evaluations. Theoretically, we show that CAuSE, trained via interchange intervention, forms a causal abstraction of the underlying classifier. We further validate this through a redesigned metric for measuring causal faithfulness in multimodal settings. CAuSE surpasses other methods on this metric, with qualitative analysis reinforcing its advantages. We perform detailed error analysis to pinpoint the failure cases of CAuSE. For replicability, we make the code available at https://github.com/newcodevelop/CAuSE
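The interchange-intervention idea the abstract mentions can be illustrated on a toy network: run a classifier on a "base" input, swap in an intermediate activation computed from a "source" input, and check that the output changes exactly as the causal abstraction predicts. A minimal numpy sketch (the two-layer network and all names are illustrative, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))   # toy first layer
W2 = rng.standard_normal((8, 2))   # toy output layer

def hidden(x):
    return np.tanh(x @ W1)

def classify(x, h_override=None):
    h = hidden(x) if h_override is None else h_override
    return int(np.argmax(h @ W2))

base   = rng.standard_normal(4)
source = rng.standard_normal(4)

# Interchange intervention: replace the base input's hidden state
# with the hidden state computed from the source input.
intervened = classify(base, h_override=hidden(source))

# Because the hidden layer fully mediates this toy decision, the
# intervened prediction must equal the source input's own prediction.
assert intervened == classify(source)
```

In CAuSE the analogous check is done against the explanation generator rather than a hand-built toy, but the mechanics of the intervention are the same: intervene on an internal state and verify behavioral agreement.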
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (3 more...)
- Transportation (0.76)
- Health & Medicine (0.46)
ExPO-HM: Learning to Explain-then-Detect for Hateful Meme Detection
Mei, Jingbiao, Sun, Mingsheng, Chen, Jinghong, Qin, Pengda, Li, Yuhong, Chen, Da, Byrne, Bill
Hateful memes have emerged as a particularly challenging form of online abuse, motivating the development of automated detection systems. Most prior approaches rely on direct detection, producing only binary predictions. Such models fail to provide the context and explanations that real-world moderation requires. Recent Explain-then-Detect approaches, using Chain-of-Thought prompting or LMM agents, perform worse than simple SFT baselines, and even advanced post-training methods such as GRPO fail to close the gap. Our analysis identifies two key issues in such systems: important policy-relevant cues such as targets and attack types are not hypothesized by the model as a likely explanation; and the binary reward signal is insufficient to guide reasoning. To address these challenges, we propose ExPO-HM (Explain-then-Detect Policy Optimization for Hateful Memes), inspired by the training and evaluation process of human annotators. ExPO-HM combines SFT warmup, GRPO with curriculum learning, and Conditional Decision Entropy (CDE) as both metric and reward for reasoning quality. Across three hateful meme benchmarks, ExPO-HM achieves state-of-the-art performance on binary detection, fine-grained classification, and reasoning quality, with up to 15% and 17% F1 improvement over the GRPO and DPO baselines, respectively. By moving hateful meme detection from simple binary alarms to explanation-driven detection, ExPO-HM provides accurate, interpretable, and actionable moderation support.
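The abstract does not spell out how Conditional Decision Entropy is computed. As a rough illustration of an entropy-style reasoning-quality signal (an assumed formulation, not the paper's), one can reward how much the model's decision entropy drops once the decision is conditioned on its own explanation:

```python
import math

def entropy(p):
    """Binary entropy in bits; p is P(hateful)."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Hypothetical predictive probabilities for one meme:
p_without_explanation = 0.55   # nearly undecided
p_with_explanation    = 0.95   # explanation pins the decision down

# Entropy reduction: larger is better, i.e. the generated
# explanation actually informs the final decision.
reward = entropy(p_without_explanation) - entropy(p_with_explanation)
assert reward > 0
```

A signal like this is denser than the binary correct/incorrect reward the abstract criticizes, which is what makes it usable inside GRPO-style policy optimization.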
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Asia > China (0.04)
- (10 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- (2 more...)
Enhancing Meme Emotion Understanding with Multi-Level Modality Enhancement and Dual-Stage Modal Fusion
Shi, Yi, Meng, Wenlong, Guo, Zhenyuan, Wei, Chengkun, Chen, Wenzhi
With the rapid rise of social media and Internet culture, memes have become a popular medium for expressing emotional tendencies. This has sparked growing interest in Meme Emotion Understanding (MEU), which aims to classify the emotional intent behind memes by leveraging their multimodal contents. While existing efforts have achieved promising results, two major challenges remain: (1) a lack of fine-grained multimodal fusion strategies, and (2) insufficient mining of memes' implicit meanings and background knowledge. To address these challenges, we propose MemoDetector, a novel framework for advancing MEU. First, we introduce a four-step textual enhancement module that utilizes the rich knowledge and reasoning capabilities of Multimodal Large Language Models (MLLMs) to progressively infer and extract implicit and contextual insights from memes. These enhanced texts significantly enrich the original meme contents and provide valuable guidance for downstream classification. Next, we design a dual-stage modal fusion strategy: the first stage performs shallow fusion on raw meme image and text, while the second stage deeply integrates the enhanced visual and textual features. This hierarchical fusion enables the model to better capture nuanced cross-modal emotional cues. Experiments on two datasets, MET-MEME and MOOD, demonstrate that our method consistently outperforms state-of-the-art baselines. Specifically, MemoDetector improves F1 scores by 4.3% on MET-MEME and 3.4% on MOOD. Further ablation studies and in-depth analyses validate the effectiveness and robustness of our approach, highlighting its strong potential for advancing MEU. Our code is available at https://github.com/singing-cat/MemoDetector.
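The dual-stage fusion described above can be sketched numerically: stage one combines the raw image and text features, stage two then integrates the MLLM-enhanced features with that shallow representation. A toy numpy version (the concatenation-plus-gate design and all weights are illustrative assumptions, not MemoDetector's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
img_raw, txt_raw = rng.standard_normal(d), rng.standard_normal(d)   # raw modality features
img_enh, txt_enh = rng.standard_normal(d), rng.standard_normal(d)   # MLLM-enhanced features

# Stage 1: shallow fusion of the raw image and text features
# (concatenation followed by a linear projection).
W_shallow = rng.standard_normal((2 * d, d))
shallow = np.concatenate([img_raw, txt_raw]) @ W_shallow

# Stage 2: deep fusion integrates the enhanced features with the
# shallow representation through a learned sigmoid gate.
W_gate = rng.standard_normal((3 * d, d))
gate = 1 / (1 + np.exp(-np.concatenate([shallow, img_enh, txt_enh]) @ W_gate))
fused = gate * img_enh + (1 - gate) * txt_enh + shallow

assert fused.shape == (d,)
```

The point of the hierarchy is that stage two sees both the enriched features and what stage one already extracted, rather than fusing everything in one shot.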
- Research Report (1.00)
- Overview (1.00)
- Europe > Austria > Vienna (0.14)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > Dominican Republic (0.04)
- (12 more...)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
- Law (0.68)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)
- North America > United States (0.14)
- North America > Canada (0.04)
- Europe > Poland (0.04)
- Information Technology > Services (1.00)
- Law Enforcement & Public Safety > Terrorism (0.68)
TRACE: Textual Relevance Augmentation and Contextual Encoding for Multimodal Hate Detection
Koushik, Girish A., Treharne, Helen, Joshi, Aditya, Kanojia, Diptesh
Social media memes are a challenging domain for hate detection because they intertwine visual and textual cues into culturally nuanced messages. To tackle these challenges, we introduce TRACE, a hierarchical multimodal framework that leverages visually grounded context augmentation, along with a novel caption-scoring network to emphasize hate-relevant content, and parameter-efficient fine-tuning of CLIP's text encoder. Our experiments demonstrate that selectively fine-tuning deeper text encoder layers significantly enhances performance compared to simpler projection-layer fine-tuning methods. Specifically, our framework achieves state-of-the-art accuracy (0.807) and F1-score (0.806) on the widely-used Hateful Memes dataset, matching the performance of considerably larger models while maintaining efficiency. Moreover, it achieves superior generalization on the MultiOFF offensive meme dataset (F1-score 0.673), highlighting robustness across meme categories. Additional analyses confirm that robust visual grounding and nuanced text representations significantly reduce errors caused by benign confounders. We publicly release our code to facilitate future research.
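The abstract's key recipe, freezing most of the text encoder and fine-tuning only its deeper layers, comes down to a parameter-selection rule like the following (the 12-layer naming scheme and the cutoff of three layers are illustrative assumptions; TRACE applies the idea to CLIP's text encoder):

```python
# Toy parameter registry standing in for a 12-layer text encoder
# plus a projection head.
params = {f"text_encoder.layer{i}.weight": object() for i in range(12)}
params["projection.weight"] = object()

UNFREEZE_FROM = 9   # fine-tune only the last three encoder layers (assumed cutoff)

def is_trainable(name):
    if name.startswith("projection"):
        return True                        # projection head always trains
    layer = int(name.split("layer")[1].split(".")[0])
    return layer >= UNFREEZE_FROM          # deeper layers only

trainable = [n for n in params if is_trainable(n)]
assert trainable == ["text_encoder.layer9.weight",
                     "text_encoder.layer10.weight",
                     "text_encoder.layer11.weight",
                     "projection.weight"]
```

In a real training loop the same predicate would set `requires_grad` per parameter; the finding reported above is that unfreezing these deeper text layers beats tuning only the projection layer.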
- North America > United States (0.28)
- Europe > Ukraine (0.14)
- Asia > Russia (0.14)
- (6 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Leveraging LLMs for Context-Aware Implicit Textual and Multimodal Hate Speech Detection
Brook, Joshua Wolfe, Markov, Ilia
This research introduces a novel approach to textual and multimodal Hate Speech Detection (HSD), using Large Language Models (LLMs) as dynamic knowledge bases to generate background context and incorporate it into the input of HSD classifiers. Two context generation strategies are examined: one focused on named entities and the other on full-text prompting. Four methods of incorporating context into the classifier input are compared: text concatenation, embedding concatenation, a hierarchical transformer-based fusion, and LLM-driven text enhancement. Experiments are conducted on the textual Latent Hatred dataset of implicit hate speech and applied in a multimodal setting on the MAMI dataset of misogynous memes. Results suggest that both the contextual information and the method by which it is incorporated are key, with gains of up to 3 and 6 F1 points on textual and multimodal setups respectively, from a zero-context baseline to the highest-performing system, based on embedding concatenation.
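Of the four context-incorporation methods compared, the best-performing one, embedding concatenation, amounts to encoding the post and the LLM-generated context separately and joining the vectors before the classifier head. A minimal sketch with a stand-in encoder (the encoder and all strings are placeholders, not the paper's pipeline):

```python
import numpy as np

def encode(text, dim=16):
    """Stand-in sentence encoder: a hash-seeded random vector."""
    seed = abs(hash(text)) % 2**32
    return np.random.default_rng(seed).standard_normal(dim)

post    = "an implicitly hateful post"                        # placeholder input
context = "LLM-generated background about the targeted group" # placeholder LLM output

# Embedding concatenation: encode post and context separately,
# then join the vectors as input to the classifier head.
features = np.concatenate([encode(post), encode(context)])
assert features.shape == (32,)
```

The contrast with text concatenation is that here the context never competes with the post for the encoder's input budget; each gets its own embedding, and the classifier learns how to weigh them.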
- North America > United States > Washington > King County > Seattle (0.14)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
- (11 more...)
- Law Enforcement & Public Safety > Terrorism (0.68)
- Law (0.68)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)