authenticity
Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection
Misinformation detectors often rely on superficial cues (i.e., shortcuts) that correlate with misinformation in training data but fail to generalize to the diverse and evolving nature of real-world misinformation. This issue is exacerbated by large language models (LLMs), which can easily generate convincing misinformation using simple prompts. We introduce TRUTHOVERTRICKS, a unified evaluation paradigm for measuring shortcut learning in misinformation detection. TRUTHOVERTRICKS categorizes shortcut behaviors into intrinsic shortcut induction and extrinsic shortcut injection, and evaluates seven representative detectors across 14 popular benchmarks, along with two new factual misinformation datasets, NQ-Misinfo and Streaming-Misinfo. Empirical results reveal that existing detectors suffer severe performance degradation when exposed to both naturally occurring and adversarially crafted shortcuts. To address this, we propose the Shortcut Mitigation Framework (SMF), an LLM-augmented data augmentation framework that mitigates shortcut reliance through paraphrasing, factual summarization, and sentiment normalization. SMF consistently enhances robustness across 16 benchmarks, forcing models to rely on deeper semantic understanding rather than shortcut cues.
How human error became a weapon against large language models
Alan Turing proposed a test for machine intelligence: could a computer convince a human it was human? Recently, a friend told me over coffee about some disheartening feedback she had received. "They said it was good," she said, "but that it read like it was written by AI." Knowing her, I understood immediately what had happened. Her credibility was being questioned not because her work was poor, but because it was too good - too clear, too fluent, too polished. The rapid acceleration of artificial intelligence tools is changing how we think about good writing.
Reddit's human content wins amid the AI flood
For Ines Tan there's one particular site she turns to again and again for advice - and that's Reddit. Tan, who works in communications, regularly jumps on the site for skincare advice, to view reactions to shows she watches, such as The Traitors, and for help planning her upcoming wedding in May. "It's a very empathetic place," she says of Reddit. "For my wedding, I've found help emotionally, logistically and inspiration-wise." Tan believes people are consulting the online discussion platform more as they're craving human interaction in the world of increasing AI slop.
The Most Powerful Politics Influencers Barely Post About Politics
New research shows that social media creators have enormous influence over their audiences' politics, especially those who don't normally share political content. Donald Trump's appearances on the podcasts of Joe Rogan and Theo Von, among others, were seen by many as a key part of securing his second term in office. But while Trump was speculating about alien life on Mars with Rogan, he had a team of acolytes appearing on dozens, if not hundreds, of much smaller niche podcasts hosted by right-wing content creators who typically don't talk about politics. This is how, just six days before the election, Kash Patel, the man now struggling to run the FBI, ended up appearing on the livestream, a fringe, QAnon-infused show hosted on a platform called Pilled. "The Deep State exists," Patel told the audience.
User Negotiations of Authenticity, Ownership, and Governance on AI-Generated Video Platforms: Evidence from Sora
Shen, Bohui, Bhatta, Shrikar, Ireebanije, Alex, Liu, Zexuan, Choudhry, Abhinav, Gumusel, Ece, Zhou, Kyrie Zhixuan
As AI-generated video platforms rapidly advance, ethical challenges such as copyright infringement emerge. This study examines how users make sense of AI-generated videos on OpenAI's Sora by conducting a qualitative content analysis of user comments. Through a thematic analysis, we identified four dynamics that characterize how users negotiate authenticity, authorship, and platform governance on Sora. First, users acted as critical evaluators of realism, assessing micro-details such as lighting, shadows, fluid motion, and physics to judge whether AI-generated scenes could plausibly exist. Second, users increasingly shifted from passive viewers to active creators, expressing curiosity about prompts, techniques, and creative processes. Text prompts were perceived as intellectual property, generating concerns about plagiarism and remixing norms. Third, users reported blurred boundaries between real and synthetic media, worried about misinformation, and even questioned the authenticity of other commenters, suspecting bot-generated engagement. Fourth, users contested platform governance: some perceived moderation as inconsistent or opaque, while others shared tactics for evading prompt censorship through misspellings, alternative phrasing, emojis, or other languages. Despite this, many users also enforced ethical norms by discouraging the misuse of real people's images or disrespectful content. Together, these patterns highlighted how AI-mediated platforms complicate notions of reality, creativity, and rule-making in emerging digital ecosystems. Based on the findings, we discuss governance challenges in Sora and how user negotiations inform future platform governance.
SDQM: Synthetic Data Quality Metric for Object Detection Dataset Evaluation
Zenith, Ayush, Zumbrun, Arnold, Raut, Neel, Lin, Jing
The performance of machine learning models depends heavily on training data. The scarcity of large-scale, well-annotated datasets poses significant challenges in creating robust models. To address this, synthetic data generated through simulations and generative models has emerged as a promising solution, enhancing dataset diversity and improving the performance, reliability, and resilience of models. However, evaluating the quality of this generated data requires an effective metric. This paper introduces the Synthetic Dataset Quality Metric (SDQM) to assess data quality for object detection tasks without requiring model training to converge. This metric enables more efficient generation and selection of synthetic datasets, addressing a key challenge in resource-constrained object detection tasks. In our experiments, SDQM demonstrated a strong correlation with the mean Average Precision (mAP) scores of YOLOv11, a leading object detection model, while previous metrics only exhibited moderate or weak correlations. Additionally, it provides actionable insights for improving dataset quality, minimizing the need for costly iterative training. This scalable and efficient metric sets a new standard for evaluating synthetic data.
There's Never Been a Worse Time to Be Authentic at Work
There's Never Been a Worse Time to Be Authentic at Work Workers have been told to bring themselves to work, only to be disappointed time and time again, argues author Jodi-Ann Burey in her new book. Jodi-Ann Burey was only two weeks into her new role as an inclusion marketing manager for an outdoor retail company when she was accused of having a "race agenda." Burey, who is Black, was no stranger to workplace hypocrisy; as she sees it, the office is a petri dish where the knotty dynamics of society are concentrated. At the time of the accusation in February 2020, however, all she could do was laugh. "I was like, you knew who I was before you poached me. This is exactly what you wanted me to do," she says over Zoom.
A Dynamic Knowledge Update-Driven Model with Large Language Models for Fake News Detection
Jin, Di, Yang, Jun, Wang, Xiaobao, Zhang, Junwei, Li, Shuqi, He, Dongxiao
As the Internet and social media evolve rapidly, distinguishing credible news from a vast amount of complex information poses a significant challenge. Due to the suddenness and instability of news events, the authenticity labels of news can potentially shift as events develop, making it crucial for fake news detection to obtain the latest event updates. Existing methods employ retrieval-augmented generation to fill knowledge gaps, but they suffer from issues such as insufficient credibility of retrieved content and interference from noisy information. We propose a DYnamic kNowledge updAte-driven MOdel for fake news detection (DYNAMO), which leverages knowledge graphs to achieve continuous updating of new knowledge and integrates with large language models to fulfill dual functions: news authenticity detection and verification of new knowledge correctness, solving the two key problems of ensuring the authenticity of new knowledge and deeply mining news semantics. Specifically, we first construct a news-domain-specific knowledge graph. Then, we use Monte Carlo Tree Search to decompose complex news and verify them step by step. Finally, we extract and update new knowledge from verified real news texts and reasoning paths. Experimental results demonstrate that DYNAMO achieves the best performance on two real-world datasets.
SCAR: A Characterization Scheme for Multi-Modal Dataset
Su, Ri, Chen, Zhao, Cao, Caleb Chen, Tang, Nan, Chen, Lei
Foundation models exhibit remarkable generalization across diverse tasks, largely driven by the characteristics of their training data. Recent data-centric methods like pruning and compression aim to optimize training but offer limited theoretical insight into how data properties affect generalization, especially the data characteristics in sample scaling. Traditional perspectives further constrain progress by focusing predominantly on data quantity and training efficiency, often overlooking structural aspects of data quality. In this study, we introduce SCAR, a principled scheme for characterizing the intrinsic structural properties of datasets across four key measures: Scale, Coverage, Authenticity, and Richness. Unlike prior data-centric measures, SCAR captures stable characteristics that remain invariant under dataset scaling, providing a robust and general foundation for data understanding. Leveraging these structural properties, we introduce Foundation Data, a minimal subset that preserves the generalization behavior of the full dataset without requiring model-specific retraining. We model single-modality tasks as step functions and estimate the distribution of the foundation data size to capture step-wise generalization bias across modalities in the target multi-modal dataset. Finally, we develop a SCAR-guided data completion strategy based on this generalization bias, which enables efficient, modality-aware expansion of modality-specific characteristics in multimodal datasets. Experiments across diverse multi-modal datasets and model architectures validate the effectiveness of SCAR in predicting data utility and guiding data acquisition. Code is available at https://github.com/McAloma/SCAR.
Navigating the New Landscape: A Conceptual Model for Project-Based Assessment (PBA) in the Age of GenAI
Kadel, Rajan, Shailendra, Samar, Saxena, Urvashi Rahul
The rapid integration of Generative Artificial Intelligence (GenAI) into higher education presents both opportunities and challenges for assessment design, particularly within Project-Based Assessment (PBA) contexts. Traditional assessment methods often emphasise the final product in the PBA, which can now be significantly influenced or created by GenAI tools, raising concerns regarding product authenticity, academic integrity, and learning validation. This paper advocates for a reimagined assessment model for Project-Based Learning (PBL) or a capstone project that prioritises process-oriented evaluation, multi-modal and multifaceted assessment design, and ethical engagement with GenAI to enable higher-order thinking. The model also emphasises the use of (GenAI-assisted) personalised feedback by a supervisor as an observance of the learning process during the project lifecycle. A use case scenario is provided to illustrate the application of the model in a capstone project setting. The paper concludes with recommendations for educators and curriculum designers to ensure that assessment practices remain robust, learner-centric, and integrity-driven in the evolving landscape of GenAI.