AITopics | Media

In the Eye of the Beholder: Robust Prediction with Causal User Modeling

Neural Information Processing Systems, Aug-15-2025, 03:30:26 GMT

In this paper, we propose a learning framework for relevance prediction that is robust to changes in the data distribution.

graph, information, regularization, (15 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.94)

Industry: Media > Film (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

BiT: Robustly Binarized Multi-distilled Transformer

Neural Information Processing Systems, Aug-15-2025, 03:07:39 GMT

Inspired by the learnable bias proposed in ReActNet (Liu et al., 2020), we further propose elastic [...] In contrast to Bi-Attention proposed in BiBERT (Qin et al., 2021) that removes [...] We conduct meticulous experiments to compare these choices. The binary convolution between the weights and activations that are both binarized to {-1, 1} (i.e. [...] The GLUE benchmark (Wang et al., 2019) includes the following datasets: MNLI: Multi-Genre Natural Language Inference is an entailment classification task (Williams et al., [...] QQP: Quora Question Pairs is a paraphrase detection task. QNLI: Question Natural Language Inference (Wang et al., 2019) is a binary classification task [...] STS-B: The Semantic Textual Similarity Benchmark is a sentence pair classification task. The sentence pairs are sourced from online news sources (Dolan & Brockett, 2005).

activation, distillation, robustly binarized multi-distilled transformer, (13 more...)

Industry: Media (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Echoes of Automation: The Increasing Use of LLMs in Newsmaking

Ansari, Abolfazl, Zhang, Delvin Ce, Tripto, Nafis Irtiza, Lee, Dongwon

arXiv.org Artificial Intelligence, Aug-15-2025

The rapid rise of Generative AI (GenAI), particularly LLMs, poses concerns for journalistic integrity and authorship. This study examines AI-generated content across more than 40,000 news articles from major, local, and college news media, in various media formats. Using three advanced AI-text detectors (e.g., Binoculars, Fast-Detect GPT, and GPTZero), we find a substantial increase in GenAI use in recent years, especially in local and college news. Sentence-level analysis reveals that LLMs are often used in the introduction of news articles, while conclusions are usually written manually. Linguistic analysis shows GenAI boosts word richness and readability but lowers formality, leading to more uniform writing styles, particularly in local media.

large language model, machine learning, natural language, (21 more...)

arXiv: 2508.06445

Country:

North America > United States > Pennsylvania (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Geneva > Geneva (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report > Experimental Study > Negative Result (0.47)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)

ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

Li, Lingen, Wang, Guangzhi, Zhang, Zhaoyang, Li, Yaowei, Li, Xiaoyu, Dou, Qi, Gu, Jinwei, Xue, Tianfan, Shan, Ying

arXiv.org Artificial Intelligence, Aug-15-2025

Traditional cartoon and anime production involves keyframing, inbetweening, and colorization stages, which require intensive manual effort. Despite recent advances in AI, existing methods often handle these stages separately, leading to error accumulation and artifacts. For instance, inbetweening approaches struggle with large motions, while colorization methods require dense per-frame sketches. To address this, we introduce ToonComposer, a generative model that unifies inbetweening and colorization into a single post-keyframing stage. ToonComposer employs a sparse sketch injection mechanism to provide precise control using keyframe sketches. Additionally, it uses a cartoon adaptation method with the spatial low-rank adapter to tailor a modern video foundation model to the cartoon domain while keeping its temporal prior intact. Requiring as few as a single sketch and a colored reference frame, ToonComposer excels with sparse inputs, while also supporting multiple sketches at any temporal location for more precise motion control. This dual capability reduces manual workload and improves flexibility, empowering artists in real-world scenarios. To evaluate our model, we further created PKBench, a benchmark featuring human-drawn sketches that simulate real-world use cases. Our evaluation demonstrates that ToonComposer outperforms existing methods in visual quality, motion consistency, and production efficiency, offering a superior and more flexible solution for AI-assisted cartoon production.

machine learning, natural language, sketch, (18 more...)

arXiv: 2508.10881

Genre: Research Report (1.00)

Industry:

Media (0.68)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Vision (0.68)

SSRL: Self-Search Reinforcement Learning

Fan, Yuchen, Zhang, Kaiyan, Zhou, Heng, Zuo, Yuxin, Chen, Yanxu, Fu, Yu, Long, Xinwei, Zhu, Xuekai, Jiang, Che, Zhang, Yuchen, Kang, Li, Chen, Gang, Huang, Cheng, He, Zhizhou, Wang, Bingning, Bai, Lei, Ding, Ning, Zhou, Bowen

arXiv.org Artificial Intelligence, Aug-15-2025

We investigate the potential of large language models (LLMs) to serve as efficient simulators for agentic search tasks in reinforcement learning (RL), thereby reducing dependence on costly interactions with external search engines. To this end, we first quantify the intrinsic search capability of LLMs via structured prompting and repeated sampling, which we term Self-Search. Our results reveal that LLMs exhibit strong scaling behavior with respect to the inference budget, achieving high pass@k on question-answering benchmarks, including the challenging BrowseComp task. Building on these observations, we introduce Self-Search RL (SSRL), which enhances LLMs' Self-Search capability through format-based and rule-based rewards. SSRL enables models to iteratively refine their knowledge utilization internally, without requiring access to external tools. Empirical evaluations demonstrate that SSRL-trained policy models provide a cost-effective and stable environment for search-driven RL training, reducing reliance on external search engines and facilitating robust sim-to-real transfer. We draw the following conclusions: 1) LLMs possess world knowledge that can be effectively elicited to achieve high performance; 2) SSRL demonstrates the potential of leveraging internal knowledge to reduce hallucination; 3) SSRL-trained models integrate seamlessly with external search engines without additional effort. Our findings highlight the potential of LLMs to support more scalable RL agent training.
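The pass@k figure mentioned in this abstract can be made concrete with a small sketch. The abstract does not say how the authors estimate it, so the code below assumes the widely used unbiased estimator 1 - C(n-c, k) / C(n, k), where n answers are sampled per question and c of them are correct. The function name `pass_at_k` and the sample counts are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: total samples drawn for a question
    c: number of those samples that were correct
    k: sampling budget being evaluated (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples are wrong, every size-k subset contains a hit.
    if n - c < k:
        return 1.0
    # 1 minus the fraction of size-k subsets made up only of wrong samples.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical counts: 2 correct answers out of 16 samples.
for budget in (1, 4, 16):
    print(budget, round(pass_at_k(n=16, c=2, k=budget), 3))
```

Under this estimator the score can only grow as the budget k grows, which is the kind of scaling with inference budget the abstract refers to.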

information, large language model, machine learning, (18 more...)

arXiv: 2508.10874

Country:

Europe (1.00)
Asia (0.67)
North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports > Football (1.00)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Modeling Human Responses to Multimodal AI Content

Shen, Zhiqi, Fan, Shaojing, Xu, Danni, Sim, Terence, Kankanhalli, Mohan

arXiv.org Artificial Intelligence, Aug-15-2025

As AI-generated content becomes widespread, so does the risk of misinformation. While prior research has primarily focused on identifying whether content is authentic, much less is known about how such content influences human perception and behavior. In domains like trading or the stock market, predicting how people react (e.g., whether a news post will go viral) can be more critical than verifying its factual accuracy. To address this, we take a human-centered approach and introduce the MhAIM Dataset, which contains 154,552 online posts (111,153 of them AI-generated), enabling large-scale analysis of how people respond to AI-generated content. Our human study reveals that people are better at identifying AI content when posts include both text and visuals, particularly when inconsistencies exist between the two. We propose three new metrics (trustworthiness, impact, and openness) to quantify how users judge and engage with online content. We present T-Lens, an LLM-based agent system designed to answer user queries by incorporating predicted human responses to multimodal information. At its core is HR-MCP (Human Response Model Context Protocol), built on the standardized Model Context Protocol (MCP), enabling seamless integration with any LLM. This integration allows T-Lens to better align with human reactions, enhancing both interpretability and interaction capabilities. Our work provides empirical insights and practical tools to equip LLMs with human-awareness capabilities. By highlighting the complex interplay among AI, human cognition, and information reception, our findings suggest actionable strategies for mitigating the risks of AI-driven misinformation.

large language model, machine learning, natural language, (20 more...)

arXiv: 2508.10769

Country:

Europe (0.46)
Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Information Technology > Security & Privacy (0.94)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

DiFaR: Enhancing Multimodal Misinformation Detection with Diverse, Factual, and Relevant Rationales

Wan, Herun, Wu, Jiaying, Luo, Minnan, Kong, Xiangzheng, Ma, Zihan, Zeng, Zhi

arXiv.org Artificial Intelligence, Aug-15-2025

Generating textual rationales from large vision-language models (LVLMs) to support trainable multimodal misinformation detectors has emerged as a promising paradigm. However, its effectiveness is fundamentally limited by three core challenges: (i) insufficient diversity in generated rationales, (ii) factual inaccuracies due to hallucinations, and (iii) irrelevant or conflicting content that introduces noise. We introduce DiFaR, a detector-agnostic framework that produces diverse, factual, and relevant rationales to enhance misinformation detection. DiFaR employs five chain-of-thought prompts to elicit varied reasoning traces from LVLMs and incorporates a lightweight post-hoc filtering module to select rationale sentences based on sentence-level factuality and relevance scores. Extensive experiments on four popular benchmarks demonstrate that DiFaR outperforms four baseline categories by up to 5.9% and boosts existing detectors by as much as 8.7%. Both automatic metrics and human evaluations confirm that DiFaR significantly improves rationale quality across all three dimensions.

large language model, machine learning, natural language, (18 more...)

arXiv: 2508.10444

Country: Africa (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

MCP2OSC: Parametric Control by Natural Language

Fan, Yuan-Yi

arXiv.org Artificial Intelligence, Aug-15-2025

Text prompts enable intuitive content creation but may fall short in achieving high precision for intricate tasks; knob or slider controls offer precise adjustments at the cost of increased complexity. To address the gap between knobs and prompts, a new MCP (Model Context Protocol) server and a unique set of prompt design criteria are presented to enable exploring parametric OSC (OpenSoundControl) control by natural language prompts. Demonstrated by 14 practical QA examples with best practices and the generalized prompt templates, this study finds Claude integrated with the MCP2OSC server effective in generating OSC messages by natural language, interpreting, searching, and visualizing OSC messages, validating and debugging OSC messages, and managing OSC address patterns. MCP2OSC enhances human-machine collaboration by leveraging an LLM (Large Language Model) to handle intricate OSC development tasks, and by empowering human creativity with an intuitive language interface featuring flexible precision controls: a prompt-based OSC tool. This study provides a novel perspective on the creative MCP application at the network protocol level by utilizing the LLM's strength in directly processing and generating human-readable OSC messages. The results suggest its potential as an LLM-based universal control mechanism for multimedia devices.
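To make the OSC messages discussed in this abstract concrete, here is a minimal sketch of how a single-float message is laid out on the wire under the OSC 1.0 encoding rules: null-terminated ASCII strings padded to a multiple of four bytes, followed by a big-endian 32-bit float. This is not the MCP2OSC server's code. The helper names and the address `/synth/cutoff` are hypothetical.

```python
import struct


def _osc_string(text: str) -> bytes:
    """Encode an OSC string: ASCII, null-terminated, padded to 4-byte multiple."""
    data = text.encode("ascii") + b"\x00"
    padding = (-len(data)) % 4
    return data + b"\x00" * padding


def encode_osc_float(address: str, value: float) -> bytes:
    """Build an OSC message carrying one float32 argument."""
    if not address.startswith("/"):
        raise ValueError("an OSC address pattern must start with '/'")
    type_tags = ",f"  # one float32 argument
    return (
        _osc_string(address)
        + _osc_string(type_tags)
        + struct.pack(">f", value)  # big-endian IEEE 754 float32
    )


# Hypothetical parameter: set a synth filter cutoff to 0.5.
packet = encode_osc_float("/synth/cutoff", 0.5)
print(len(packet))  # 24 bytes: 16 (address) + 4 (type tags) + 4 (float)
```

A packet like this would typically be sent to a device over UDP. Turning a natural language request into the address and value is the part the abstract assigns to the LLM.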

large language model, natural language, osc message, (14 more...)

arXiv: 2508.10414

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)

Industry:

Media > Music (0.47)
Leisure & Entertainment (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Large Language Models for Summarizing Czech Historical Documents and Beyond

Tran, Václav, Šmíd, Jakub, Martínek, Jiří, Lenc, Ladislav, Král, Pavel

arXiv.org Artificial Intelligence, Aug-15-2025

Text summarization is the task of shortening a larger body of text into a concise version while retaining its essential meaning and key information. While summarization has been significantly explored in English and other high-resource languages, Czech text summarization, particularly for historical documents, remains underexplored due to linguistic complexities and a scarcity of annotated datasets. Large language models such as Mistral and mT5 have demonstrated excellent results on many natural language processing tasks and languages. Therefore, we employ these models for Czech summarization, resulting in two key contributions: (1) achieving new state-of-the-art results on the modern Czech summarization dataset SumeCzech using these advanced models, and (2) introducing a novel dataset called Posel od Čerchova for summarization of historical Czech documents with baseline results. Together, these contributions provide great potential for advancing Czech text summarization and open new avenues for research in Czech historical text processing.

large language model, machine learning, natural language, (17 more...)

doi: 10.5220/0013374100003890

arXiv: 2508.10368

Country:

Europe (0.94)
North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Facilitating Longitudinal Interaction Studies of AI Systems

Long, Tao, Wang, Sitong, Fabre, Émilie, Wang, Tony, Sathya, Anup, Wu, Jason, Petridis, Savvas, Li, Dingzeyu, Chakrabarty, Tuhin, Jiang, Yue, Li, Jingyi, Tseng, Tiffany, Nakagaki, Ken, Yang, Qian, Martelaro, Nikolas, Nickerson, Jeffrey V., Chilton, Lydia B.

arXiv.org Artificial Intelligence, Aug-15-2025

UIST researchers develop tools to address user challenges. However, user interactions with AI evolve over time through learning, adaptation, and repurposing, making one-time evaluations insufficient. Capturing these dynamics requires longer-term studies, but challenges in deployment, evaluation design, and data collection have made such longitudinal research difficult to implement. Our workshop aims to tackle these challenges and prepare researchers with practical strategies for longitudinal studies. The workshop includes a keynote, panel discussions, and interactive breakout groups for discussion and hands-on protocol design and tool prototyping sessions. We seek to foster a community around longitudinal system research and promote it as a more embraced method for designing, building, and evaluating UIST tools.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv: 2508.10252

Country:

Europe (1.00)
North America > United States > California (0.46)
North America > United States > New York > New York County > New York City (0.19)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)

Genre: Research Report (1.00)

Industry:

Information Technology (0.93)
Education (0.93)
Health & Medicine (0.70)
Media > News (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Human Computer Interaction > Interfaces (0.68)