 Media


Unfair Learning: GenAI Exceptionalism and Copyright Law

arXiv.org Artificial Intelligence

It examines fair use legal arguments and eight distinct substantive arguments, contending that every legal and substantive argument favoring fair use for GenAI applies equally, if not more so, to humans. Therefore, granting GenAI exceptional privileges in this domain is legally and logically inconsistent with withholding broad fair use exemptions from individual humans.


An AI Image Generator's Exposed Database Reveals What People Really Used It For

WIRED

Tens of thousands of explicit AI-generated images, including AI-generated child sexual abuse material, were left open and accessible to anyone on the internet, according to new research seen by WIRED. An open database belonging to an AI image-generation firm contained more than 95,000 records, including some prompt data and images of celebrities such as Ariana Grande, the Kardashians, and Beyoncé de-aged to look like children. The exposed database, which was discovered by security researcher Jeremiah Fowler, who shared details of the leak with WIRED, is linked to South Korea–based website GenNomis. The website and its parent company, AI-Nomis, hosted a number of image generation and chatbot tools for people to use. More than 45 GB of data, mostly made up of AI images, was left in the open.


AI was enemy No. 1 during Hollywood strikes. Now it's in Oscar-winning films

BBC News

AI may be a dirty word in Hollywood, but Mr Mooser says their version of the technology is "clean." "Artists should be at the table," he says, adding that it's better to build the tool for filmmakers rather than get "rolled over by big tech companies". Artificial Intelligence has long been depicted as a villain in Hollywood. In "The Terminator," AI used by the US military decides it must destroy everyone on Earth. But it's AI's creators, and not the technology itself, that have received the brunt of real-life criticism.


BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models

arXiv.org Artificial Intelligence

In this research, we introduce BEATS, a novel framework for evaluating Bias, Ethics, Fairness, and Factuality in Large Language Models (LLMs). Building upon the BEATS framework, we present a bias benchmark for LLMs that measures performance across 29 distinct metrics. These metrics span a broad range of characteristics, including demographic, cognitive, and social biases, as well as measures of ethical reasoning, group fairness, and factuality-related misinformation risk. These metrics enable a quantitative assessment of the extent to which LLM-generated responses may perpetuate societal prejudices that reinforce or expand systemic inequities. To achieve a high score on this benchmark, an LLM must show very equitable behavior in its responses, making it a rigorous standard for responsible AI evaluation. Empirical results based on data from our experiment show that 37.65% of outputs generated by industry-leading models contained some form of bias, highlighting a substantial risk of using these models in critical decision-making systems. The BEATS framework and benchmark offer a scalable and statistically rigorous methodology to benchmark LLMs, diagnose factors driving biases, and develop mitigation strategies. With the BEATS framework, our goal is to help the development of more socially responsible and ethically aligned AI models.


Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

arXiv.org Artificial Intelligence

To address the bottleneck of accurate user intent interpretation within the current video generation community, we present Any2Caption, a novel framework for controllable video generation under any condition. The key idea is to decouple various condition interpretation steps from the video synthesis step. By leveraging modern multimodal large language models (MLLMs), Any2Caption interprets diverse inputs--text, images, videos, and specialized cues such as region, motion, and camera poses--into dense, structured captions that provide backbone video generators with better guidance. We also introduce Any2CapIns, a large-scale dataset with 337K instances and 407K conditions for any-condition-to-caption instruction tuning. Comprehensive evaluations demonstrate significant improvements of our system in controllability and video quality across various aspects of existing video generation models. Project Page: https://sqwu.top/Any2Cap/


Better wit than wealth: Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement

arXiv.org Artificial Intelligence

Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving relevant documents from external sources and incorporating them into the context. While it improves reliability by providing factual texts, it significantly increases inference costs as context length grows and introduces the challenging issue of RAG hallucination, primarily caused by the lack of corresponding parametric knowledge in LLMs. An efficient solution is to enhance the knowledge of LLMs at test-time. Parametric RAG (PRAG) addresses this by embedding documents into LLM parameters to perform test-time knowledge enhancement, effectively reducing inference costs through offline training. However, its high training and storage costs, along with limited generalization ability, significantly restrict its practical adoption. To address these challenges, we propose Dynamic Parametric RAG (DyPRAG), a novel framework that leverages a lightweight parameter translator model to efficiently convert documents into parametric knowledge. DyPRAG not only reduces inference, training, and storage costs but also dynamically generates parametric knowledge, seamlessly enhancing the knowledge of LLMs and resolving knowledge conflicts in a plug-and-play manner at test-time. Extensive experiments on multiple datasets demonstrate the effectiveness and generalization capabilities of DyPRAG, offering a powerful and practical RAG paradigm which enables superior knowledge fusion and mitigates RAG hallucination in real-world applications. Our code is available at https://github.com/Trae1ounG/DyPRAG.


Training-Free Text-Guided Image Editing with Visual Autoregressive Model

arXiv.org Artificial Intelligence

Text-guided image editing is an essential task that enables users to modify images through natural language descriptions. Recent advances in diffusion models and rectified flows have significantly improved editing quality, primarily relying on inversion techniques to extract structured noise from input images. However, inaccuracies in inversion can propagate errors, leading to unintended modifications and compromising fidelity. Moreover, even with perfect inversion, the entanglement between textual prompts and image features often results in global changes when only local edits are intended. To address these challenges, we propose a novel text-guided image editing framework based on VAR (Visual AutoRegressive modeling), which eliminates the need for explicit inversion while ensuring precise and controlled modifications. Our method introduces a caching mechanism that stores token indices and probability distributions from the original image, capturing the relationship between the source prompt and the image. Using this cache, we design an adaptive fine-grained masking strategy that dynamically identifies and constrains modifications to relevant regions, preventing unintended changes. A token reassembling approach further refines the editing process, enhancing diversity, fidelity, and control. Our framework operates in a training-free manner and achieves high-fidelity editing with faster inference speeds, processing a 1K resolution image in as fast as 1.2 seconds. Extensive experiments demonstrate that our method achieves performance comparable to, or even surpassing, existing diffusion- and rectified flow-based approaches in both quantitative metrics and visual quality. The code will be released.


AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents

arXiv.org Artificial Intelligence

As AI technology advances, it is driving innovation across industries, increasing the demand for scalable AI project deployment. However, deployment remains a critical challenge due to complex environment configurations, dependency conflicts, cross-platform adaptation, and debugging difficulties, which hinder automation and adoption. This paper introduces AI2Agent, an end-to-end framework that automates AI project deployment through guideline-driven execution, self-adaptive debugging, and case & solution accumulation. AI2Agent dynamically analyzes deployment challenges, learns from past cases, and iteratively refines its approach, significantly reducing human intervention. To evaluate its effectiveness, we conducted experiments on 30 AI deployment cases, covering TTS, text-to-image generation, image editing, and other AI applications. Results show that AI2Agent significantly reduces deployment time and improves success rates. The code and demo video are now publicly accessible.


A Minecraft Movie and Stranger Things star's new album: What's coming up this week

BBC News

The attention put a rocket under his music career. While his first two albums were DIY affairs, recorded in a couple of days and self-released, his latest, The Crux, was created in New York's fabled Electric Lady studios. Released on Friday, it's packed full of off-kilter lyrics and squiggly synth lines that burrow into your brain. The first two singles, Delete Ya and Basic Being Basic have already been radio hits, and the rest of the album pulls on influences as diverse as Electric Light Orchestra, New Order, Cake, Hall & Oates and Bruce Springsteen (coincidentally, all bands that would work perfectly on the Stranger Things soundtrack). There are a couple of knockouts – including the crunchy garage rock of Gap Toothed Smile, and the choppy New Wave anthem Link – but the point of the album is its diversity.


Scientists say time travel IS possible - and people have already done it

Daily Mail - Science & tech

From H. G. Wells's The Time Machine to Christopher Nolan's Interstellar, the possibility of travelling through time has fascinated people for centuries. But, although it sounds like pure science fiction, physicists now believe that time travel really is possible. In fact, scientists say that people have already done it. But, before you start to plan your trip to ancient Rome, the experts caution that real time travel is nothing like what you see in the movies. It might seem obvious, but here on Earth, we all move through time at a speed of one second per second.