 Generative AI


Fantastic Copyrighted Beasts and How (Not) to Generate Them

arXiv.org Artificial Intelligence

Recent studies show that image and video generation models can be prompted to reproduce copyrighted content from their training data, raising serious legal concerns around copyright infringement. Copyrighted characters, in particular, pose a difficult challenge for image generation services, with at least one lawsuit already awarding damages based on the generation of these characters. Yet, little research has empirically examined this issue. We conduct a systematic evaluation to fill this gap. First, we build CopyCat, an evaluation suite consisting of diverse copyrighted characters and a novel evaluation pipeline. Our evaluation considers both the detection of similarity to copyrighted characters and the generated image's consistency with user input. Our evaluation systematically shows that both image and video generation models can still generate characters even if the characters' names are not explicitly mentioned in the prompt, sometimes with only two generic keywords (e.g., prompting with "videogame, plumber" consistently generates Nintendo's Mario character). We then introduce techniques to semi-automatically identify such keywords or descriptions that trigger character generation. Using our evaluation suite, we study runtime mitigation strategies, including both existing methods and new strategies we propose. Our findings reveal that commonly employed strategies, such as prompt rewriting in the DALL-E system, are not sufficient as standalone guardrails. These strategies must be coupled with other approaches, like negative prompting, to effectively reduce the unintended generation of copyrighted characters. Our work provides empirical grounding to the discussion of copyright mitigation strategies and offers actionable insights for model deployers actively implementing them.


SPL: A Socratic Playground for Learning Powered by Large Language Model

arXiv.org Artificial Intelligence

Dialogue-based Intelligent Tutoring Systems (ITSs) have significantly advanced adaptive and personalized learning by automating sophisticated human tutoring strategies within interactive dialogues. However, replicating the nuanced patterns of expert human communication remains a challenge in Natural Language Processing (NLP). Recent advancements in NLP, particularly Large Language Models (LLMs) such as OpenAI's GPT-4, offer promising solutions by providing human-like and context-aware responses based on extensive pre-trained knowledge. Motivated by the effectiveness of LLMs in various educational tasks (e.g., content creation and summarization, problem-solving, and automated feedback provision), our study introduces the Socratic Playground for Learning (SPL), a dialogue-based ITS powered by the GPT-4 model, which employs the Socratic teaching method to foster critical thinking among learners. Through extensive prompt engineering, SPL can generate specific learning scenarios and facilitate efficient multi-turn tutoring dialogues. The SPL system aims to enhance personalized and adaptive learning experiences tailored to individual needs, specifically focusing on improving critical thinking skills. Our pilot experimental results from essay writing tasks demonstrate that SPL has the potential to improve tutoring interactions and further enhance dialogue-based ITS functionalities. Our study, exemplified by SPL, demonstrates how LLMs enhance dialogue-based ITSs and expand the accessibility and efficacy of educational technologies.


Former OpenAI Chief Scientist Announces New Safety-Focused Company

TIME - Tech

Ilya Sutskever, a co-founder and former chief scientist of OpenAI, announced on Wednesday that he's launching a new venture dubbed Safe Superintelligence Inc. Sutskever said on X that the new lab will focus solely on building a safe "superintelligence"--an industry term for a hypothetical system that's smarter than humans. Sutskever is joined at Safe Superintelligence Inc. by co-founders Daniel Gross, an investor and engineer who worked on AI at Apple until 2017, and Daniel Levy, another former OpenAI employee. The new American-based firm will have offices in Palo Alto, Calif., and Tel Aviv, according to a description Sutskever shared. "I am starting a new company: https://t.co/BG3K3SI3A1," he wrote on X. Sutskever was one of OpenAI's founding members, and was chief scientist during the company's meteoric rise following the release of ChatGPT.


Adobe Says It Won't Train AI Using Artists' Work. Creatives Aren't Convinced

WIRED

When users first found out about Adobe's new terms of service (which were quietly updated in February), there was an uproar. Adobe told users it could access their content "through both automated and manual methods" and use "techniques such as machine learning in order to improve [Adobe's] Services and Software." Many understood the update as the company forcing users to grant unlimited access to their work, for purposes of training Adobe's generative AI: Firefly. Late on Tuesday, Adobe issued a clarification: In an updated version of its terms of service agreement, it pledged not to train AI on its user content stored locally or in the cloud and gave users the option to opt out of content analytics. Caught in the crossfire of intellectual property lawsuits, the ambiguous language previously used to update the terms shed light on a climate of acute skepticism among artists, many of whom over-rely on Adobe for their work.


Scientists Develop New Algorithm to Spot AI 'Hallucinations'

TIME - Tech

An enduring problem with today's generative artificial intelligence (AI) tools, like ChatGPT, is that they often confidently assert false information. Computer scientists call this behavior "hallucination," and it's a key barrier to AI's usefulness. Hallucinations have led to some embarrassing public slip-ups. In February, Air Canada was forced by a tribunal to honor a discount that its customer-support chatbot had mistakenly offered to a passenger. In May, Google was forced to make changes to its new "AI overviews" search feature, after the bot told some users that it was safe to eat rocks. And last June, two lawyers were fined $5,000 by a U.S. judge after one of them admitted he had used ChatGPT to help write a court filing.


The Download: video-generating AI, and Meta's voice cloning watermarks

MIT Technology Review

You may not be familiar with Kuaishou, but this Chinese company just hit a major milestone: It's released the first ever text-to-video generative AI model that's freely available for the public to test. The short-video platform, which has over 600 million active users, announced the new tool, called Kling, on June 6. Like OpenAI's Sora model, Kling is able to generate videos up to two minutes long from prompts. But unlike Sora, which still remains inaccessible to the public four months after OpenAI debuted it, Kling has already started letting people try the model themselves. Zeyi Yang, our China reporter, has been putting it through its paces.


California lawmakers are trying to regulate AI before it's too late. Here's how

Los Angeles Times

For four years, Jacob Hilton worked for one of the most influential startups in the Bay Area -- OpenAI. His research helped test and improve the truthfulness of AI models such as ChatGPT. He believes artificial intelligence can benefit society, but he also recognizes the serious risks if the technology is left unchecked. Hilton was among 13 current and former OpenAI and Google employees who this month signed an open letter that called for more whistleblower protections, citing broad confidentiality agreements as problematic. "The basic situation is that employees, the people closest to the technology, they're also the ones with the most to lose from being retaliated against for speaking up," says Hilton, 33, now a researcher at the nonprofit Alignment Research Center, who lives in Berkeley.


Apple Is Bringing A.I. to Your Personal Life, Like It or Not

The New Yorker

Last week, Apple held its Worldwide Developers Conference, the annual event that is often used to showcase the company's most significant innovations. Much of the presentation this year was devoted to A.I., or, as the company is branding it, Apple Intelligence. Whereas Google and Microsoft have leaped headlong into A.I. with their Gemini and OpenAI products, respectively, Apple is so far taking a narrower approach. The A.I. model it is unveiling on iPhone hardware is relatively weak. A.I. models are measured on their number of "parameters," or the variables adjusted during the training process; while OpenAI's GPT-4 has more than one and a half trillion parameters, Apple's model has three billion.


I tested out a buzzy new text-to-video AI model from China

MIT Technology Review

The short-video platform, which has over 600 million active users, announced the new tool on June 6. Like OpenAI's Sora model, Kling is able to generate videos "up to two minutes long with a frame rate of 30fps and video resolution up to 1080p," the company says on its website. But unlike Sora, which still remains inaccessible to the public four months after OpenAI trialed it, Kling soon started letting people try the model themselves. I got access to it after downloading Kuaishou's video-editing tool, signing up with a Chinese number, getting on a waitlist, and filling out an additional form through Kuaishou's user feedback groups. The model can't process prompts written entirely in English, but you can get around that by either translating the phrase you want to use into Chinese or including one or two Chinese words.


Combinatorial Reasoning: Selecting Reasons in Generative AI Pipelines via Combinatorial Optimization

arXiv.org Artificial Intelligence

Recent Large Language Models (LLMs) have demonstrated impressive capabilities at tasks that require human intelligence and are a significant step towards human-like artificial intelligence (AI). Yet the performance of LLMs at reasoning tasks has been subpar, and the reasoning capability of LLMs is a matter of significant debate. While it has been shown that the choice of prompting technique can alter an LLM's performance on a multitude of tasks, including reasoning, the best-performing techniques require human-made prompts written with knowledge of the tasks at hand. We introduce a framework for what we call Combinatorial Reasoning (CR), a fully-automated prompting method, where reasons are sampled from an LLM pipeline and mapped into a Quadratic Unconstrained Binary Optimization (QUBO) problem. The framework investigates whether QUBO solutions can be profitably used to select a useful subset of the reasons to construct a Chain-of-Thought style prompt. We explore the acceleration of CR with specialized solvers. We also investigate the performance of simpler zero-shot strategies such as linear majority rule or random selection of reasons. Our preliminary study indicates that coupling a combinatorial solver to generative AI pipelines is an interesting avenue for AI reasoning and elucidates design principles for future CR methods.
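
To illustrate the idea of selecting reasons through a QUBO, here is a minimal Python sketch. It is not the paper's method: the abstract does not say how reasons are mapped to QUBO coefficients, so the matrix `Q`, the reason labels, and the helper `solve_qubo` are all hypothetical. Negative diagonal entries stand for a reward for keeping a reason, and a positive off-diagonal entry stands for a penalty on keeping two redundant reasons together. Exhaustive search replaces the specialized solvers mentioned above and is only practical for a handful of variables.

```python
from itertools import product


def solve_qubo(Q):
    """Minimise sum_ij Q[i][j] * x[i] * x[j] over binary vectors x by exhaustive search."""
    n = len(Q)
    best_x, best_energy = None, float("inf")
    for x in product((0, 1), repeat=n):
        energy = sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
        if energy < best_energy:
            best_x, best_energy = x, energy
    return best_x, best_energy


# Hypothetical sampled reasons (placeholders, not taken from the paper).
reasons = ["reason A", "reason B (overlaps with A)", "reason C"]

# Hypothetical coefficients in upper-triangular form:
#   diagonal     = reward (negative) for selecting a reason
#   off-diagonal = penalty (positive) for selecting both reasons of a pair
Q = [
    [-3, 3, 0],
    [0, -2, 0],
    [0, 0, -1],
]

best_x, best_energy = solve_qubo(Q)

# Keep only the reasons whose binary variable is 1 in the minimiser.
selected = [reason for reason, bit in zip(reasons, best_x) if bit]
print(selected, best_energy)
```

With these made-up coefficients the minimiser keeps reasons A and C and drops B, because the redundancy penalty between A and B outweighs B's reward. The selected reasons could then be concatenated into a Chain-of-Thought style prompt.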