Collaborating Authors

Generative AI


A Tale of Tails: Model Collapse as a Change of Scaling Laws

arXiv.org Artificial Intelligence

As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet, the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increased amounts of synthesized data. In this paper we ask: How will the scaling laws change in the inevitable regime where synthetic data makes its way into the training corpus? Will future models still improve, or be doomed to degenerate up to total (model) collapse? We develop a theoretical framework of model collapse through the lens of scaling laws. We discover a wide range of decay phenomena, analyzing loss of scaling, shifted scaling with number of generations, the "un-learning" of skills, and grokking when mixing human and synthesized data. Our theory is validated by large-scale experiments with a transformer on an arithmetic task and text generation using the large language model Llama2.


Coordinated Disclosure for AI: Beyond Security Vulnerabilities

arXiv.org Artificial Intelligence

This legal action ignited a heated debate, contributing to a growing series of lawsuits against AI providers [9-11, 54]. This incident underscores the inadequacy of current AI harm reporting mechanisms, leaving small harmed parties with limited recourse unless backed by substantial legal support or media awareness, despite the recognized potential for improving AI systems by exposing issues [78]. Current AI accountability initiatives primarily rely on periodic audits, emphasizing repetitive assessments but lacking a structured reporting framework for user-identified issues post-deployment. This audit-centric paradigm is reflected in influential policies such as the U.S. Executive Order on AI [93], the EU's draft AI Act [43], and New York City's Local Law 144 [69]. However, this approach falls short when compared to the more comprehensive Coordinated Vulnerability Disclosure (CVD) processes standard in software security. CVD plays a crucial role as a mechanism for independent researchers to report newly identified vulnerabilities to affected vendors and the public [58]. This process enables transparent remediation before potential exploitation by malicious actors and has become a vital practice enshrined in government regulations and industry standards. Notably, the FDA mandates the implementation of CVD programs for medical device companies to enhance cybersecurity [96]. While CVD has demonstrated effectiveness in traditional software security, its direct application to machine learning (ML) systems faces unique challenges.


Who makes money when AI reads the internet for us?

Engadget

Last week, The Browser Company, a startup that makes the Arc web browser, released a slick new iPhone app called Arc Search. Instead of displaying links, its brand new "Browse for Me" feature reads the first handful of pages and summarizes them into a single, custom-built, Arc-formatted web page using large language models from OpenAI and others. If a user does click through to any of the actual pages, Arc Search blocks ads, cookies and trackers by default. Arc's efforts to reimagine web browsing have received near-universal acclaim. But over the last few days, "Browse for Me" earned The Browser Company its first online backlash.


The Download: how to improve pulse oximeters, and OpenAI's chip plans

MIT Technology Review

Visit any health-care facility, and one of the first things they'll do is clip a pulse oximeter to your finger. These devices, which track heart rate and blood oxygen, offer vital information about a person's health. For people with dark skin, pulse oximeters can overestimate just how much oxygen their blood is carrying. That means that a person with dangerously low oxygen levels might seem, according to the pulse oximeter, fine. The US Food and Drug Administration is still trying to figure out what to do about this problem. Last week, an FDA advisory committee met to mull over better ways to evaluate the performance of these devices in people with a variety of skin tones.


OpenAI's Sam Altman seeking trillions to fund chips for AI, report says

Al Jazeera

OpenAI CEO Sam Altman is seeking to raise trillions of dollars from investors, including the United Arab Emirates government, to boost the world's capacity to produce advanced chips and power artificial intelligence, The Wall Street Journal has reported. Altman's "wildly ambitious tech initiative" could require raising as much as $7 trillion, the WSJ reported on Thursday, quoting people familiar with the matter. As part of his pitch to investors, Altman has proposed building dozens of chip foundries that would then be run by existing chip makers, such as Taiwan Semiconductor Manufacturing Company (TSMC), the Journal said. The plans aim to solve obstacles to OpenAI's growth, including a scarcity of chips that power AI models such as ChatGPT, according to the WSJ, which described the sums being sought as "outlandishly large by the standards of corporate fundraising". Altman's plans have so far seen him hold meetings with senior UAE officials, TSMC executives, US Secretary of Commerce Gina Raimondo and SoftBank's chief executive Masayoshi Son, according to the report.


Scalable Interactive Machine Learning for Future Command and Control

arXiv.org Artificial Intelligence

Future warfare will require Command and Control (C2) personnel to make decisions at shrinking timescales in complex and potentially ill-defined situations. Given the need for robust decision-making processes and decision-support tools, integration of artificial and human intelligence holds the potential to revolutionize the C2 operations process to ensure adaptability and efficiency in rapidly changing operational environments. We propose to leverage recent promising breakthroughs in interactive machine learning, in which humans can cooperate with machine learning algorithms to guide the algorithms' behavior. This paper identifies several gaps in state-of-the-art science and technology that future work should address to extend these approaches to function in complex C2 contexts. In particular, we describe three research focus areas that together aim to enable scalable interactive machine learning (SIML): 1) developing human-AI interaction algorithms to enable planning in complex, dynamic situations; 2) fostering resilient human-AI teams through optimizing roles, configurations, and trust; and 3) scaling algorithms and human-AI teams for flexibility across a range of potential contexts and situations.


"When He Feels Cold, He Goes to the Seahorse"-Blending Generative AI into Multimaterial Storymaking for Family Expressive Arts Therapy

arXiv.org Artificial Intelligence

Storymaking, as an integrative form of expressive arts therapy, is an effective means to foster family communication. Yet, the integration of generative AI as expressive materials in therapeutic storymaking remains underexplored, and HCI implications for supporting families and therapists in this context are lacking. Addressing this, our study involved five weeks of storymaking sessions with seven families guided by a professional therapist. In these sessions, the families used both traditional art-making materials and image-based generative AI to create and evolve their family stories. Via the rich empirical data and commentaries from four expert therapists, we contextualize how families creatively melded AI and traditional expressive materials to externalize their ideas and feelings. Through the lens of Expressive Therapies Continuum (ETC), we characterize the therapeutic implications of AI as expressive materials. Desirable interaction qualities to support children, parents, and therapists are distilled for future HCI research.


Trust the Process: Zero-Knowledge Machine Learning to Enhance Trust in Generative AI Interactions

arXiv.org Artificial Intelligence

Generative AI, exemplified by models like transformers, has opened up new possibilities in various domains but also raised concerns about fairness, transparency and reliability, especially in fields like medicine and law. This paper emphasizes the urgency of ensuring fairness and quality in these domains through generative AI. It explores using cryptographic techniques, particularly Zero-Knowledge Proofs (ZKPs), to address concerns regarding performance fairness and accuracy while protecting model privacy. Applying ZKPs to Machine Learning models, known as ZKML (Zero-Knowledge Machine Learning), enables independent validation of AI-generated content without revealing sensitive model information, promoting transparency and trust. ZKML enhances AI fairness by providing cryptographic audit trails for model predictions and ensuring uniform performance across users. We introduce snarkGPT, a practical ZKML implementation for transformers, to empower users to verify output accuracy and quality while preserving model privacy. We present a series of empirical results studying snarkGPT's scalability and performance to assess the feasibility and challenges of adopting a ZKML-powered approach to capture quality and performance fairness problems in generative AI models.


Human Aesthetic Preference-Based Large Text-to-Image Model Personalization: Kandinsky Generation as an Example

arXiv.org Artificial Intelligence

With the advancement of neural generative capabilities, the art community has actively embraced GenAI (generative artificial intelligence) for creating painterly content. Large text-to-image models can quickly generate aesthetically pleasing outcomes. However, the process can be non-deterministic and often involves tedious trial-and-error, as users struggle with formulating effective prompts to achieve their desired results. This paper introduces a prompting-free generative approach that empowers users to automatically generate personalized painterly content that incorporates their aesthetic preferences in a customized artistic style. This approach involves utilizing "semantic injection" to customize an artist model in a specific artistic style, and further leveraging a genetic algorithm to optimize the prompt generation process through real-time iterative human feedback. By solely relying on the user's aesthetic evaluation and preference for the artist model-generated images, this approach creates for the user a personalized model that encompasses their aesthetic preferences and the customized artistic style.


The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate

arXiv.org Artificial Intelligence

This paper explores the assumption that Large Language Models (LLMs) skilled in generation tasks are equally adept as evaluators. We assess the performance of three LLMs and one open-source LM in Question-Answering (QA) and evaluation tasks using the TriviaQA (Joshi et al., 2017) dataset. Results indicate a significant disparity, with LLMs exhibiting lower performance in evaluation tasks compared to generation tasks. Intriguingly, we discover instances of unfaithful evaluation where models accurately evaluate answers in areas where they lack competence, underscoring the need to examine the faithfulness and trustworthiness of LLMs as evaluators. This study contributes to the understanding of "the Generative AI Paradox" (West et al., 2023), highlighting a need to explore the correlation between generative excellence and evaluation proficiency, and the necessity to scrutinize the faithfulness aspect in model evaluations.