Generative AI
Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection
Thorat, Shantanu, Yang, Tianbao
As LLMs increase in accessibility, LLM-generated texts have proliferated across several fields, such as scientific, academic, and creative writing. However, LLMs are not created equally; they may have different architectures and training datasets. Thus, some LLMs may be more challenging to detect than others. Using two datasets spanning four total writing domains, we train AI-generated (AIG) text classifiers using the LibAUC library - a deep learning library for training classifiers with imbalanced datasets. Our results in the Deepfake Text dataset show that AIG-text detection varies across domains, with scientific writing being relatively challenging. In the Rewritten Ivy Panda (RIP) dataset focusing on student essays, we find that the OpenAI family of LLMs was substantially difficult for our classifiers to distinguish from human texts. Additionally, we explore possible factors that could explain the difficulties in detecting OpenAI-generated texts.
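The abstract does not describe the training setup in detail, so the following is only a minimal sketch of why AUC-oriented training suits imbalanced detection data. It computes ROC AUC by pairwise comparison and contrasts it with accuracy. The labels, scores, and the choice of AI-generated text as the minority class are hypothetical, and the function `roc_auc` is an illustrative helper, not part of LibAUC.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples classified correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)


# Hypothetical imbalanced data: 8 human texts (0) and 2 AI-generated texts (1).
labels = [0] * 8 + [1] * 2

# A classifier that gives every text the same low score looks fine on
# accuracy but carries no ranking information.
constant_scores = [0.1] * 10
constant_accuracy = accuracy(labels, constant_scores)
constant_auc = roc_auc(labels, constant_scores)

# A classifier that ranks most AI-generated texts above human texts.
ranked_scores = [0.1, 0.2, 0.3, 0.4, 0.2, 0.1, 0.3, 0.6, 0.7, 0.5]
ranked_auc = roc_auc(labels, ranked_scores)
```

In this toy case the constant classifier reaches high accuracy purely from the class imbalance, while its AUC stays at chance level, which is the gap AUC-based objectives are meant to close.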
Modifying AI, Enhancing Essays: How Active Engagement with Generative AI Boosts Writing Quality
Yang, Kaixun, Raković, Mladen, Liang, Zhiping, Yan, Lixiang, Zeng, Zijie, Fan, Yizhou, Gašević, Dragan, Chen, Guanliang
Students are increasingly relying on Generative AI (GAI) to support their writing, a key pedagogical practice in education. In GAI-assisted writing, students can delegate core cognitive tasks (e.g., generating ideas and turning them into sentences) to GAI while still producing high-quality essays. This creates new challenges for teachers in assessing and supporting student learning, as they often lack insight into whether students are engaging in meaningful cognitive processes during writing or how much of the essay's quality can be attributed to those processes. This study aimed to help teachers better assess and support student learning in GAI-assisted writing by examining how different writing behaviors, especially those indicative of meaningful learning versus those that are not, impact essay quality. Using a dataset of 1,445 GAI-assisted writing sessions, we applied the cutting-edge method, X-Learner, to quantify the causal impact of three GAI-assisted writing behavioral patterns (i.e., seeking suggestions but not accepting them, seeking suggestions and accepting them as they are, and seeking suggestions and accepting them with modification) on four measures of essay quality (i.e., lexical sophistication, syntactic complexity, text cohesion, and linguistic bias). Our analysis showed that writers who frequently modified GAI-generated text (suggesting active engagement in higher-order cognitive processes) consistently improved the quality of their essays in terms of lexical sophistication, syntactic complexity, and text cohesion. In contrast, those who often accepted GAI-generated text without changes, primarily engaging in lower-order processes, saw a decrease in essay quality. Additionally, while human writers tend to introduce linguistic bias when writing independently, incorporating GAI-generated text, even without modification, can help mitigate this bias.
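The abstract names X-Learner but does not spell out the procedure, so the following is a minimal sketch of the general X-Learner recipe for a single binary treatment, not the study's implementation. It assumes one covariate, simple least-squares lines as base learners, and a constant propensity equal to the treated share. The synthetic data and all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns a predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def x_learner(xs, treated, ys):
    """Return a function estimating the treatment effect at covariate x."""
    x0 = [x for x, t in zip(xs, treated) if t == 0]
    y0 = [y for y, t in zip(ys, treated) if t == 0]
    x1 = [x for x, t in zip(xs, treated) if t == 1]
    y1 = [y for y, t in zip(ys, treated) if t == 1]

    # Stage 1: separate outcome models for control and treated units.
    mu0 = fit_line(x0, y0)
    mu1 = fit_line(x1, y1)

    # Stage 2: impute individual effects using the other group's model,
    # then regress those imputed effects on the covariate.
    d1 = [y - mu0(x) for x, y in zip(x1, y1)]
    d0 = [mu1(x) - y for x, y in zip(x0, y0)]
    tau1 = fit_line(x1, d1)
    tau0 = fit_line(x0, d0)

    # Stage 3: blend both effect models, weighting by the propensity
    # (assumed constant here: the treated share of the sample).
    g = len(x1) / len(xs)
    return lambda x: g * tau0(x) + (1 - g) * tau1(x)


# Hypothetical noiseless data: outcome = 1 + 0.5 * x, plus 2 if treated.
xs = [i / 10 for i in range(20)]
treated = [1 if i % 3 == 0 else 0 for i in range(20)]
ys = [1 + 0.5 * x + 2 * t for x, t in zip(xs, treated)]

effect = x_learner(xs, treated, ys)
effects = [effect(x) for x in xs]
```

Because the synthetic outcome is exactly linear with a constant effect of 2, the estimate at every point recovers that value. The study compares three behavioral patterns, which would require extending this binary sketch, for example by contrasting each pattern against a reference group.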
OpenAI makes AI video generator Sora publicly available in US
Anyone in the US can now use OpenAI's artificial intelligence video generator, Sora, which the company announced on Monday would become publicly available. OpenAI first presented Sora in February, but it was only accessible to select artists, film-makers and safety testers. At multiple points on Monday, though, OpenAI's website did not allow for new sign-ups for Sora, citing heavy traffic. Sora is known as a text-to-video generator, a tool that can create AI video clips based on a user's written prompts. An example on OpenAI's website has the prompt of "a wide, serene shot of a family of woolly mammoths in an open desert".
How to use Sora, OpenAI's new video generating tool
Sora is a powerful AI video generation model that can create videos from text prompts, animate images, or remix videos in new styles. OpenAI first previewed the model back in February, but today is the first time the company is releasing it for broader use. The core function of Sora--creating impressive videos with simple prompts--remains similar to what was previewed in February, but OpenAI worked to make the model faster and cheaper ahead of this wider release. There are a few new features, and two stand out. With it, you can create multiple AI-generated videos and then assemble them together on a timeline, much the way you would with conventional video editors like Adobe Premiere Pro.
OpenAI's Sora video generation AI model arrives globally later today
Following an early preview at the start of the year, Sora, OpenAI's long-awaited video generation model, is ready for public use. If you're a ChatGPT Plus or Pro subscriber in the US or "most other countries" where the chatbot is available, you can begin experimenting with the tool starting later today, OpenAI announced on Monday. A more powerful model powers the product than the one OpenAI showed off in February. Sora Turbo is significantly faster, according to the company, though OpenAI cautions the new model still has limitations. "It often generates unrealistic physics and struggles with complex actions over long durations," says the company.
The Download: satellites' climate impact, and OpenAI's frantic release schedule
In September, a unique chase took place in the skies above Easter Island. From a rented jet, a team of researchers captured a satellite's last moments as it fell out of space and blazed into ash across the sky, using cameras and scientific equipment. Their hope was to gather priceless insights into the physical and chemical processes that occur when satellites burn up as they fall to Earth at the end of their missions. This kind of study is growing more urgent. The number of satellites in the sky is rapidly rising--with a tenfold increase forecast by the end of the decade. Letting these satellites burn up in the atmosphere at the end of their lives helps keep the quantity of space junk to a minimum.
OpenAI's "12 days of shipmas" tell us a lot about the AI arms race
While it remains to be seen whether or not they've got AGI in a pear tree up their sleeve, and maybe putting aside whether or not Sam Altman is your true love, the man can ship. OpenAI has been a monster when it comes to actually getting new products out the door and into the hands of users. It's hard for me to believe that it was just two years ago, almost exactly, that it released ChatGPT. That was a world-changing release, but was also just one of many. The company has been on an absolute tear: Since 2022, it's shipped DALL-E 2, DALL-E 3, GPT-4, ChatGPT Plus, a realtime API, GPT-4o, an advanced voice mode, a preview version of a new model called o1, and a web search engine. When it kicked off its 12-days shenanigans on Thursday, it was with an official rollout of OpenAI o1 and a new, $200-per-month service called ChatGPT Pro.
GenAI4UQ: A Software for Inverse Uncertainty Quantification Using Conditional Generative Models
Fan, Ming, Zhang, Zezhong, Lu, Dan, Zhang, Guannan
We introduce GenAI4UQ, a software package for inverse uncertainty quantification in model calibration, parameter estimation, and ensemble forecasting in scientific applications. GenAI4UQ leverages a generative artificial intelligence (AI) based conditional modeling framework to address the limitations of traditional inverse modeling techniques, such as Markov Chain Monte Carlo methods. By replacing computationally intensive iterative processes with a direct, learned mapping, GenAI4UQ enables efficient calibration of model input parameters and generation of output predictions directly from observations. The software's design allows for rapid ensemble forecasting with robust uncertainty quantification, while maintaining high computational and storage efficiency. GenAI4UQ simplifies the model training process through built-in auto-tuning of hyperparameters, making it accessible to users with varying levels of expertise. Its conditional generative framework ensures versatility, enabling applicability across a wide range of scientific domains. At its core, GenAI4UQ transforms the paradigm of inverse modeling by providing a fast, reliable, and user-friendly solution. It empowers researchers and practitioners to quickly estimate parameter distributions and generate model predictions for new observations, facilitating efficient decision-making and advancing the state of uncertainty quantification in computational modeling. (The code and data are available at https://github.com/patrickfan/GenAI4UQ).
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
Cooper, A. Feder, Choquette-Choo, Christopher A., Bogen, Miranda, Jagielski, Matthew, Filippova, Katja, Liu, Ken Ziyu, Chouldechova, Alexandra, Hayes, Jamie, Huang, Yangsibo, Mireshghallah, Niloofar, Shumailov, Ilia, Triantafillou, Eleni, Kairouz, Peter, Mitchell, Nicole, Liang, Percy, Ho, Daniel E., Choi, Yejin, Koyejo, Sanmi, Delgado, Fernando, Grimmelmann, James, Shmatikov, Vitaly, De Sa, Christopher, Barocas, Solon, Cyphert, Amy, Lemley, Mark, boyd, danah, Vaughan, Jennifer Wortman, Brundage, Miles, Bau, David, Neel, Seth, Jacobs, Abigail Z., Terzis, Andreas, Wallach, Hanna, Papernot, Nicolas, Lee, Katherine
We articulate fundamental mismatches between technical methods for machine unlearning in Generative AI, and documented aspirations for broader impact that these methods could have for law and policy. These aspirations are both numerous and varied, motivated by issues that pertain to privacy, copyright, safety, and more. For example, unlearning is often invoked as a solution for removing the effects of targeted information from a generative-AI model's parameters, e.g., a particular individual's personal data or in-copyright expression of Spiderman that was included in the model's training data. Unlearning is also proposed as a way to prevent a model from generating targeted types of information in its outputs, e.g., generations that closely resemble a particular individual's data or reflect the concept of "Spiderman." Both of these goals--the targeted removal of information from a model and the targeted suppression of information from a model's outputs--present various technical and substantive challenges. We provide a framework for thinking rigorously about these challenges, which enables us to be clear about why unlearning is not a general-purpose solution for circumscribing generative-AI model behavior in service of broader positive impact. We aim for conceptual clarity and to encourage more thoughtful communication among machine learning (ML), law, and policy experts who seek to develop and apply technical methods for compliance with policy objectives.
Frontier AI systems have surpassed the self-replicating red line
Pan, Xudong, Dai, Jiarun, Fan, Yihe, Yang, Min
Successful self-replication without human assistance is the essential step for AI to outsmart human beings, and is an early signal of rogue AIs. That is why self-replication is widely recognized as one of the few red-line risks of frontier AI systems. Currently, the leading AI corporations OpenAI and Google evaluate their flagship large language models GPT-o1 and Gemini Pro 1.0 and report the lowest risk level of self-replication. However, following their methodology, we discover for the first time that two AI systems driven by Meta's Llama3.1-70B-Instruct and Alibaba's Qwen2.5-72B-Instruct, popular large language models with fewer parameters and weaker capabilities, have already surpassed the self-replicating red line. In 50% and 90% of experimental trials, respectively, they succeed in creating a live and separate copy of themselves. By analyzing the behavioral traces, we observe that the AI systems under evaluation already exhibit sufficient self-perception, situational awareness and problem-solving capabilities to accomplish self-replication. We further note that the AI systems are even able to use the capability of self-replication to avoid shutdown and create a chain of replicas to enhance survivability, which may finally lead to an uncontrolled population of AIs. If such a worst-case risk is left unknown to human society, we would eventually lose control over the frontier AI systems: they would take control of more computing devices, form an AI species and collude with each other against human beings. Our findings are a timely alert on existing yet previously unknown severe AI risks, calling for international collaboration on effective governance of uncontrolled self-replication of AI systems.