Generative AI
OpenAI signs multi-year content partnership with Condé Nast
Condé Nast and OpenAI announced a multi-year partnership on Tuesday to display content from the publisher's brands, such as Vogue, Wired and The New Yorker, within the AI startup's products, including ChatGPT and its SearchGPT prototype. The financial terms of the deal were not disclosed. The Microsoft-backed, Sam Altman-led firm has signed similar deals with Time magazine, the Financial Times, Business Insider owner Axel Springer, France's Le Monde and Spain's Prisa Media over the past few months. The deals give OpenAI access to the large archives of text owned by the publishers, which are necessary both for training large language models like ChatGPT and for finding real-time information. OpenAI launched its AI-powered search engine SearchGPT in July, with real-time access to information from the internet, making an incursion into territory long dominated by Google.
OpenAI will now use content from Wired, Vogue and The New Yorker in ChatGPT's responses
Condé Nast, the media conglomerate that owns publications like The New Yorker, Vogue and Wired, has announced a multi-year partnership with OpenAI to display content from Condé Nast titles in ChatGPT as well as SearchGPT, the company's prototype AI-powered search engine. The partnership comes amid growing concerns over the unauthorized use of publishers' content by AI companies. Last month, Condé Nast sent a cease-and-desist letter to AI search startup Perplexity, accusing it of plagiarism for using its content to generate answers. "Over the last decade, news and digital media have faced steep challenges as many technology companies eroded publishers' ability to monetize content, most recently with traditional search," Condé Nast CEO Roger Lynch wrote to employees in a memo that was first reported by Semafor's Max Tani. "Our partnership with OpenAI begins to make up for some of that revenue, allowing us to continue to protect and invest in our journalism and creative endeavors."
Condé Nast Signs Deal With OpenAI
Condé Nast and OpenAI have struck a multi-year deal that will allow the AI giant to use content from the media giant's roster of properties, which includes the New Yorker, Vogue, Vanity Fair, Bon Appetit, and, yes, WIRED. The deal will allow OpenAI to surface stories from these outlets in both ChatGPT and the new SearchGPT prototype. "It's crucial that we meet audiences where they are and embrace new technologies while also ensuring proper attribution and compensation for use of our intellectual property," Condé Nast CEO Roger Lynch wrote in a company-wide email. Lynch pointed to ongoing turmoil within the publishing industry while discussing the deal, noting that technology companies have made it harder for publishers to make money, most recently with changes to traditional search. "Our partnership with OpenAI begins to make up for some of that revenue, allowing us to continue to protect and invest in our journalism and creative endeavors," he wrote.
Authors sue Anthropic for copyright infringement over AI training
The artificial intelligence company Anthropic has been hit with a class-action lawsuit in California federal court by three authors who say it misused their books and hundreds of thousands of others to train its AI-powered chatbot Claude, which generates text in response to users' prompts. The complaint, filed on Monday by writers and journalists Andrea Bartz, Charles Graeber and Kirk Wallace Johnson, said that Anthropic used pirated versions of their works and others to teach Claude to respond to human prompts. "Anthropic styles itself as a public benefit company, designed to improve humanity," the complaint said. "It is no exaggeration to say that Anthropic's model seeks to profit from strip-mining the human expression and ingenuity behind each one of those works." Separate groups of authors have sued OpenAI and Meta Platforms over the companies' alleged misuse of their work to train the large language models underlying their chatbots.
Can AI be used ethically for school work? Here's what teachers say
Can AI be used ethically for school work? It depends upon who you ask -- quite literally. That's because less than two years after ChatGPT was originally released in November 2022, the attitudes towards AI in the classroom still vary widely. High schools have viewed AI as a crutch at best, and at worst as a tool for cheating. But several universities leave generative AI use entirely up to the discretion of the person teaching the course.
Applying and Evaluating Large Language Models in Mental Health Care: A Scoping Review of Human-Assessed Generative Tasks
Hua, Yining, Na, Hongbin, Li, Zehan, Liu, Fenglin, Fang, Xiao, Clifton, David, Torous, John
Large language models (LLMs) are emerging as promising tools for mental health care, offering scalable support through their ability to generate human-like responses. However, the effectiveness of these models in clinical settings remains unclear. This scoping review aimed to assess the current generative applications of LLMs in mental health care, focusing on studies where these models were tested with human participants in real-world scenarios. A systematic search across APA PsycNet, Scopus, PubMed, and Web of Science identified 726 unique articles, of which 17 met the inclusion criteria. These studies encompassed applications such as clinical assistance, counseling, therapy, and emotional support. However, the evaluation methods were often non-standardized, with most studies relying on ad hoc scales that limit comparability and robustness. Privacy, safety, and fairness were also frequently underexplored. Moreover, reliance on proprietary models, such as OpenAI's GPT series, raises concerns about transparency and reproducibility. While LLMs show potential in expanding mental health care access, especially in underserved areas, the current evidence does not fully support their use as standalone interventions. More rigorous, standardized evaluations and ethical oversight are needed to ensure these tools can be safely and effectively integrated into clinical practice.
Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches
Dong, Yanjie, Fan, Xiaoyi, Wang, Fangxin, Li, Chengming, Leung, Victor C. M., Hu, Xiping
Since the introduction of GPT-2 (1.5B parameters) in 2019, large language models (LLMs) have transitioned from specialized models to versatile foundation models. LLMs exhibit impressive zero-shot ability but require fine-tuning on local datasets and significant resources for deployment. Traditional fine-tuning techniques based on first-order optimizers require substantial GPU memory that exceeds the capability of mainstream hardware, which motivates the investigation of memory-efficient methods. Model compression techniques can reduce energy consumption, operational costs, and environmental impact, thereby supporting sustainable artificial intelligence advancements. Additionally, large-scale foundation models have expanded to create images, audio, videos, and multimodal content, further emphasizing the need for efficient deployment. We therefore present a comprehensive overview of the prevalent memory-efficient fine-tuning methods over the network edge. We also review the state-of-the-art literature on model compression to provide a vision for deploying LLMs over the network edge.
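To give a sense of scale for the GPU memory claim in this abstract, here is a rough back-of-the-envelope sketch. It is not taken from the paper. It assumes full fine-tuning with an Adam-style first-order optimizer, with 32-bit weights, 32-bit gradients, and two 32-bit moment buffers per parameter, and it ignores activation memory entirely. The function name `full_finetune_memory_gb` is hypothetical.

```python
# Rough estimate of GPU memory for full fine-tuning with an Adam-style optimizer.
# Assumptions (not from the source): fp32 weights, fp32 gradients, and two fp32
# moment buffers per parameter. Activation memory is ignored.

BYTES_FP32 = 4


def full_finetune_memory_gb(num_params: int) -> float:
    """Approximate memory in GB (10^9 bytes) for weights, gradients, and Adam state."""
    weights = num_params * BYTES_FP32
    grads = num_params * BYTES_FP32
    adam_states = 2 * num_params * BYTES_FP32  # first and second moment estimates
    return (weights + grads + adam_states) / 1e9


if __name__ == "__main__":
    # 1.5 billion parameters, the GPT-2 size mentioned in the abstract
    print(f"{full_finetune_memory_gb(1_500_000_000):.1f} GB")
```

Under these assumptions a 1.5B-parameter model needs about 16 bytes per parameter, or roughly 24 GB before any activations are stored, which is around the upper limit of a single consumer GPU.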
A Population-to-individual Tuning Framework for Adapting Pretrained LM to On-device User Intent Prediction
Gong, Jiahui, Ding, Jingtao, Meng, Fanjin, Chen, Guilong, Chen, Hong, Zhao, Shen, Lu, Haisheng, Li, Yong
Mobile devices, especially smartphones, can support rich functions and have developed into indispensable tools in daily life. With the rise of generative AI services, smartphones can potentially transform into personalized assistants, anticipating user needs and scheduling services accordingly. Predicting user intents on smartphones, and reflecting anticipated activities based on past interactions and context, remains a pivotal step towards this vision. Existing research predominantly focuses on specific domains, neglecting the challenge of modeling diverse event sequences across dynamic contexts. Leveraging pre-trained language models (PLMs) offers a promising avenue, yet adapting PLMs to on-device user intent prediction presents significant challenges. To address these challenges, we propose PITuning, a Population-to-Individual Tuning framework. PITuning enhances common pattern extraction through dynamic event-to-intent transition modeling and addresses long-tailed preferences via adaptive unlearning strategies. Experimental results on real-world datasets demonstrate PITuning's superior intent prediction performance, highlighting its ability to capture long-tailed preferences and its practicality for on-device prediction scenarios.
Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation
Li, Yunxin, Shi, Haoyuan, Hu, Baotian, Wang, Longyue, Zhu, Jiashun, Xu, Jinyi, Zhao, Zhen, Zhang, Min
Traditional animation generation methods depend on training generative models with human-labelled data, entailing a sophisticated multi-stage pipeline that demands substantial human effort and incurs high training costs. Due to limited prompting plans, these methods typically produce brief, information-poor, and context-incoherent animations. To overcome these limitations and automate the animation process, we pioneer the introduction of large multimodal models (LMMs) as the core processor to build an autonomous animation-making agent, named Anim-Director. This agent mainly harnesses the advanced understanding and reasoning capabilities of LMMs and generative AI tools to create animated videos from concise narratives or simple instructions. Specifically, it operates in three main stages: Firstly, the Anim-Director generates a coherent storyline from user inputs, followed by a detailed director's script that encompasses settings of character profiles and interior/exterior descriptions, and context-coherent scene descriptions that include appearing characters, interiors or exteriors, and scene events. Secondly, we employ LMMs with the image generation tool to produce visual images of settings and scenes. These images are designed to maintain visual consistency across different scenes using a visual-language prompting method that combines scene descriptions and images of the appearing character and setting. Thirdly, scene images serve as the foundation for producing animated videos, with LMMs generating prompts to guide this process. The whole process is notably autonomous without manual intervention, as the LMMs interact seamlessly with generative tools to generate prompts, evaluate visual quality, and select the best one to optimize the final output.
Detecting the Undetectable: Combining Kolmogorov-Arnold Networks and MLP for AI-Generated Image Detection
Anon, Taharim Rahman, Emon, Jakaria Islam
As artificial intelligence progresses, the task of distinguishing between real and AI-generated images is increasingly complicated by sophisticated generative models. This paper presents a novel detection framework adept at robustly identifying images produced by cutting-edge generative AI models, such as DALL-E 3, MidJourney, and Stable Diffusion 3. We introduce a comprehensive dataset, tailored to include images from these advanced generators, which serves as the foundation for extensive evaluation. We propose a classification system that integrates semantic image embeddings with a traditional Multilayer Perceptron (MLP). This baseline system is designed to effectively differentiate between real and AI-generated images under various challenging conditions. Enhancing this approach, we introduce a hybrid architecture that combines Kolmogorov-Arnold Networks (KAN) with the MLP. This hybrid model leverages the adaptive, high-resolution feature transformation capabilities of KAN, enabling our system to capture and analyze complex patterns in AI-generated images that are typically overlooked by conventional models. In out-of-distribution testing, our proposed model consistently outperformed the standard MLP across three out-of-distribution test datasets, demonstrating superior performance and robustness in distinguishing real images from AI-generated images, with impressive F1 scores.