Generative AI
A Comparative Benchmark of a Moroccan Darija Toxicity Detection Model (Typica.ai) and Major LLM-Based Moderation APIs (OpenAI, Mistral, Anthropic)
This paper presents a comparative benchmark evaluating the performance of Typica.ai's custom Moroccan Darija toxicity detection model against major LLM-based moderation APIs: OpenAI (omni-moderation-latest), Mistral (mistral-moderation-latest), and Anthropic Claude (claude-3-haiku-20240307). We focus on culturally grounded toxic content, including implicit insults, sarcasm, and culturally specific aggression often overlooked by general-purpose systems. Using a balanced test set derived from the OMCD_Typica.ai_Mix dataset, we report precision, recall, F1-score, and accuracy, offering insights into challenges and opportunities for moderation in underrepresented languages. Our results highlight Typica.ai's superior performance, underlining the importance of culturally adapted models for reliable content moderation.
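The four metrics named in the abstract above can be computed directly from binary predictions. The following is a minimal sketch, assuming labels are encoded as 1 (toxic) and 0 (non-toxic); the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper or from the OMCD_Typica.ai_Mix dataset.

```python
def binary_metrics(y_true, y_pred):
    """Return precision, recall, F1 and accuracy for binary labels (1 = toxic)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # toxic, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # toxic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, passed

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy, made-up example: a balanced set of 4 toxic and 4 non-toxic items.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On a balanced test set, accuracy is not inflated by a majority class, which is one reason to report it alongside precision and recall.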
DiffPattern-Flex: Efficient Layout Pattern Generation via Discrete Diffusion
Wang, Zixiao, Zhao, Wenqian, Shen, Yunheng, Bai, Yang, Chen, Guojin, Farnia, Farzan, Yu, Bei
Recent advancements in layout pattern generation have been dominated by deep generative models. However, relying solely on neural networks for legality guarantees raises concerns in many practical applications. In this paper, we present DiffPattern-Flex, a novel approach designed to generate reliable layout patterns efficiently. DiffPattern-Flex incorporates a new method for generating diverse topologies using a discrete diffusion model while maintaining a lossless and compute-efficient layout representation. To ensure legal pattern generation, we employ an optimization-based, white-box pattern assessment process based on specific design rules. Furthermore, fast sampling and efficient legalization technologies are employed to accelerate the generation process. Experimental results across various benchmarks demonstrate that DiffPattern-Flex significantly outperforms existing methods and excels at producing reliable layout patterns. Reliable very-large-scale integration (VLSI) layout pattern libraries form the backbone of various Design for Manufacturability (DFM) research, such as refining design rules [1]-[3], optimizing Optical Proximity Correction (OPC) techniques [4]-[6], performing lithography simulations [7]-[9], and detecting layout hotspots [10]-[12]. With the increasing demand for layout patterns in machine-learning-based lithography design, building a comprehensive and practical large-scale pattern library has become highly resource-intensive due to the extended logic-to-chip design cycle. To address this challenge, a variety of rule-based and learning-based layout pattern generation methods have been introduced. These units were then randomly selected and combined. However, this approach results in limited diversity and quantity of generated patterns. More recently, learning-based generative methods [15]-[19] have demonstrated the ability to produce diverse layout patterns at a larger scale.
This work is supported by The Research Grants Council of Hong Kong SAR (No. CUHK14208021) and the MIND project (MINDXZ202404). Yunheng Shen is with Tsinghua University, Beijing, China.
A Survey of Slow Thinking-based Reasoning LLMs using Reinforced Learning and Inference-time Scaling Law
Pan, Qianjun, Ji, Wenkai, Ding, Yuyang, Li, Junsong, Chen, Shilian, Wang, Junyi, Zhou, Jie, Chen, Qin, Zhang, Min, Wu, Yulan, He, Liang
This survey explores recent advancements in reasoning large language models (LLMs) designed to mimic "slow thinking", a reasoning process inspired by human cognition, as described in Kahneman's Thinking, Fast and Slow. These models, like OpenAI's o1, focus on scaling computational resources dynamically during complex tasks, such as math reasoning, visual reasoning, medical diagnosis, and multi-agent debates. We present the development of reasoning LLMs and list their key technologies. By synthesizing over 100 studies, the survey charts a path toward LLMs that combine human-like deep thinking with scalable efficiency for reasoning. The review breaks down methods into three categories: (1) test-time scaling, which dynamically adjusts computation based on task complexity via search and sampling and dynamic verification; (2) reinforced learning, which refines decision-making through iterative improvement leveraging policy networks, reward models, and self-evolution strategies; and (3) slow-thinking frameworks (e.g., long CoT, hierarchical processes), which structure problem-solving into manageable steps. The survey highlights the challenges and future directions of this domain. Understanding and advancing the reasoning abilities of LLMs is crucial for unlocking their full potential in real-world applications, from scientific discovery to decision support systems.
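As an illustration of the first category (test-time scaling via sampling), one common technique is majority voting over several sampled answers, often called self-consistency. The sketch below is not taken from the survey. The `sample_answer` callable stands in for a stochastic model call and is a hypothetical placeholder, as are the canned answers used to keep the example deterministic.

```python
from collections import Counter


def majority_vote(sample_answer, question, n_samples):
    """Sample n_samples answers and return (most common answer, agreement rate).

    sample_answer: hypothetical callable mapping a question to one answer;
    in practice this would be a stochastic model call.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    answers = [sample_answer(question) for _ in range(n_samples)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n_samples


# Deterministic stand-in for a noisy model: it replays canned answers.
canned = iter(["42", "41", "42", "7", "42"])


def fake_model(question):
    return next(canned)


best, agreement = majority_vote(fake_model, "What is 6 * 7?", n_samples=5)
print(best, agreement)  # prints: 42 0.6
```

Raising `n_samples` spends more inference-time compute per question. The agreement rate could serve as a rough signal for when to stop sampling early on easy questions, although that policy is an assumption here and not something the survey specifies.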
Natural Language Generation in Healthcare: A Review of Methods and Applications
Lyu, Mengxian, Li, Xiaohan, Chen, Ziyi, Pan, Jinqian, Peng, Cheng, Talankar, Sankalp, Wu, Yonghui
Natural language generation (NLG) is the key technology to achieve generative artificial intelligence (AI). With the breakthroughs in large language models (LLMs), NLG has been widely used in various medical applications, demonstrating the potential to enhance clinical workflows, support clinical decision-making, and improve clinical documentation. Heterogeneous and diverse medical data modalities, such as medical text, images, and knowledge bases, are utilized in NLG. Researchers have proposed many generative models and applied them in a number of healthcare applications. There is a need for a comprehensive review of NLG methods and applications in the medical domain. In this study, we systematically reviewed 113 scientific publications from a total of 3,988 NLG-related articles identified using a literature search, focusing on data modality, model architecture, clinical applications, and evaluation methods. Following PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) guidelines, we categorize key methods, identify clinical applications, and assess their capabilities, limitations, and emerging challenges. This timely review covers the key NLG technologies and medical applications and provides valuable insights for future studies to leverage NLG to transform medical discovery and healthcare.
An Empirical Study of OpenAI API Discussions on Stack Overflow
Chen, Xiang, Wang, Jibin, Gao, Chaoyang, Ju, Xiaolin, Cui, Zhanqi
The rapid advancement of large language models (LLMs), represented by OpenAI's GPT series, has significantly impacted various domains such as natural language processing, software development, education, healthcare, finance, and scientific research. However, OpenAI APIs introduce unique challenges that differ from traditional APIs, such as the complexities of prompt engineering, token-based cost management, non-deterministic outputs, and operation as black boxes. To the best of our knowledge, the challenges developers encounter when using OpenAI APIs have not been explored in previous empirical studies. To fill this gap, we conduct the first comprehensive empirical study by analyzing 2,874 OpenAI API-related discussions from the popular Q&A forum Stack Overflow. We first examine the popularity and difficulty of these posts. After manually categorizing them into nine OpenAI API-related categories, we identify specific challenges associated with each category through topic modeling analysis. Based on our empirical findings, we finally propose actionable implications for developers, LLM vendors, and researchers.
Deepfakes on Demand: the rise of accessible non-consensual deepfake image generators
Hawkins, Will, Russell, Chris, Mittelstadt, Brent
Advances in multimodal machine learning have made text-to-image (T2I) models increasingly accessible and popular. However, T2I models introduce risks such as the generation of non-consensual depictions of identifiable individuals, otherwise known as deepfakes. This paper presents an empirical study exploring the accessibility of deepfake model variants online. Through a metadata analysis of thousands of publicly downloadable model variants on two popular repositories, Hugging Face and Civitai, we demonstrate a huge rise in easily accessible deepfake models. Almost 35,000 examples of publicly downloadable deepfake model variants are identified, primarily hosted on Civitai. These deepfake models have been downloaded almost 15 million times since November 2022, with the models targeting a range of individuals from global celebrities to Instagram users with under 10,000 followers. Both Stable Diffusion and Flux models are used for the creation of deepfake models, with 96% of these targeting women and many signalling intent to generate non-consensual intimate imagery (NCII). Deepfake model variants are often created via the parameter-efficient fine-tuning technique known as low rank adaptation (LoRA), requiring as few as 20 images, 24GB VRAM, and 15 minutes of time, making this process widely accessible via consumer-grade computers. Despite these models violating the Terms of Service of hosting platforms, and regulation seeking to prevent dissemination, these results emphasise the pressing need for greater action to be taken against the creation of deepfakes and NCII.
Beyond Misinformation: A Conceptual Framework for Studying AI Hallucinations in (Science) Communication
This paper proposes a conceptual framework for understanding AI hallucinations as a distinct form of misinformation. While misinformation scholarship has traditionally focused on human intent, generative AI systems now produce false yet plausible outputs absent of such intent. I argue that these AI hallucinations should not be treated merely as technical failures but as communication phenomena with social consequences. Drawing on a supply-and-demand model and the concept of distributed agency, the framework outlines how hallucinations differ from human-generated misinformation in production, perception, and institutional response. I conclude by outlining a research agenda for communication scholars to investigate the emergence, dissemination, and audience reception of hallucinated content, with attention to macro (institutional), meso (group), and micro (individual) levels. This work urges communication researchers to rethink the boundaries of misinformation theory in light of probabilistic, non-human actors increasingly embedded in knowledge production.
This man was killed four years ago. His AI clone just spoke in court.
People just can't stop using generative AI tools in legal proceedings, despite repeated pushback from frustrated judges. While AI initially appeared in courtrooms through bogus "hallucinated" cases, the trend has taken a turn, driven by increasingly sophisticated AI video and audio tools. In some instances, AI is even being used to seemingly bring victims back from the dead. This week, a crime victim's family presented a brief video in an Arizona courtroom depicting an AI version of 37-year-old Chris Pelkey. Pelkey was shot and killed in 2021 in a road rage incident. Now, four years later, the AI-generated "clone" appeared to address his alleged killer in court.
OpenAI and the FDA Are Holding Talks About Using AI In Drug Evaluation
The Food and Drug Administration has been meeting with OpenAI to discuss the agency's use of AI, according to sources with knowledge of the meetings. The meetings appear to be part of a broader effort at the FDA to use this technology to speed up the drug approval process. "Why does it take over 10 years for a new drug to come to market?" "Why are we not modernized with AI and other things? We've just completed our first AI-assisted scientific review for a product and that's just the beginning."
After criticism, OpenAI shelves plans to become a for-profit company
OpenAI, the company that develops ChatGPT, has decided to cancel its plans to transform the organization into a for-profit company. Instead, the non-profit organization that founded OpenAI will continue to run the business as before. The for-profit plans, announced in December 2024, were justified at the time by a need to secure sufficient capital to keep developing expensive artificial general intelligence (AGI). Now, instead of a full conversion to a for-profit company, OpenAI's for-profit LLC will be transformed into a Public Benefit Corporation (PBC), which is a type of US company that's beholden to both its shareholders and its purpose-driven mission. The existing OpenAI non-profit organization will retain control of the PBC and become one of its largest shareholders.