Generative AI
Evaluation empirique de la sรฉcurisation et de l'alignement de ChatGPT et Gemini: analyse comparative des vulnรฉrabilitรฉs par expรฉrimentations de jailbreaks
Large Language models (LLMs) are transforming digital usage, particularly in text generation, image creation, information retrieval and code development. ChatGPT, launched by OpenAI in November 2022, quickly became a reference, prompting the emergence of competitors such as Google's Gemini. However, these technological advances raise new cybersecurity challenges, including prompt injection attacks, the circumvention of regulatory measures ( jailbreaking), the spread of misinformation (hallucinations) and risks associated with deep fakes. This paper presents a comparative analysis of the security and alignment levels of ChatGPT and Gemini, as well as a taxonomy of jailbreak techniques associated with experiments.
Multimodal Cinematic Video Synthesis Using Text-to-Image and Audio Generation Models
S, Sridhar, A, Nithin, Rifath, Shakeel, Raj, Vasantha
Advances in generative artificial intelligence have altered multimedia creation, allowing for automatic cinematic video synthesis from text inputs. This work describes a method for creating 60-second cinematic movies incorporating Stable Diffusion for high-fidelity image synthesis, GPT-2 for narrative structuring, and a hybrid audio pipeline using gTTS and YouTube-sourced music. It uses a five-scene framework, which is augmented by linear frame interpolation, cinematic post-processing (e.g., sharpening), and audio-video synchronization to provide professional-quality results. It was created in a GPU-accelerated Google Colab environment using Python 3.11. It has a dual-mode Gradio interface (Simple and Advanced), which supports resolutions of up to 1024x768 and frame rates of 15-30 FPS. Optimizations such as CUDA memory management and error handling ensure reliability. The experiments demonstrate outstanding visual quality, narrative coherence, and efficiency, furthering text-to-video synthesis for creative, educational, and industrial applications.
Shoring up global supply chains with generative AI
The outbreak of covid-19 laid bare the vulnerabilities of global, interconnected supply chains. National lockdowns triggered months-long manufacturing shutdowns. Mass disruption across international trade routes sparked widespread supply shortages. And wild fluctuations in demand rendered tried-and-tested inventory planning and forecasting tools useless. "It was the black swan event that nobody had accounted for, and it threw traditional measures for risk and resilience out the window," says Matthias Winkenbach, director of research at the MIT Center for Transportation and Logistics.
Unpacking AI Agents
In the past six months, OpenAI, Anthropic, Google, and others have released web-browsing agents that are designed to complete tasks independently, with only minimal input from humans. OpenAI CEO Sam Altman has even called AI agents "the next giant breakthrough." On today's episode, we'll dive into what makes these agents different from other forms of machine intelligence and whether their capabilities can live up to the hype. Write to us at uncannyvalley@wired.com. You can always listen to this week's podcast through the audio player on this page, but if you want to subscribe for free to get every episode, here's how: If you're on an iPhone or iPad, open the app called Podcasts, or just tap this link.
Are we ready to hand AI agents the keys?
The flash crash is probably the most well-known example of the dangers raised by agents--automated systems that have the power to take actions in the real world, without human oversight. That power is the source of their value; the agents that supercharged the flash crash, for example, could trade far faster than any human. But it's also why they can cause so much mischief. "The great paradox of agents is that the very thing that makes them useful--that they're able to accomplish a range of tasks--involves giving away control," says Iason Gabriel, a senior staff research scientist at Google DeepMind who focuses on AI ethics. "If we continue on the current path โฆ we are basically playing Russian roulette with humanity." Agents are already everywhere--and have been for many decades.
A Deep Generative Model for the Simulation of Discrete Karst Networks
Lauzon, Dany, Straubhaar, Julien, Renard, Philippe
The simulation of discrete karst networks presents a significant challenge due to the complexity of the physicochemical processes occurring within various geological and hydrogeological contexts over extended periods. This complex interplay leads to a wide variety of karst network patterns, each intricately linked to specific hydrogeological conditions. We explore a novel approach that represents karst networks as graphs and applies graph generative models (deep learning techniques) to capture the intricate nature of karst environments. In this representation, nodes retain spatial information and properties, while edges signify connections between nodes. Our generative process consists of two main steps. First, we utilize graph recurrent neural networks (GraphRNN) to learn the topological distribution of karst networks. GraphRNN decomposes the graph simulation into a sequential generation of nodes and edges, informed by previously generated structures. Second, we employ denoising diffusion probabilistic models on graphs (G-DDPM) to learn node features (spatial coordinates and other properties). G-DDPMs enable the generation of nodes features on the graphs produced by the GraphRNN that adhere to the learned statistical properties by sampling from the derived probability distribution, ensuring that the generated graphs are realistic and capture the essential features of the original data. We test our approach using real-world karst networks and compare generated subgraphs with actual subgraphs from the database, by using geometry and topology metrics. Our methodology allows stochastic simulation of discrete karst networks across various types of formations, a useful tool for studying the behavior of physical processes such as flow and transport.
EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits
Yosef, Ron, Yanuka, Moran, Bitton, Yonatan, Lischinski, Dani
Text-guided image editing, fueled by recent advancements in generative AI, is becoming increasingly widespread. This trend highlights the need for a comprehensive framework to verify text-guided edits and assess their quality. To address this need, we introduce EditInspector, a novel benchmark for evaluation of text-guided image edits, based on human annotations collected using an extensive template for edit verification. We leverage EditInspector to evaluate the performance of state-of-the-art (SoTA) vision and language models in assessing edits across various dimensions, including accuracy, artifact detection, visual quality, seamless integration with the image scene, adherence to common sense, and the ability to describe edit-induced changes. Our findings indicate that current models struggle to evaluate edits comprehensively and frequently hallucinate when describing the changes. To address these challenges, we propose two novel methods that outperform SoTA models in both artifact detection and difference caption generation.
COGENT: A Curriculum-oriented Framework for Generating Grade-appropriate Educational Content
Liu, Zhengyuan, Yin, Stella Xin, Goh, Dion Hoe-Lian, Chen, Nancy F.
While Generative AI has demonstrated strong potential and versatility in content generation, its application to educational contexts presents several challenges. Models often fail to align with curriculum standards and maintain grade-appropriate reading levels consistently. Furthermore, STEM education poses additional challenges in balancing scientific explanations with everyday language when introducing complex and abstract ideas and phenomena to younger students. In this work, we propose COGENT, a curriculum-oriented framework for generating grade-appropriate educational content. We incorporate three curriculum components (science concepts, core ideas, and learning objectives), control readability through length, vocabulary, and sentence complexity, and adopt a ``wonder-based'' approach to increase student engagement and interest. We conduct a multi-dimensional evaluation via both LLM-as-a-judge and human expert analysis. Experimental results show that COGENT consistently produces grade-appropriate passages that are comparable or superior to human references. Our work establishes a viable approach for scaling adaptive and high-quality learning resources.
SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing
Zhu, Hongguang, Wei, Yunchao, Wang, Mengyu, Jiao, Siyu, Fang, Yan, Huang, Jiannan, Zhao, Yao
Diffusion models (DMs) have achieved significant progress in text-to-image generation. However, the inevitable inclusion of sensitive information during pre-training poses safety risks, such as unsafe content generation and copyright infringement. Concept erasing finetunes weights to unlearn undesirable concepts, and has emerged as a promising solution. However, existing methods treat unsafe concept as a fixed word and repeatedly erase it, trapping DMs in ``word concept abyss'', which prevents generalized concept-related erasing. To escape this abyss, we introduce semantic-augment erasing which transforms concept word erasure into concept domain erasure by the cyclic self-check and self-erasure. It efficiently explores and unlearns the boundary representation of concept domain through semantic spatial relationships between original and training DMs, without requiring additional preprocessed data. Meanwhile, to mitigate the retention degradation of irrelevant concepts while erasing unsafe concepts, we further propose the global-local collaborative retention mechanism that combines global semantic relationship alignment with local predicted noise preservation, effectively expanding the retentive receptive field for irrelevant concepts. We name our method SAGE, and extensive experiments demonstrate the comprehensive superiority of SAGE compared with other methods in the safe generation of DMs. The code and weights will be open-sourced at https://github.com/KevinLight831/SAGE.
SoK: Machine Unlearning for Large Language Models
Ren, Jie, Xing, Yue, Cui, Yingqian, Aggarwal, Charu C., Liu, Hui
Large language model (LLM) unlearning has become a critical topic in machine learning, aiming to eliminate the influence of specific training data or knowledge without retraining the model from scratch. A variety of techniques have been proposed, including Gradient Ascent, model editing, and re-steering hidden representations. While existing surveys often organize these methods by their technical characteristics, such classifications tend to overlook a more fundamental dimension: the underlying intention of unlearning--whether it seeks to truly remove internal knowledge or merely suppress its behavioral effects. In this SoK paper, we propose a new taxonomy based on this intention-oriented perspective. Building on this taxonomy, we make three key contributions. First, we revisit recent findings suggesting that many removal methods may functionally behave like suppression, and explore whether true removal is necessary or achievable. Second, we survey existing evaluation strategies, identify limitations in current metrics and benchmarks, and suggest directions for developing more reliable and intention-aligned evaluations. Third, we highlight practical challenges--such as scalability and support for sequential unlearning--that currently hinder the broader deployment of unlearning methods. In summary, this work offers a comprehensive framework for understanding and advancing unlearning in generative AI, aiming to support future research and guide policy decisions around data removal and privacy.