Generative AI
What this futuristic Olympics video says about the state of generative AI
With each shot requiring a new set of prompts, it's also hard to instill a sense of continuity throughout a video. The color, angle of the sun, and shapes of buildings are difficult for a video generation model to keep consistent. The video also lacks any close-ups of people, which Kahn says AI models still tend to struggle with. "These technologies are always better on large-scale things right now as opposed to really nuanced human interaction," he says. For this reason, Kahn imagines that early filmmaking applications of generative video might be for wide shots of landscapes or crowds.
Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CtrSVDD) Challenge 2024
Guragain, Anmol, Liu, Tianchi, Pan, Zihan, Sailor, Hardik B., Wang, Qiongqiong
This work details our approach to achieving a leading system with a 1.79% pooled equal error rate (EER) on the evaluation set of the Controlled Singing Voice Deepfake Detection (CtrSVDD). The rapid advancement of generative AI models presents significant challenges for detecting AI-generated deepfake singing voices, attracting increased research attention. The Singing Voice Deepfake Detection (SVDD) Challenge 2024 aims to address this complex task. In this work, we explore the ensemble methods, utilizing speech foundation models to develop robust singing voice anti-spoofing systems. We also introduce a novel Squeeze-and-Excitation Aggregation (SEA) method, which efficiently and effectively integrates representation features from the speech foundation models, surpassing the performance of our other individual systems. Evaluation results confirm the efficacy of our approach in detecting deepfake singing voices. The codes can be accessed at https://github.com/Anmol2059/SVDD2024.
AI Governance in Higher Education: Case Studies of Guidance at Big Ten Universities
Wu, Chuhao, Zhang, He, Carroll, John M.
Generative AI has drawn significant attention from stakeholders in higher education. As it introduces new opportunities for personalized learning and tutoring support, it simultaneously poses challenges to academic integrity and leads to ethical issues. Consequently, governing responsible AI usage within higher education institutions (HEIs) becomes increasingly important. Leading universities have already published guidelines on Generative AI, with most attempting to embrace this technology responsibly. This study provides a new perspective by focusing on strategies for responsible AI governance as demonstrated in these guidelines. Through a case study of 14 prestigious universities in the United States, we identified the multi-unit governance of AI, the role-specific governance of AI, and the academic characteristics of AI governance from their AI guidelines. The strengths and potential limitations of these strategies and characteristics are discussed. The findings offer practical implications for guiding responsible AI usage in HEIs and beyond.
Data-driven topology design based on principal component analysis for 3D structural design problems
Yang, Jun, Yaji, Kentaro, Yamasaki, Shintaro
Topology optimization is a structural design methodology widely utilized to address engineering challenges. However, sensitivity-based topology optimization methods struggle to solve optimization problems characterized by strong non-linearity. Leveraging the sensitivity-free nature and high capacity of deep generative models, data-driven topology design (DDTD) methodology is considered an effective solution to this problem. Despite this, the training effectiveness of deep generative models diminishes when input size exceeds a threshold while maintaining high degrees of freedom is crucial for accurately characterizing complex structures. To resolve the conflict between the both, we propose DDTD based on principal component analysis (PCA). Its core idea is to replace the direct training of deep generative models with material distributions by using a principal component score matrix obtained from PCA computation and to obtain the generated material distributions with new features through the restoration process. We apply the proposed PCA-based DDTD to the problem of minimizing the maximum stress in 3D structural mechanics and demonstrate it can effectively address the current challenges faced by DDTD that fail to handle 3D structural design problems. Various experiments are conducted to demonstrate the effectiveness and practicability of the proposed PCA-based DDTD.
AI-Fakes Detection Is Failing Voters in the Global South
Recently, former president and convicted felon Donald Trump posted a series of photos that appeared to show fans of pop star Taylor Swift supporting his bid for the US presidency. The pictures looked AI-generated, and WIRED was able to confirm they probably were by running them through the nonprofit True Media's detection tool to confirm that they showed "substantial evidence of manipulation." The use of generative AI, including for political purposes, has become increasingly common, and WIRED has been tracking its use in elections around the world. But in much of the world outside the US and parts of Europe, detecting AI-generated content is difficult because of biases in the training of systems, leaving journalists and researchers with few resources to address the deluge of disinformation headed their way. Detecting media generated or manipulated using AI is still a burgeoning field, a response to the sudden explosion of generative AI companies.
A Grey-box Attack against Latent Diffusion Model-based Image Editing by Posterior Collapse
Guo, Zhongliang, Fang, Lei, Lin, Jingyu, Qian, Yifei, Zhao, Shuai, Wang, Zeyu, Dong, Junhao, Chen, Cunjian, Arandjelović, Ognjen, Lau, Chun Pong
Recent advancements in generative AI, particularly Latent Diffusion Models (LDMs), have revolutionized image synthesis and manipulation. However, these generative techniques raises concerns about data misappropriation and intellectual property infringement. Adversarial attacks on machine learning models have been extensively studied, and a well-established body of research has extended these techniques as a benign metric to prevent the underlying misuse of generative AI. Current approaches to safeguarding images from manipulation by LDMs are limited by their reliance on model-specific knowledge and their inability to significantly degrade semantic quality of generated images. In response to these shortcomings, we propose the Posterior Collapse Attack (PCA) based on the observation that VAEs suffer from posterior collapse during training. Our method minimizes dependence on the white-box information of target models to get rid of the implicit reliance on model-specific knowledge. By accessing merely a small amount of LDM parameters, in specific merely the VAE encoder of LDMs, our method causes a substantial semantic collapse in generation quality, particularly in perceptual consistency, and demonstrates strong transferability across various model architectures. Experimental results show that PCA achieves superior perturbation effects on image generation of LDMs with lower runtime and VRAM. Our method outperforms existing techniques, offering a more robust and generalizable solution that is helpful in alleviating the socio-technical challenges posed by the rapidly evolving landscape of generative AI.
PatternPaint: Generating Layout Patterns Using Generative AI and Inpainting Techniques
Zhou, Guanglei, Korrapati, Bhargav, Reddy, Gaurav Rajavendra, Hu, Jiang, Chen, Yiran, Thakurta, Dipto G.
Generation of VLSI layout patterns is essential for a wide range of Design For Manufacturability (DFM) studies. In this study, we investigate the potential of generative machine learning models for creating design rule legal metal layout patterns. Our results demonstrate that the proposed model can generate legal patterns in complex design rule settings and achieves a high diversity score. The designed system, with its flexible settings, supports both pattern generation with localized changes, and design rule violation correction. Our methodology is validated on Intel 18A Process Design Kit (PDK) and can produce a wide range of DRC-compliant pattern libraries with only 20 starter patterns.
A Perspective on Literary Metaphor in the Context of Generative AI
At the intersection of creative text generation and literary theory, this study explores the role of literary metaphor and its capacity to generate a range of meanings. In this regard, literary metaphor is vital to the development of any particular language. To investigate whether the inclusion of original figurative language improves textual quality, we trained an LSTM-based language model in Afrikaans. The network produces phrases containing compellingly novel figures of speech. Specifically, the emphasis falls on how AI might be utilised as a defamiliarisation technique, which disrupts expected uses of language to augment poetic expression. Providing a literary perspective on text generation, the paper raises thought-provoking questions on aesthetic value, interpretation and evaluation.
The Era of Foundation Models in Medical Imaging is Approaching : A Scoping Review of the Clinical Value of Large-Scale Generative AI Applications in Radiology
Seo, Inwoo, Bae, Eunkyoung, Jeon, Joo-Young, Yoon, Young-Sang, Cha, Jiho
Social problems stemming from the shortage of radiologists are intensifying, and artificial intelligence is being highlighted as a potential solution. Recently emerging large-scale generative AI has expanded from large language models (LLMs) to multi-modal models, showing potential to revolutionize the entire process of medical imaging. However, comprehensive reviews on their development status and future challenges are currently lacking. This scoping review systematically organizes existing literature on the clinical value of large-scale generative AI applications by following PCC guidelines. A systematic search was conducted across four databases: PubMed, EMbase, IEEE-Xplore, and Google Scholar, and 15 studies meeting the inclusion/exclusion criteria set by the researchers were reviewed. Most of these studies focused on improving the efficiency of report generation in specific parts of the interpretation process or on translating reports to aid patient understanding, with the latest studies extending to AI applications performing direct interpretations. All studies were quantitatively evaluated by clinicians, with most utilizing LLMs and only three employing multi-modal models. Both LLMs and multi-modal models showed excellent results in specific areas, but none yet outperformed radiologists in diagnostic performance. Most studies utilized GPT, with few using models specialized for the medical imaging domain. This study provides insights into the current state and limitations of large-scale generative AI-based applications in the medical imaging field, offering foundational data and suggesting that the era of medical imaging foundation models is on the horizon, which may fundamentally transform clinical practice in the near future.
Who Would Chatbots Vote For? Political Preferences of ChatGPT and Gemini in the 2024 European Union Elections
Haman, Michael, Školník, Milan
This study examines the political bias of chatbots powered by large language models, namely ChatGPT and Gemini, in the context of the 2024 European Parliament elections. The research focused on the evaluation of political parties represented in the European Parliament across 27 EU Member States by these generative artificial intelligence (AI) systems. The methodology involved daily data collection through standardized prompts on both platforms. The results revealed a stark contrast: while Gemini mostly refused to answer political questions, ChatGPT provided consistent ratings. The analysis showed a significant bias in ChatGPT in favor of left-wing and centrist parties, with the highest ratings for the Greens/European Free Alliance. In contrast, right-wing parties, particularly the Identity and Democracy group, received the lowest ratings. The study identified key factors influencing the ratings, including attitudes toward European integration and perceptions of democratic values. The findings highlight the need for a critical approach to information provided by generative AI systems in a political context and call for more transparency and regulation in this area.