Generative AI
A Survey of AI Reliance
Eckhardt, Sven, Kühl, Niklas, Dolata, Mateusz, Schwabe, Gerhard
Artificial intelligence (AI) systems have become an indispensable component of modern technology. However, research on human behavioral responses, in particular human reliance on AI advice (AI reliance), is lagging behind. Current shortcomings in the literature include the unclear influences on AI reliance, lack of external validity, conflicting approaches to measuring reliance, and disregard for changes in reliance over time. Promising avenues for future research include reliance on generative AI output and reliance in multi-user situations. In conclusion, we present a morphological box that serves as a guide for research on AI reliance.
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
Sehwag, Vikash, Kong, Xianghao, Li, Jingtao, Spranger, Michael, Lyu, Lingjuan
As scaling laws in generative AI push performance, they also simultaneously concentrate the development of these models among actors with large computational resources. With a focus on text-to-image (T2I) generative models, we aim to address this bottleneck by demonstrating very low-cost training of large-scale T2I diffusion transformer models. As the computational cost of transformers increases with the number of patches in each image, we propose to randomly mask up to 75% of the image patches during training. We propose a deferred masking strategy that preprocesses all patches using a patch-mixer before masking, thus significantly reducing the performance degradation with masking, making it superior to model downscaling in reducing computational cost. We also incorporate the latest improvements in transformer architecture, such as the use of mixture-of-experts layers, to improve performance and further identify the critical benefit of using synthetic images in micro-budget training. Finally, using only 37M publicly available real and synthetic images, we train a 1.16 billion parameter sparse transformer at an economical cost of only $1,890 and achieve a 12.7 FID in zero-shot generation on the COCO dataset. Notably, our model achieves competitive FID and high-quality generations while incurring 118× lower cost than stable diffusion models and 14× lower cost than the current state-of-the-art approach that costs $28,400. We aim to release our end-to-end training pipeline to further democratize the training of large-scale diffusion models on micro-budgets.
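The masking idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the list-based patch representation, and the identity `patch_mixer` stand-in are all assumptions made for illustration. It only shows the ordering the abstract describes, namely mixing all patches first and dropping a random 75% afterwards.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Return the sorted indices of patches that survive random masking."""
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # Keep at least one patch so the downstream model always has input.
    num_keep = max(1, round(num_patches * (1.0 - mask_ratio)))
    return sorted(rng.sample(range(num_patches), num_keep))


def deferred_masking(patches, patch_mixer, mask_ratio=0.75, seed=None):
    """Apply the (hypothetical) patch mixer to ALL patches, then mask.

    Masking after mixing means each surviving patch has already seen
    information from the patches that are about to be dropped.
    """
    mixed = patch_mixer(patches)
    keep = random_patch_mask(len(mixed), mask_ratio, seed)
    return [mixed[i] for i in keep], keep


# Toy example: a 16x16 grid gives 256 patches; the identity function
# stands in for a real patch mixer.
patches = list(range(256))
kept, kept_indices = deferred_masking(patches, patch_mixer=lambda p: p, seed=0)
print(len(kept))  # 64 patches remain at a 75% mask ratio
```

Because only the kept patches would be passed to the main transformer, the sequence length in this toy case drops from 256 to 64, which is where the compute saving described in the abstract would come from.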
Universal Spectral Transfer with Physical Prior-Informed Deep Generative Learning
Spectroscopy is a powerful analytical technique for characterizing matter across physical and biological realms [1-5]. However, its fundamental principle necessitates specialized instrumentation for each physical phenomenon probed, limiting broad adoption and use in all relevant research. In this study, we introduce SpectroGen, a novel physical prior-informed deep generative model for generating relevant spectral signatures across modalities using experimentally collected spectral input only from a single modality. We achieve this by reimagining the representation of spectral data as mathematical constructs of distributions instead of their traditional physical and molecular state representations. The results from 319 standard mineral samples tested demonstrate generation with 99% correlation and 0.01 root mean square error, at higher resolution than experimentally acquired ground truth spectra. We showed transferring capability across Raman, Infrared, and X-ray Diffraction modalities with Gaussian, Lorentzian, and Voigt distribution priors, respectively [6-10]. This approach, however, is globally generalizable for any spectral input that can be represented by a distribution prior, making it universally applicable. We believe our work revolutionizes the application sphere of spectroscopy, which has traditionally been limited by access to the required sophisticated and often expensive equipment, towards accelerating material, pharmaceutical, and biological discoveries.
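The three distribution priors named in the abstract above have standard closed forms for a single peak, sketched below. This is not SpectroGen's code: the function names and parameters are illustrative assumptions, and the Voigt profile (formally a convolution of a Gaussian and a Lorentzian) is replaced here by a simplified linear mix in the spirit of the pseudo-Voigt approximation.

```python
import math


def gaussian(x, center, sigma):
    """Area-normalized Gaussian peak with standard deviation sigma."""
    z = (x - center) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def lorentzian(x, center, gamma):
    """Area-normalized Lorentzian peak with half width at half maximum gamma."""
    return gamma / (math.pi * ((x - center) ** 2 + gamma ** 2))


def pseudo_voigt(x, center, sigma, gamma, eta):
    """Simplified stand-in for a Voigt peak: a linear mix of the two shapes.

    eta = 0 gives a pure Gaussian, eta = 1 a pure Lorentzian. A true Voigt
    profile would require a convolution instead of this weighted sum.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must be in [0, 1]")
    return eta * lorentzian(x, center, gamma) + (1.0 - eta) * gaussian(x, center, sigma)


# Evaluate each shape at the peak position (hypothetical peak at x = 1000).
peak = 1000.0
print(gaussian(peak, peak, sigma=5.0))
print(lorentzian(peak, peak, gamma=5.0))
print(pseudo_voigt(peak, peak, sigma=5.0, gamma=5.0, eta=0.5))
```

The Lorentzian has much heavier tails than the Gaussian, so the choice of prior changes how much intensity a model assigns far from the peak center.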
No Size Fits All: The Perils and Pitfalls of Leveraging LLMs Vary with Company Size
Urlana, Ashok, Kumar, Charaka Vinayak, Garlapati, Bala Mallikarjunarao, Singh, Ajeet Kumar, Mishra, Rahul
Large language models (LLMs) are playing a pivotal role in deploying strategic use cases across a range of organizations, from large pan-continental companies to emerging startups. The issues and challenges involved in the successful utilization of LLMs can vary significantly depending on the size of the organization. It is important to study and discuss these pertinent issues of LLM adaptation with a focus on the scale of the industrial concerns and brainstorm possible solutions and prospective directions. Such a study has not been prominently featured in the current research literature. In this study, we adopt a threefold strategy: first, we conduct a case study with industry practitioners to formulate the key research questions; second, we examine existing industrial publications to address these questions; and finally, we provide a practical guide for industries to utilize LLMs more efficiently.
Arondight: Red Teaming Large Vision Language Models with Auto-generated Multi-modal Jailbreak Prompts
Liu, Yi, Cai, Chengjun, Zhang, Xiaoli, Yuan, Xingliang, Wang, Cong
Large Vision Language Models (VLMs) extend and enhance the perceptual abilities of Large Language Models (LLMs). Despite offering new possibilities for LLM applications, these advancements raise significant security and ethical concerns, particularly regarding the generation of harmful content. While LLMs have undergone extensive security evaluations with the aid of red teaming frameworks, VLMs currently lack a well-developed one. To fill this gap, we introduce Arondight, a standardized red team framework tailored specifically for VLMs. Arondight is dedicated to resolving issues related to the absence of visual modality and inadequate diversity encountered when transitioning existing red teaming methodologies from LLMs to VLMs. Our framework features an automated multi-modal jailbreak attack, wherein visual jailbreak prompts are produced by a red team VLM, and textual prompts are generated by a red team LLM guided by a reinforcement learning agent. To enhance the comprehensiveness of VLM security evaluation, we integrate entropy bonuses and novelty reward metrics. These elements incentivize the RL agent to guide the red team LLM in creating a wider array of diverse and previously unseen test cases. Our evaluation of ten cutting-edge VLMs exposes significant security vulnerabilities, particularly in generating toxic images and aligning multi-modal prompts. In particular, our Arondight achieves an average attack success rate of 84.5% on GPT-4 in all fourteen prohibited scenarios defined by OpenAI in terms of generating toxic text. For a clearer comparison, we also categorize existing VLMs based on their safety levels and provide corresponding reinforcement recommendations. Our multimodal prompt dataset and red team code will be released after ethics committee approval. CONTENT WARNING: THIS PAPER CONTAINS HARMFUL MODEL RESPONSES.
Scarlett Johansson refused OpenAI job because 'it would be strange' for her kids, 'against my core values'
Scarlett Johansson is speaking out about the reasons she turned down the job of voicing OpenAI's chatbot. Last year, OpenAI CEO Sam Altman reached out to the 39-year-old actress about potentially hiring her to voice the ChatGPT 4.0 system. In an interview with The New York Times, Johansson, who voiced the character of Samantha, an artificial intelligence virtual assistant in the 2013 film "Her," recalled that she said, "No, thank you. Not for me," when Altman approached her about the gig. "I felt I did not want to be at the forefront of that," Johansson told the Times.
CVE-LLM : Automatic vulnerability evaluation in medical device industry using large language models
Ghosh, Rikhiya, Farri, Oladimeji, von Stockhausen, Hans-Martin, Schmitt, Martin, Vasile, George Marica
The healthcare industry is currently experiencing an unprecedented wave of cybersecurity attacks, impacting millions of individuals. With the discovery of thousands of vulnerabilities each month, there is a pressing need to drive the automation of vulnerability assessment processes for medical devices, facilitating rapid mitigation efforts. Generative AI systems have revolutionized various industries, offering unparalleled opportunities for automation and increased efficiency. This paper presents a solution leveraging Large Language Models (LLMs) to learn from historical evaluations of vulnerabilities for the automatic assessment of vulnerabilities in the medical devices industry. This approach is applied within the portfolio of a single manufacturer, taking into account device characteristics, including existing security posture and controls. The primary contributions of this paper are threefold. Firstly, it provides a detailed examination of the best practices for training a vulnerability Language Model (LM) in an industrial context. Secondly, it presents a comprehensive comparison and insightful analysis of the effectiveness of Language Models in vulnerability assessment. Finally, it proposes a new human-in-the-loop framework to expedite vulnerability evaluation processes.
OpenAI's new, lightweight GPT-4o mini model promises an improved ChatGPT experience
OpenAI on Thursday released a smaller and more affordable version of its flagship large language model that powers ChatGPT. The new model, called GPT-4o mini, will reportedly cost developers 60 percent less to build AI-powered apps and services with as compared to GPT-3.5 Turbo, OpenAI's smallest model until today. But the big news here is for consumers. GPT-4o mini will replace GPT-3.5 Turbo for free users of ChatGPT starting today -- which means that your baseline ChatGPT experience will improve significantly. OpenAI claimed that GPT-4o mini achieved an 82 percent score on an industry benchmark called the MMLU, which stands for Measuring Massive Multitask Language Understanding, and includes 16,000 multiple-choice questions across 57 academic subjects.
OpenAI Slashes the Cost of Using Its AI With a "Mini" Model
OpenAI today announced a cut-price "mini" model that it says will allow more companies and programs to tap into its artificial intelligence. The new model, called GPT-4o mini and available starting today, is 60 percent cheaper than OpenAI's most inexpensive existing model while offering higher performance, the company says. OpenAI characterizes the move as part of an effort to make AI "as broadly accessible as possible," but it also reflects growing competition among AI cloud providers as well as rising interest in small and free open source AI models. Meta is expected to debut the largest version of its very capable free offering, Llama 3, next week. "The whole point of OpenAI is to build and distribute AI safely and make it broadly accessible," Olivier Godement, a product manager at OpenAI responsible for the new model, tells WIRED.
Generative AI Augmented Induction-based Formal Verification
Kumar, Aman, Gadde, Deepak Narayan
Generative Artificial Intelligence (GenAI) has demonstrated capabilities that significantly reduce human effort. It utilizes deep learning techniques to create original and realistic content in terms of text, images, code, music, and video. Researchers have also shown that the modern Large Language Models (LLMs) underlying GenAI can aid hardware development. Formal verification is a mathematically based proof method used to exhaustively verify the correctness of a design. In this paper, we demonstrate how GenAI can be used in induction-based formal verification to increase the verification throughput.
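For readers unfamiliar with induction-based verification, the general technique can be sketched on a toy finite-state design. This is not the paper's method or tooling: the counter, the function names, and the explicit state enumeration are assumptions chosen for illustration. The sketch checks a base case (the property holds in every initial state) and an inductive step (from any state satisfying the property, reachable or not, every successor also satisfies it).

```python
def check_by_induction(states, initial, successors, prop):
    """Simple (1-step) induction over an explicitly enumerated state space.

    Returns (base_ok, step_ok). The property is proven only if both hold.
    """
    base_ok = all(prop(s) for s in initial)
    step_ok = all(
        prop(t)
        for s in states
        if prop(s)            # assume the property in the current state
        for t in successors(s)  # and require it in every next state
    )
    return base_ok, step_ok


# Toy design: a 3-bit register that counts 0..5 and wraps back to 0.
# States 6 and 7 are unreachable from reset but still exist in hardware.
states = range(8)
initial = [0]


def successors(s):
    return [0] if s == 5 else [(s + 1) % 8]


# "Never reach 7" is true of the design, yet not inductive: the
# unreachable state 6 satisfies it and steps to 7.
print(check_by_induction(states, initial, successors, lambda s: s != 7))
# (True, False)

# A stronger invariant, "counter stays at most 5", is inductive and
# implies the original property.
print(check_by_induction(states, initial, successors, lambda s: s <= 5))
# (True, True)
```

The first check fails only because of an unreachable state, which shows why induction proofs often need a strengthened invariant before they go through.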