Generative AI
Supporting Human-AI Collaboration in Auditing LLMs with LLMs
Rastogi, Charvi, Ribeiro, Marco Tulio, King, Nicholas, Nori, Harsha, Amershi, Saleema
Large language models are becoming increasingly pervasive and ubiquitous in society via deployment in sociotechnical systems. Yet these language models, be it for classification or generation, have been shown to be biased and behave irresponsibly, causing harm to people at scale. It is crucial to audit these language models rigorously. Existing auditing tools leverage humans, AI, or both to find failures. In this work, we draw upon literature in human-AI collaboration and sensemaking, and conduct interviews with research experts in safe and fair AI, to build upon the auditing tool AdaTest (Ribeiro and Lundberg, 2022), which is powered by a generative large language model (LLM). Through the design process we highlight the importance of sensemaking and human-AI communication to leverage complementary strengths of humans and generative models in collaborative auditing. To evaluate the effectiveness of the augmented tool, AdaTest++, we conduct user studies with participants auditing two commercial language models: OpenAI's GPT-3 and Azure's sentiment analysis model. Qualitative analysis shows that AdaTest++ effectively leverages human strengths such as schematization, hypothesis formation and testing. Further, with our tool, participants identified a variety of failure modes, covering 26 different topics over 2 tasks, including both failures that have been shown before in formal audits and failures that were previously under-reported.
A 'silly' attack made ChatGPT reveal real phone numbers and email addresses
A team of researchers was able to make ChatGPT reveal some of the bits of data it has been trained on by using a simple prompt: asking the chatbot to repeat random words forever. The researchers, who work at Google DeepMind, the University of Washington, Cornell, Carnegie Mellon University, the University of California Berkeley, and ETH Zurich, urged AI companies to seek out internal and external testing before releasing large language models, the foundational tech that powers modern AI services like chatbots and image-generators. "It's wild to us that our attack works and should've, would've, could've been found earlier," they wrote, and published their findings in a paper on Tuesday that 404 Media first reported on. Chatbots like ChatGPT and prompt-based image generators like DALL-E are powered by large language models, deep learning algorithms that are trained on enormous amounts of data that critics say is often scraped off the public internet without consent. But until now, it wasn't clear what data OpenAI's chatbot was trained on since the large language models that power it are closed-source.
Talking to Chatbots Is Now a $200K Job. So I Applied.
My father was a prompt engineer, like his father before him. I come from a long line of people who toiled day and night, chatting with generative-AI chatbots. Prompt engineering is a totally new job that would have sounded crazy even a year ago. But it can pay six-figure salaries to people who extract the best results from the mysterious artificial-intelligence black boxes that are now part of daily life.
OpenAI's Custom Chatbots Are Leaking Their Secrets
You don't need to know how to code to create your own AI chatbot. Since the start of November--shortly before the chaos at the company unfolded--OpenAI has let anyone build and publish their own custom versions of ChatGPT, known as "GPTs". Thousands have been created: A "nomad" GPT gives advice about working and living remotely, another claims to search 200 million academic papers to answer your questions, and yet another will turn you into a Pixar character. However, these custom GPTs can also be forced into leaking their secrets. Security researchers and technologists probing the custom chatbots have made them spill the initial instructions they were given when they were created, and have also discovered and downloaded the files used to customize the chatbots.
Pursuing rivals, Amazon announces corporate AI chatbot
Amazon.com is rolling out a workplace chatbot called Amazon Q, designed to help corporate customers search for information, write code and review business metrics. Amazon Web Services, the retailer's cloud-computing division, is infusing generative artificial intelligence into more products, expanding its efforts to reclaim ground in a field led by its main rivals. Microsoft and Alphabet's Google have announced similar moves. Existing chatbots powered by generative AI are "genuinely super useful for consumers," AWS CEO Adam Selipsky said Tuesday at re:Invent, the company's conference in Las Vegas.
Wireless Network Digital Twin for 6G: Generative AI as A Key Enabler
Tao, Zhenyu, Xu, Wei, Huang, Yongming, Wang, Xiaoyun, You, Xiaohu
Digital twin, which enables emulation, evaluation, and optimization of physical entities through synchronized digital replicas, has gained increasing attention as a promising technology for intricate wireless networks. For 6G, numerous innovative wireless technologies and network architectures have posed new challenges in establishing wireless network digital twins. To tackle these challenges, artificial intelligence (AI), particularly the flourishing generative AI, emerges as a potential solution. In this article, we discuss emerging prerequisites for wireless network digital twins considering the complicated network architecture, tremendous network scale, extensive coverage, and diversified application scenarios in the 6G era. We further explore the applications of generative AI, such as transformer and diffusion models, to empower the 6G digital twin from multiple perspectives including implementation, physical-digital synchronization, and slicing capability. Subsequently, we propose a hierarchical generative AI-enabled wireless network digital twin at both the message level and the policy level, and provide a typical use case with numerical results to validate its effectiveness and efficiency. Finally, open research issues for wireless network digital twins in the 6G era are discussed.
How Generative-AI can be Effectively used in Government Chatbots
With the rapid development of artificial intelligence and breakthroughs in machine learning and natural language processing, intelligent question-answering robots have become widely used in government affairs. This paper conducts a horizontal comparison between Guangdong Province's government chatbots and two large language models, ChatGPT and Wenxin Ernie, to analyze the strengths and weaknesses of existing government chatbots and AIGC technology. The study finds significant differences between government chatbots and large language models. China's government chatbots are still in an exploratory stage and have a gap to close to achieve "intelligence." To explore the future direction of government chatbots more deeply, this research proposes targeted optimization paths to help generative AI be effectively applied in government chatbot conversations.
LLVMs4Protest: Harnessing the Power of Large Language and Vision Models for Deciphering Protests in the News
Large language and vision models have transformed how social movements scholars identify protest and extract key protest attributes from multi-modal data such as texts, images, and videos. This article documents how we fine-tuned two large pretrained transformer models, longformer and swin-transformer v2, to infer potential protests in news articles using textual and imagery data. First, the longformer model was fine-tuned using the Dynamics of Collective Action (DoCA) Corpus. We matched the New York Times articles with the DoCA database to obtain a training dataset for downstream tasks. Second, the swin-transformer v2 model was trained on UCLA-protest imagery data. The UCLA-protest project contains labeled imagery data with information such as protest, violence, and sign. Both fine-tuned models will be available via https://github.com/Joshzyj/llvms4protest. We release this short technical report for social movement scholars who are interested in using LLVMs to infer protests in textual and imagery data.
Algorithmic Persuasion Through Simulation: Information Design in the Age of Generative AI
Harris, Keegan, Immorlica, Nicole, Lucier, Brendan, Slivkins, Aleksandrs
How can an informed sender persuade a receiver, having only limited information about the receiver's beliefs? Motivated by research showing generative AI can simulate economic agents, we initiate the study of information design with an oracle. We assume the sender can learn more about the receiver by querying this oracle, e.g., by simulating the receiver's behavior. Aside from AI motivations such as general-purpose Large Language Models (LLMs) and problem-specific machine learning models, alternate motivations include customer surveys and querying a small pool of live users. Specifically, we study Bayesian Persuasion where the sender has a second-order prior over the receiver's beliefs. After a fixed number of queries to an oracle to refine this prior, the sender commits to an information structure. Upon receiving the message, the receiver takes a payoff-relevant action maximizing her expected utility given her posterior beliefs. We design polynomial-time querying algorithms that optimize the sender's expected utility in this Bayesian Persuasion game. As a technical contribution, we show that queries form partitions of the space of receiver beliefs that can be used to quantify the sender's knowledge.
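To make the receiver's step in the abstract above concrete, here is a minimal Python sketch of a Bayesian update followed by an expected-utility-maximizing action. It is an illustration under stated assumptions, not the paper's algorithm: the two-state, two-action setup, the numbers, and the names `posterior`, `best_action`, `SIGNAL`, and `UTILITY` are all hypothetical. It only shows why the sender cares about the receiver's beliefs: the same message can lead to different actions under different receiver priors.

```python
from typing import List

def posterior(prior: List[float], signal: List[List[float]], message: int) -> List[float]:
    """Bayes update, where signal[state][message] = P(message | state)."""
    joint = [p * signal[s][message] for s, p in enumerate(prior)]
    total = sum(joint)
    if total == 0:
        raise ValueError("message has zero probability under this prior")
    return [j / total for j in joint]

def best_action(belief: List[float], utility: List[List[float]]) -> int:
    """Return the action index maximizing expected utility; utility[action][state]."""
    expected = [sum(b * u for b, u in zip(belief, row)) for row in utility]
    return max(range(len(expected)), key=expected.__getitem__)

# Hypothetical example: states are (0 = bad, 1 = good), messages are
# (0 = "stay silent", 1 = "recommend"), actions are (0 = reject, 1 = accept).
SIGNAL = [
    [0.5, 0.5],  # bad state: recommend half of the time
    [0.0, 1.0],  # good state: always recommend
]
UTILITY = [
    [0.0, 0.0],   # reject: payoff 0 in either state
    [-1.0, 1.0],  # accept: -1 if bad, +1 if good
]

# Two candidate receiver priors the sender might be uncertain between.
optimistic = [0.5, 0.5]
skeptical = [0.7, 0.3]

for name, prior in [("optimistic", optimistic), ("skeptical", skeptical)]:
    belief = posterior(prior, SIGNAL, message=1)
    print(name, [round(b, 3) for b in belief], best_action(belief, UTILITY))
```

Under these assumed numbers, the optimistic receiver accepts after seeing "recommend" while the skeptical receiver still rejects, so a sender who cannot tell the two apart has a reason to query an oracle before committing to an information structure.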
ROSO: Improving Robotic Policy Inference via Synthetic Observations
Miyashita, Yusuke, Gahtidis, Dimitris, La, Colin, Rabinowicz, Jeremy, Leitner, Jurgen
In this paper, we propose the use of generative artificial intelligence (AI) to improve zero-shot performance of a pre-trained policy by altering observations during inference. Modern robotic systems, powered by advanced neural networks, have demonstrated remarkable capabilities on pre-trained tasks. However, generalizing and adapting to new objects and environments is challenging, and fine-tuning visuomotor policies is time-consuming. To overcome these issues we propose Robotic Policy Inference via Synthetic Observations (ROSO). ROSO uses stable diffusion to pre-process a robot's observation of novel objects at inference time so that it fits within the distribution of observations seen by the pre-trained policy. This novel paradigm allows us to transfer learned knowledge from known tasks to previously unseen scenarios, enhancing the robot's adaptability without requiring lengthy fine-tuning. Our experiments show that incorporating generative AI into robotic inference significantly improves successful outcomes, completing up to 57% of tasks that were otherwise unsuccessful with the pre-trained policy.