Generative AI
The Taylor Swift Deepfake Saga
For all the promise of the technology, one use-case for artificial intelligence reared its ugly head last week: non-consensual pornographic images. As millions of users saw abusive A.I. generated images of Taylor Swift proliferate across X, the pitfalls of this technology became clear.
Australian 'contemporary' portrait prize allows entries wholly generated by AI
A prestigious portrait competition has defended allowing entrants to submit artwork generated by artificial intelligence, arguing art is not stagnant and should reflect societal change. The Brisbane Portrait Prize, with a top prize worth $50,000, has been described as Queensland's answer to the Archibalds, with selected entries displayed at the Brisbane Powerhouse later in the year. In the terms and conditions of entry, the Brisbane Portrait Prize notes this year that it will accept entries "completed in whole or in part by generative artificial intelligence" so long as the artwork is original and "entirely completed and owned outright" by the entrant. A spokesperson for the prize told Guardian Australia that allowing AI entries acknowledged the definition of art was not stagnant and would always grow. "BPP prides itself on being a contemporary prize and we are always interested in what 'contemporary' portraiture is while fostering both the ongoing evolution of art and engaging in the surrounding conversation," they said.
How Can Generative AI Enhance the Well-being of Blind?
This paper examines the question of how generative AI can improve the well-being of blind or visually impaired people. It refers to a current example, the Be My Eyes app, in which the Be My AI feature was integrated in 2023, which is based on GPT-4 from OpenAI. The author's tests are described and evaluated. There is also an ethical and social discussion. The power of the tool, which can analyze still images in an amazing way, is demonstrated. Those affected gain a new independence and a new perception of their environment. At the same time, they are dependent on the world view and morality of the provider or developer, who prescribe or deny them certain descriptions. An outlook makes it clear that the analysis of moving images will mean a further leap forward. It is fair to say that generative AI can fundamentally improve the well-being of blind and visually impaired people and will change it in various ways.
Whispering in Norwegian: Navigating Orthographic and Dialectic Challenges
Kummervold, Per E, de la Rosa, Javier, Wetjen, Freddy, Braaten, Rolv-Arild, Solberg, Per Erik
This article introduces NB-Whisper, an adaptation of OpenAI's Whisper, specifically fine-tuned for Norwegian language Automatic Speech Recognition (ASR). We highlight its key contributions and summarise the results achieved in converting spoken Norwegian into written forms and translating other languages into Norwegian. We show that we are able to improve the Norwegian Bokmål transcription by OpenAI Whisper Large-v3 from a WER of 10.4 to 6.6 on the Fleurs Dataset and from 6.8 to 2.2 on the NST dataset.
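For readers unfamiliar with the metric, word error rate (WER) is commonly defined as the number of word-level substitutions, deletions and insertions needed to turn the system output into the reference transcript, divided by the number of reference words. The sketch below is illustrative only: the function names are hypothetical, the whitespace tokenization is an assumption, and the abstract does not say which text normalization (casing, punctuation) the authors applied before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are split on whitespace with no further normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a score dropped, relative to its starting value."""
    return (before - after) / before


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
    # Figures quoted in the abstract, assumed to be on the same scale.
    print(relative_reduction(10.4, 6.6))
    print(relative_reduction(6.8, 2.2))
```

Applied to the figures quoted above, and assuming both numbers in each pair are on the same scale, the relative reduction in WER works out to roughly 37% on Fleurs and roughly 68% on NST.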
Nomic Embed: Training a Reproducible Long Context Text Embedder
Nussbaum, Zach, Morris, John X., Duderstadt, Brandon, Mulyar, Andriy
This technical report describes the training of nomic-embed-text-v1, the first fully reproducible, open-source, open-weights, open-data, 8192 context length English text embedding model that outperforms both OpenAI Ada-002 and OpenAI text-embedding-3-small on short and long-context tasks. We release the training code and model weights under an Apache 2 license. In contrast with other open-source models, we release a training data loader with 235 million curated text pairs that allows for the full replication of nomic-embed-text-v1. You can find code and data to replicate the model at https://github.com/nomic-ai/contrastors
Cheating Suffix: Targeted Attack to Text-To-Image Diffusion Models with Multi-Modal Priors
Yang, Dingcheng, Bai, Yang, Jia, Xiaojun, Liu, Yang, Cao, Xiaochun, Yu, Wenjian
Diffusion models have been widely deployed in various image generation tasks, demonstrating an extraordinary connection between image and text modalities. However, they face challenges of being maliciously exploited to generate harmful or sensitive images by appending a specific suffix to the original prompt. Existing works mainly focus on using single-modal information to conduct attacks, which fails to utilize multi-modal features and results in less than satisfactory performance. Integrating multi-modal priors (MMP), i.e. both text and image features, we propose a targeted attack method named MMP-Attack in this work. Specifically, the goal of MMP-Attack is to add a target object into the image content while simultaneously removing the original object. The MMP-Attack shows a notable advantage over existing works with superior universality and transferability, which can effectively attack commercial text-to-image (T2I) models such as DALL-E 3. To the best of our knowledge, this marks the first successful attempt of transfer-based attack to commercial T2I models. Our code is publicly available at https://github.com/ydc123/MMP-Attack.
Positive AI: Key Challenges in Designing Artificial Intelligence for Wellbeing
van der Maden, Willem, Lomas, Derek, Sadek, Malak, Hekkert, Paul
The rapid advancement and adoption of generative AI (GenAI) technologies like ChatGPT signify the dawn of "The Age of AI" (Gates, 2023; Kissinger, Schmidt, & Huttenlocher, 2021). These developments mark a significant leap in the capabilities and adoption of AI systems. However, for many people, the swift and disorienting integration of AI into daily life raises many issues (Cugurullo & Acheampong, 2023; Fietta, Zecchinato, Stasi, Polato, & Monaro, 2022; Qasem, 2023). Concerns include the potential impacts on employment, privacy, and inequality, along with broader societal implications like human rights, mental health, and the preservation of democratic norms (Future of Life Institute, 2023; Prabhakaran, Mitchell, Gebru, & Gabriel, 2022; Shahriari & Shahriari, 2017; Stray, 2020). This article argues for the importance of wellbeing as a key objective in AI and for human-centered design (HCD) as a key methodology. Based on this framing, it shares a set of key challenges that will face designers of AI for wellbeing, or Positive AI. The idea that AI should support wellbeing is not uncommon. In 2018, Zuckerberg (2018) (CEO of Meta, previously Facebook) publicly stated that wellbeing should be the goal of AI. Further, in an interview Jan Leike (Wiblin, n.d.) (head of the 'Superalignment' research lab at OpenAI) said AI optimization should align to "flourishing."
ChatGPT is 'mildly' useful in making bioweapons: OpenAI study finds chatbot may increase accuracy and completeness of tasks for planning deadly attacks
Lawmakers and scientists have warned ChatGPT could help anyone develop deadly bioweapons that would wreak havoc on the world. While studies have suggested it is possible, new research from the chatbot's creator OpenAI claims GPT-4 - the latest version - provides at most a mild uplift in biological threat creation accuracy. OpenAI conducted a study of 100 human participants who were separated into groups - one used the AI to craft a bioattack and the other just the internet. The study found that 'GPT-4 may increase experts' ability to access information about biological threats, particularly for accuracy and completeness of tasks,' according to OpenAI's report. Results showed that the LLM group was able to obtain more information about bioweapons than the internet-only group for ideation and acquisition, but more information is needed to accurately identify any potential risks.
Amazon launches Rufus, an AI-powered shopping assistant
Amazon launched a new generative AI shopping assistant, Rufus, on Thursday. The chatbot is trained on Amazon's product catalog, customer reviews, community Q&As and "information from across the web." It's only available to a limited set of Amazon customers for now but will expand in the coming weeks. The company views the assistant as customers' one-stop shop for all their shopping needs. Rufus can answer questions like, "What to consider when buying running shoes?" and display comparisons for things such as, "What are the differences between trail and road running shoes?"
Google starts a limited test of generative AI tools in Maps
Google is adding generative AI to Maps. The feature's in early access and only available for certain areas and for select Local Guides members, but it looks to be an interesting use of the technology. Basically, the tool allows you to speak to the app using natural language to discover new places in your hometown or when traveling throughout this great country of ours. Ask the app what you're looking for, like a restaurant to meet the needs of your friend group with various dietary restrictions. The company's large-language models will analyze information about more than 250 million places along with insights provided by community members as part of its Local Guides program.