Generative AI
Adobe adds generative AI editing to Photoshop
As generative AI has taken the tech world by storm, it was only a matter of time before Photoshop got in on the action. Adobe announced today that a new Generative Fill feature is coming to its ubiquitous photo-editing software later this year. The company promises "a magical new way to work" as the Firefly-powered feature lets you add, remove and extend visual content based on natural-language text prompts. "Generative Fill combines the speed and ease of generative AI with the power and precision of Photoshop, empowering customers to bring their visions to life at the speed of their imaginations," said Ashley Still, Adobe's senior VP of Digital Media. Adobe's Generative Fill is equivalent to DALL-E 2's inpainting (generating AI content within a section of an image) and outpainting (AI-generated content extending beyond the image's borders).
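The inpainting and outpainting distinction drawn above comes down to which pixels are marked for regeneration. The following is a minimal sketch of that mask geometry only; it is not how Firefly or DALL-E 2 is implemented, and the function names (`inpaint_mask`, `outpaint_mask`, `composite`) are hypothetical, chosen for illustration.

```python
def inpaint_mask(height, width, box):
    """Return a mask that is 1 inside box=(top, left, bottom, right), 0 elsewhere.

    Pixels marked 1 are the ones a generative model would be asked to fill.
    """
    top, left, bottom, right = box
    return [
        [1 if top <= r < bottom and left <= c < right else 0 for c in range(width)]
        for r in range(height)
    ]


def outpaint_mask(height, width, pad):
    """Return a mask for a canvas grown by `pad` pixels on every side.

    The original image area is 0 (kept); the new border is 1 (to be generated).
    """
    new_h, new_w = height + 2 * pad, width + 2 * pad
    return [
        [0 if pad <= r < pad + height and pad <= c < pad + width else 1
         for c in range(new_w)]
        for r in range(new_h)
    ]


def composite(original, generated, mask):
    """Take generated pixels where mask is 1 and original pixels where it is 0."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]
```

In this sketch, inpainting keeps the canvas size and regenerates an interior region, while outpainting enlarges the canvas and generates only the new border around the untouched original.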
Suddenly, everyone wants to talk about how to regulate AI
Last week, OpenAI CEO Sam Altman appeared before a US Senate committee to talk about the risks and potential of AI language models. Altman, along with many senators, called for international standards for artificial intelligence. He also urged the US to regulate the technology and set up a new agency, much like the Food and Drug Administration, to regulate AI. For an AI policy nerd like myself, the Senate hearing was both encouraging and frustrating. Encouraging because the conversation seems to have moved past promoting wishy-washy self-regulation and on to rules that could actually hold companies accountable.
Fake Pentagon explosion photo goes viral: How to spot an AI image
A fake image appearing to show a large explosion near the Pentagon was shared on social media on Monday, prompting a brief dip in the stock market. Within minutes, a wave of social media accounts, including some verified accounts, shared the fake picture, further amplifying the confusion. Officials later confirmed that no such incident had occurred. Confident that this picture claiming to show an "explosion near the pentagon" is AI generated. Check out the frontage of the building, and the way the fence melds into the crowd barriers.
Deep Generative Model for Simultaneous Range Error Mitigation and Environment Identification
Li, Yuxiao, Mazuelas, Santiago, Shen, Yuan
Received waveforms contain rich information about both range and environment semantics. However, their full potential is hard to exploit under multipath and non-line-of-sight conditions. This paper proposes a deep generative model (DGM) for simultaneous range error mitigation and environment identification. In particular, we present a Bayesian model for the generative process of the received waveform, composed of latent variables for both range-related features and environment semantics. Simultaneous range error mitigation and environment identification is interpreted as an inference problem based on the DGM and implemented in a unique end-to-end learning scheme. Comprehensive experiments on a general Ultra-wideband dataset demonstrate superior performance in range error mitigation, scalability to different environments, and a novel capability for simultaneous environment identification.
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Ahia, Orevaoghene, Kumar, Sachin, Gonen, Hila, Kasai, Jungo, Mortensen, David R., Smith, Noah A., Tsvetkov, Yulia
Language models have graduated from being research prototypes to commercialized products offered as web APIs, and recent works have highlighted the multilingual capabilities of these products. The API vendors charge their users based on usage, more specifically on the number of "tokens" processed or generated by the underlying language models. What constitutes a token, however, is training data and model dependent with a large variance in the number of tokens required to convey the same information in different languages. In this work, we analyze the effect of this non-uniformity on the fairness of an API's pricing policy across languages. We conduct a systematic analysis of the cost and utility of OpenAI's language model API on multilingual benchmarks in 22 typologically diverse languages. We show evidence that speakers of a large number of the supported languages are overcharged while obtaining poorer results. These speakers tend to also come from regions where the APIs are less affordable to begin with. Through these analyses, we aim to increase transparency around language model APIs' pricing policies and encourage the vendors to make them more equitable.
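To make the pricing argument concrete, here is a rough sketch of how per-token billing turns token-count differences into cost differences. It is not the paper's methodology. It uses UTF-8 byte counts as a stand-in for a byte-level tokenizer with no learned merges, so real tokenizers will give different (usually smaller) counts. The price constant and the sample sentences are assumptions made up for illustration.

```python
# Assumed price in USD per 1,000 tokens; illustrative only, not a real vendor rate.
PRICE_PER_1K_TOKENS = 0.002

# Rough, illustrative renderings of the same greeting; not taken from the paper.
SAMPLES = {
    "English": "Hello, how are you?",
    "Russian": "Привет, как дела?",
    "Hindi": "नमस्ते, आप कैसे हैं?",
}


def byte_tokens(text: str) -> int:
    """Count UTF-8 bytes: a proxy for a byte-level tokenizer with no merges."""
    return len(text.encode("utf-8"))


def cost_usd(text: str, price_per_1k: float = PRICE_PER_1K_TOKENS) -> float:
    """Cost of processing `text` when billing is per token."""
    return byte_tokens(text) / 1000 * price_per_1k


def premium(text: str, baseline: str) -> float:
    """How many times more tokens `text` needs than `baseline`."""
    return byte_tokens(text) / byte_tokens(baseline)


if __name__ == "__main__":
    base = SAMPLES["English"]
    for language, sentence in SAMPLES.items():
        print(language, byte_tokens(sentence), round(premium(sentence, base), 2))
```

Under this proxy, scripts that need two or three bytes per character yield more tokens for a comparable message, and a flat per-token price passes that difference straight through to the bill.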
AI went to Washington and here's what you need to know about this mind-blowing technology
OpenAI CEO Sam Altman said language and cultural inclusivity is "very important" to his company's mission as it builds and trains powerful artificial intelligence systems. On Tuesday, May 16, Mr. Altman went to Washington. And today, the world feels a little scarier. There's rarely a day when we don't hear some new report about the groundbreaking impact – and potential danger – of this technology. Large language models like ChatGPT have caught the world by surprise based on the speed of their learning and what they are now able to do.
Fears of AI hitting black market stir concerns of criminals evading government regulations: Expert
Dr. Harvey Castro said he's less concerned about AI being developed by big corporations because there are safeguards, but it can be created without safeguards and sold. Artificial intelligence – specifically large language models like ChatGPT – can theoretically give criminals information needed to cover their tracks before and after a crime, then erase that evidence, an expert warns. Large language models, or LLMs, make up a segment of AI technology that uses algorithms that can recognize, summarize, translate, predict and generate text and other content based on knowledge gained from massive datasets. ChatGPT is the best-known LLM, and its successful, rapid development has created unease among some experts and sparked a Senate hearing featuring Sam Altman, the CEO of ChatGPT maker OpenAI, who pushed for oversight. Corporations like Google and Microsoft are developing AI at a fast pace. But when it comes to crime, that's not what scares Dr. Harvey Castro, a board-certified emergency medicine physician and national speaker on artificial intelligence who created his own LLM called "Sherlock."
Generative AI: Implications and Applications for Education
Tzirides, Anastasia Olga, Saini, Akash, Zapata, Gabriela, Searsmith, Duane, Cope, Bill, Kalantzis, Mary, Castro, Vania, Kourkoulou, Theodora, Jones, John, da Silva, Rodrigo Abrantes, Whiting, Jen, Kastania, Nikoleta Polyxeni
The launch of ChatGPT in November 2022 precipitated a panic among some educators while prompting qualified enthusiasm from others. Under the umbrella term Generative AI, ChatGPT is an example of a range of technologies for the delivery of computer-generated text, image, and other digitized media. This paper examines the implications for education of one generative AI technology, chatbots responding from large language models, or C-LLM. It reports on an application of a C-LLM to AI review and assessment of complex student work. In a concluding discussion, the paper explores the intrinsic limits of generative AI, bound as it is to language corpora and their textual representation through binary notation. Within these limits, we suggest the range of emerging and potential applications of Generative AI in education.
Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models
Parrish, Alicia, Kirk, Hannah Rose, Quaye, Jessica, Rastogi, Charvi, Bartolo, Max, Inel, Oana, Ciro, Juan, Mosquera, Rafael, Howard, Addison, Cukierski, Will, Sculley, D., Reddi, Vijay Janapa, Aroyo, Lora
The generative AI revolution in recent years has been spurred by an expansion in compute power and data quantity, which together enable extensive pre-training of powerful text-to-image (T2I) models. With their greater capabilities to generate realistic and creative content, these T2I models like DALL-E, MidJourney, Imagen or Stable Diffusion are reaching ever wider audiences. Any unsafe behaviors inherited from pretraining on uncurated internet-scraped datasets thus have the potential to cause wide-reaching harm, for example, through generated images which are violent, sexually explicit, or contain biased and derogatory stereotypes. Despite this risk of harm, we lack systematic and structured evaluation datasets to scrutinize model behavior, especially adversarial attacks that bypass existing safety filters. A typical bottleneck in safety evaluation is achieving a wide coverage of different types of challenging examples in the evaluation set, i.e., identifying 'unknown unknowns' or long-tail problems. To address this need, we introduce the Adversarial Nibbler challenge. The goal of this challenge is to crowdsource a diverse set of failure modes and reward challenge participants for successfully finding safety vulnerabilities in current state-of-the-art T2I models. Ultimately, we aim to provide greater awareness of these issues and assist developers in improving the future safety and reliability of generative AI models. Adversarial Nibbler is a data-centric challenge, part of the DataPerf challenge suite, organized and supported by Kaggle and MLCommons.
Observations on LLMs for Telecom Domain: Capabilities and Limitations
The landscape for building conversational interfaces (chatbots) has witnessed a paradigm shift with recent developments in generative Artificial Intelligence (AI) based Large Language Models (LLMs), such as ChatGPT by OpenAI (GPT-3.5 and GPT-4), Google's Bard, and Large Language Model Meta AI (LLaMA). In this paper, we analyze the capabilities and limitations of incorporating such models in conversational interfaces for the telecommunication domain, specifically for enterprise wireless products and services. Using Cradlepoint's publicly available data for our experiments, we present a comparative analysis of the responses from such models for multiple use cases, including domain adaptation for terminology and product taxonomy, context continuity, and robustness to input perturbations and errors. We believe this evaluation will provide useful insights to data scientists engaged in building customized conversational interfaces for domain-specific requirements.