Generative AI
EnergyDiff: Universal Time-Series Energy Data Generation using Diffusion Models
Lin, Nan, Palensky, Peter, Vergara, Pedro P.
High-resolution time series data are crucial for operation and planning in energy systems such as electrical power systems and heating systems. However, due to data collection costs and privacy concerns, such data is often unavailable or insufficient for downstream tasks. Data synthesis is a potential solution for this data scarcity. With the recent development of generative AI, we propose EnergyDiff, a universal data generation framework for energy time series data. EnergyDiff builds on state-of-the-art denoising diffusion probabilistic models, utilizing a proposed denoising network dedicated to high-resolution time series data and introducing a novel Marginal Calibration technique. Our extensive experimental results demonstrate that EnergyDiff achieves significant improvement in capturing temporal dependencies and marginal distributions compared to baselines, particularly at the 1-minute resolution. Additionally, EnergyDiff consistently generates high-quality time series data across diverse energy domains, time resolutions, and at both customer and transformer levels with reduced computational need.
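As background for the "denoising diffusion probabilistic models" that EnergyDiff builds on, the following is a minimal sketch of the standard DDPM forward (noising) step applied to a short series. It is not EnergyDiff's denoising network or its Marginal Calibration step. The linear schedule endpoints, the function names, and the four-point load profile are illustrative assumptions.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, common in DDPM examples)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bar(betas):
    """Running product of (1 - beta_t), usually written alpha-bar_t."""
    out = []
    prod = 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_noise(x0, alpha_bar_t, noise):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]


betas = linear_beta_schedule(1000)
alpha_bar = cumulative_alpha_bar(betas)

rng = random.Random(0)
profile = [0.2, 0.3, 0.9, 0.4]  # hypothetical normalized load values
noise = [rng.gauss(0.0, 1.0) for _ in profile]
noisy_profile = forward_noise(profile, alpha_bar[499], noise)
```

A denoising network is trained to recover the noise from samples like `noisy_profile`. Generation then runs the process in reverse, starting from pure noise.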
PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers
Li, Songlin, Paschalidou, Despoina, Guibas, Leonidas
The increased demand for tools that automate the 3D content creation process has led to tremendous progress in deep generative models that can generate diverse 3D objects of high fidelity. In this paper, we present PASTA, an autoregressive transformer architecture for generating high-quality 3D shapes. PASTA comprises two main components: an autoregressive transformer that generates objects as a sequence of cuboidal primitives, and a blending network, implemented with a transformer decoder, that composes the sequences of cuboids and synthesizes high-quality meshes for each object. Our model is trained in two stages: first, we train our autoregressive generative model using only annotated cuboidal parts as supervision; next, we train our blending network using explicit 3D supervision in the form of watertight meshes. Evaluations on various ShapeNet objects showcase the ability of our model to perform shape generation from diverse inputs, e.g., from scratch, from a partial object, and from text and images, as well as size-guided generation by explicitly conditioning on a bounding box that defines the object's boundaries. Moreover, as our model considers the underlying part-based structure of a 3D object, we are able to select a specific part and produce shapes with meaningful variations of this part. As evidenced by our experiments, our model generates 3D shapes that are both more realistic and more diverse than existing part-based and non-part-based methods, while also being simpler to implement and train.
Japan news media association demands consent and accuracy from generative AI
Japan's news industry association issued a statement Wednesday demanding that providers of generative artificial intelligence services obtain permission from member media organizations to use their news content and ensure accuracy. The Japan Newspaper Publishers and Editors Association, whose members also include broadcasters, said in the statement that generative AI service providers have expanded their businesses in defiance of the association's repeated requests for them to gain permission. In retrieval-augmented generation (RAG) services, AI answers users' questions in writing by digging related information out of online sources. Sometimes generated answers are identical to original news stories, and sometimes they are inaccurate due to inappropriate diversion and processing of such original content, the association noted, adding that another problem is that AI does not correct wrong answers. Unless such "freeriding" of content is regulated, media organizations' content will die out, causing irreversible harm to the foundation of democracy and national culture, it warned, urging the government to promptly review intellectual property laws.
OpenAI Touts New AI Safety Research. Critics Say It's a Good Step, but Not Enough
OpenAI has faced opprobrium in recent months from those who suggest it may be rushing too quickly and recklessly to develop more powerful artificial intelligence. The company appears intent on showing it takes AI safety seriously. Today it showcased research that it says could help researchers scrutinize AI models even as they become more capable and useful. The new technique is one of several ideas related to AI safety that the company has touted in recent weeks. It involves having two AI models engage in a conversation that forces the more powerful one to be more transparent, or "legible," with its reasoning so that humans can understand what it's up to.
Microsoft Designer and its AI art will soon land on your PC
Microsoft Designer's AI art and editing capabilities are becoming more formally integrated into Windows and Microsoft's services today, as they move into Photos, Word, and PowerPoint. Designer is the next stage of Microsoft's evolution of AI art, which began in 2022 with Bing Image Creator, migrated to the more advanced Dall-E 2 model, then became part of Microsoft Designer, the wonderful AI-powered design tool that debuted in 2022. Designer's layout elements compete directly with Canva, but Microsoft isn't confining the Designer elements to just the app. The simplest integration is within Word and PowerPoint. You'll need a Copilot Pro subscription, but if you have one, you'll be able to use AI to generate a background for a PowerPoint slide or an integrated graphic inside a Word document.
Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions
Yoon, Jinsung, Sinha, Raj, Arik, Sercan O, Pfister, Tomas
Embeddings from Large Language Models (LLMs) have emerged as critical components in various applications, particularly for information retrieval. While high-dimensional embeddings generally demonstrate superior performance as they contain more salient information, their practical application is frequently hindered by elevated computational latency and the associated higher cost. To address these challenges, we propose Matryoshka-Adaptor, a novel tuning framework designed for the customization of LLM embeddings. Matryoshka-Adaptor facilitates substantial dimensionality reduction while maintaining comparable performance levels, thereby achieving a significant enhancement in computational efficiency and cost-effectiveness. Our framework directly modifies the embeddings from pre-trained LLMs and is designed to be seamlessly integrated with any LLM architecture, including those accessible exclusively through black-box APIs. It is also effective in both unsupervised and supervised learning settings. A rigorous evaluation conducted across a diverse corpus of English, multilingual, and multimodal datasets consistently reveals substantial gains with Matryoshka-Adaptor. Notably, with Google and OpenAI Embedding APIs, Matryoshka-Adaptor achieves a reduction in dimensionality ranging from two- to twelve-fold without compromising performance across multiple BEIR datasets.
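To make "smaller embedding dimensions" concrete, here is a minimal sketch of Matryoshka-style use of an embedding: keep only the first k coordinates, renormalize, and rank documents by cosine similarity. It does not implement the adaptor's tuning procedure. The function names and the toy 4-dimensional vectors are assumptions for illustration only.

```python
import math


def truncate_and_normalize(vec, k):
    """Keep the first k dimensions and rescale the result to unit length."""
    head = vec[:k]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def rank_documents(query, corpus, k):
    """Return document indices sorted by cosine similarity on the first k dims."""
    q = truncate_and_normalize(query, k)
    scored = []
    for index, doc in enumerate(corpus):
        d = truncate_and_normalize(doc, k)
        similarity = sum(a * b for a, b in zip(q, d))
        scored.append((similarity, index))
    scored.sort(reverse=True)  # highest similarity first
    return [index for _, index in scored]


# Hypothetical embeddings; real ones would come from an embedding model or API.
query = [1.0, 0.0, 0.0, 0.0]
corpus = [
    [0.9, 0.1, 0.0, 0.3],
    [0.1, 0.9, 0.2, 0.0],
    [0.5, 0.5, 0.1, 0.1],
]

full_ranking = rank_documents(query, corpus, k=4)
half_ranking = rank_documents(query, corpus, k=2)
```

In this toy case the ranking is the same at half the dimensionality. With real embeddings, truncation generally costs some retrieval quality, which is the gap a tuning method of this kind aims to close.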
From Principles to Practices: Lessons Learned from Applying Partnership on AI's (PAI) Synthetic Media Framework to 11 Use Cases
Leibowicz, Claire R., Cardona, Christian H.
2023 was the year the world woke up to generative AI, and 2024 is the year policymakers are responding more firmly. Importantly, this policy momentum is taking place alongside real-world creation and distribution of synthetic media. Social media platforms, news organizations, dating apps, image generation companies, and more are already navigating a world of AI-generated visuals and sounds, already changing hearts and minds, as policymakers try to catch up. How, then, can AI governance capture the complexity of the synthetic media landscape? How can it attend to synthetic media's myriad uses, ranging from storytelling to privacy preservation, to deception, fraud, and defamation, taking into account the many stakeholders involved in its development, creation, and distribution? And what might it mean to govern synthetic media in a manner that upholds the truth while bolstering freedom of expression? What follows is the first known collection of diverse examples of the implementation of synthetic media governance that responds to these questions, specifically through Partnership on AI's (PAI) Responsible Practices for Synthetic Media - a voluntary, normative Framework for creating, distributing, and building technology for synthetic media responsibly, launched in February 2023. In this paper, we present a case bank of real-world examples that help operationalize the Framework - highlighting areas where synthetic media governance can be applied, augmented, expanded, and refined for use in practice. Read together, the cases emphasize distinct elements of AI policymaking and seven emergent best practices supporting transparency, safety, expression, and digital dignity online: consent, disclosure, and differentiation between harmful and creative use cases.
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
Majumder, Navonil, Hung, Chia-Yu, Ghosal, Deepanway, Hsu, Wei-Ning, Mihalcea, Rada, Poria, Soujanya
Generative multimodal content is increasingly prevalent in much of the content creation arena, as it has the potential to allow artists and media personnel to create pre-production mockups by quickly bringing their ideas to life. The generation of audio from text prompts is an important aspect of such processes in the music and film industry. Many of the recent diffusion-based text-to-audio models focus on training increasingly sophisticated diffusion models on a large set of datasets of prompt-audio pairs. These models do not explicitly focus on the presence of concepts or events and their temporal ordering in the output audio with respect to the input prompt. Our hypothesis is that focusing on these aspects of audio generation could improve performance in the presence of limited data. As such, in this work, using an existing text-to-audio model, Tango, we synthetically create a preference dataset where each prompt has a winner audio output and some loser audio outputs for the diffusion model to learn from. The loser outputs, in theory, have some concepts from the prompt missing or in an incorrect order. We fine-tune the publicly available Tango text-to-audio model using diffusion-DPO (direct preference optimization) loss on our preference dataset and show that it leads to improved audio output over Tango and AudioLDM2, in terms of both automatic and manual evaluation metrics.
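The winner/loser setup can be illustrated with a small sketch of a preference loss in the general shape of the Diffusion-DPO objective. Here, denoising errors stand in for log-likelihoods, and the fine-tuned model is rewarded for beating a frozen reference model on the winner by more than it does on the loser. This is not Tango 2's training code. The scalar error inputs, the `beta` value, and the function names are assumptions, and the timestep weighting used in practice is folded into `beta`.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def diffusion_dpo_loss(err_w_model, err_w_ref, err_l_model, err_l_ref, beta=1.0):
    """Preference loss on denoising errors (lower error means better denoising).

    err_w_* : squared noise-prediction error on the winner audio
    err_l_* : squared noise-prediction error on the loser audio
    *_model : error of the model being fine-tuned
    *_ref   : error of the frozen reference model
    """
    winner_gap = err_w_model - err_w_ref  # negative if the model improved on the winner
    loser_gap = err_l_model - err_l_ref   # positive if the model got worse on the loser
    return -log_sigmoid(-beta * (winner_gap - loser_gap))


# Hypothetical error values for a single (prompt, winner, loser) triple.
neutral = diffusion_dpo_loss(1.0, 1.0, 1.0, 1.0)
preferred = diffusion_dpo_loss(0.5, 1.0, 1.5, 1.0)
dispreferred = diffusion_dpo_loss(1.5, 1.0, 0.5, 1.0)
```

When the model matches the reference, the loss equals log 2. It falls below that value when the model denoises the winner relatively better than the loser, and rises above it in the opposite case.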
In the Age of A.I., How Much Is Silicon Valley Prepared to Give Back?
For the last couple of years, the tech community has tested no-strings-attached payments of $500 or $1,000 a month to those in dire need. Some of these experiments have happened in the heart of Silicon Valley, where a one-bedroom apartment rents for $3,000 a month and a modest house is often an unaffordable luxury. Silicon Valley's backing of these efforts has propelled the idea of a guaranteed income -- also known as cash transfers, unconditional cash and, in its most utopian form, universal basic income -- into the mainstream. But a bipartisan political consensus around the movement is fracturing even though the data seems to show that the programs are effective. In recent months, the Texas attorney general went to court to prevent public funds from being used in a basic income program in Houston.
Hong Kong Testing ChatGPT-Style Tool After OpenAI Took Steps to Block Access
Hong Kong's government is testing the city's own ChatGPT-style tool for its employees, with plans to eventually make it available to the public, its innovation minister said after OpenAI took extra steps to block access from the city and other unsupported regions. Secretary for Innovation, Technology and Industry Sun Dong said on a Saturday radio show that his bureau was trying out the artificial intelligence program, whose Chinese name translates to "document assistance application for civil servants," to further improve its capabilities. He plans to have it available for the rest of the government this year. The program was developed by a generative AI research and development center led by the Hong Kong University of Science and Technology in collaboration with several other universities. Sun said the model would provide functions like graphics and video design in the future.