 Generative AI


Silicon Valley's bet on the device that comes after the smartphone

The Japan Times

Inside a former horse stable in the San Francisco neighborhood of SoMa, a wave of gentle chirps emerged from small, blinking devices pinned to the chests of employees at a startup called Humane. It was just weeks before the Ai Pin would be revealed to the world -- a culmination of five years, $240 million in funding, 25 patents, a steady drumbeat of hype and partnerships with top tech companies, including OpenAI, Microsoft and Salesforce.


Diff-GO: Diffusion Goal-Oriented Communications to Achieve Ultra-High Spectrum Efficiency

arXiv.org Artificial Intelligence

The latest advances in artificial intelligence (AI) present many unprecedented opportunities to achieve much improved bandwidth saving in communications. Unlike conventional communication systems focusing on packet transport, rich datasets and AI make it possible to efficiently transfer only the information most critical to the goals of message recipients. One of the most exciting advances in generative AI, known as the diffusion model, presents a unique opportunity for designing ultra-fast communication systems well beyond language-based messages. This work presents an ultra-efficient communication design by utilizing generative AI based on diffusion models as a specific example of the general goal-oriented communication framework. To better control the regenerated message at the receiver output, our diffusion system design includes a local regeneration module with finite dimensional noise latent. The critical significance of noise latent control and sharing in our Diff-GO is the ability to introduce the concept of "local generative feedback" (Local-GF), which enables the transmitter to monitor and gauge the quality or accuracy of the message recovery at the semantic system receiver. To this end, we propose a new low-dimensional noise space for the training of diffusion models, which significantly reduces the communication overhead and achieves satisfactory message recovery performance. Our experimental results demonstrate that the proposed noise space and the diffusion-based generative model achieve ultra-high spectrum efficiency and accurate recovery of transmitted image signals. By trading off computation for bandwidth efficiency (C4BE), this new framework provides an important avenue to achieve exceptional computation-bandwidth tradeoff.


ShipGen: A Diffusion Model for Parametric Ship Hull Generation with Multiple Objectives and Constraints

arXiv.org Artificial Intelligence

Ship design is a years-long process that requires balancing complex design trade-offs to create a ship that is efficient and effective. Finding new ways to improve the ship design process can lead to significant cost savings for ship building and operation. One promising technology is generative artificial intelligence, which has been shown to reduce design cycle time and create novel, high-performing designs. In the literature, generative artificial intelligence has been shown to generate ship hulls; however, ship design is particularly difficult as the hull of a ship requires the consideration of many objectives. This paper presents a study on the generation of parametric ship hull designs using a parametric diffusion model that considers multiple objectives and constraints for the hulls. This denoising diffusion probabilistic model (DDPM) generates the tabular parametric design vectors of a ship hull for evaluation. In addition to a tabular DDPM, this paper details adding guidance to improve the quality of generated ship hull designs. By leveraging classifier guidance, the DDPM produced feasible parametric ship hulls that maintain the coverage of the initial training dataset of ship hulls with a 99.5% rate, a 149x improvement over random sampling of the design vector parameters across the design space. Parametric ship hulls produced with performance guidance saw an average of 91.4% reduction in wave drag coefficients and an average of a 47.9x relative increase in the total displaced volume of the hulls compared to the mean performance of the hulls in the training dataset. The use of a DDPM to generate parametric ship hulls can reduce design time by generating high-performing hull designs for future analysis. These generated hulls have low drag and high volume, which can reduce the cost of operating a ship and increase its potential to generate revenue.
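
For readers unfamiliar with classifier guidance, the sketch below shows the general idea in its commonly used form: the mean of each reverse diffusion step is shifted along the gradient of a classifier's log-probability, scaled by the step variance. The abstract does not give ShipGen's exact formulation, so the function name `guided_mean`, the toy classifier, and all numbers here are illustrative assumptions rather than the authors' implementation.

```python
from typing import Callable, List

Vector = List[float]


def guided_mean(
    mean: Vector,
    variance: Vector,
    grad_log_prob: Callable[[Vector], Vector],
    x_t: Vector,
    scale: float = 1.0,
) -> Vector:
    """Shift the reverse-step mean along the classifier gradient.

    mean, variance: mean and diagonal variance predicted for x_{t-1}
    grad_log_prob:  gradient of log p(y | x_t) w.r.t. x_t (hypothetical classifier)
    scale:          guidance strength; 0 recovers the unguided step
    """
    grad = grad_log_prob(x_t)
    return [m + scale * v * g for m, v, g in zip(mean, variance, grad)]


# Toy stand-in for a feasibility classifier (an assumption for illustration):
# log p(feasible | x) = -0.5 * ||x - target||^2, so the gradient is (target - x).
target = [1.0, 2.0]


def toy_grad(x: Vector) -> Vector:
    return [t - xi for t, xi in zip(target, x)]


x_t = [0.0, 0.0]
unguided = [0.0, 0.0]
shifted = guided_mean(unguided, [0.5, 0.5], toy_grad, x_t, scale=2.0)
print(shifted)
```

With a larger `scale`, samples are pulled harder toward regions the classifier rates as feasible, usually at some cost in diversity.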


Taming Diffusion Models for Music-driven Conducting Motion Generation

arXiv.org Artificial Intelligence

Generating the motion of orchestral conductors from a given piece of symphony music is a challenging task since it requires a model to learn semantic music features and capture the underlying distribution of real conducting motion. Prior works have applied Generative Adversarial Networks (GAN) to this task, but the promising diffusion model, which recently showed its advantages in terms of both training stability and output quality, has not been exploited in this context. This paper presents Diffusion-Conductor, a novel DDIM-based approach for music-driven conducting motion generation, which integrates the diffusion model into a two-stage learning framework. We further propose a random masking strategy to improve the feature robustness, and use a pair of geometric loss functions to impose additional regularizations and increase motion diversity. We also design several novel metrics, including Fréchet Gesture Distance (FGD) and Beat Consistency Score (BC), for a more comprehensive evaluation of the generated motion. Experimental results demonstrate the advantages of our model.


A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity

arXiv.org Artificial Intelligence

The strong performance (Chowdhery et al., 2022; Nostalgebraist, 2022; OpenAI, 2023; Google, 2023), and emergent abilities (Wei et al., 2022) of modern language models (LMs) depend on self-supervised pretraining on massive text datasets. All model developers implicitly or explicitly decide the composition of these datasets: what data sources to include, whether to filter for attributes such as quality and toxicity, and when to gather new documents. While many of the most prominent models do not document their curation procedures (OpenAI, 2023; Google, 2023), or only document which procedures they used (Brown et al., 2020; Nostalgebraist, 2022; Scao et al., 2022; Touvron et al., 2023), they rarely document why they chose those protocols or what effect they had. This documentation debt leaves practitioners to be guided by intuitions and precedents, neither thoroughly evaluated (Bandy and Vincent, 2021; Sambasivan et al., 2021). Given the outsized and fundamental role of pretraining data in modern LMs, we believe this neglectful practice has detracted from responsible data use and hampered effective model development (Rogers, 2021; Gebru et al., 2021; Bender and Friedman, 2018). Among the small number of general-purpose LMs dominating community use and discussion, the prevailing focus has been on the scale of pretraining data and number of optimization steps (Brown et al., 2020; Nostalgebraist, 2022; Google, 2023). In this work, we systematically test how common data design decisions affect model performance--specifically: the time of collection, content filtering strategy (toxicity/quality), and domain composition. We study the impacts in two ways.


The Impact of Generative Artificial Intelligence

arXiv.org Artificial Intelligence

The rise of generative artificial intelligence (AI) has sparked concerns about its potential influence on unemployment and market depression. This study addresses this concern by examining the impact of generative AI on product markets. To overcome the challenge of causal inference, given the inherent limitations of conducting controlled experiments, this paper identifies an unanticipated and sudden leak of a highly proficient image-generative AI as a novel instance of a "natural experiment". This AI leak spread rapidly, significantly reducing the cost of generating anime-style images compared to other styles, creating an opportunity for comparative assessment. We collect real-world data from an artwork outsourcing platform. Surprisingly, our results show that while generative AI lowers average prices, it substantially boosts order volume and overall revenue. This counterintuitive finding suggests that generative AI confers benefits upon artists rather than detriments. The study further offers theoretical economic explanations to elucidate this unexpected phenomenon. By furnishing empirical evidence, this paper dispels the notion that generative AI might engender depression, instead underscoring its potential to foster market prosperity. These findings carry significant implications for practitioners, policymakers, and the broader AI community.
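
A natural experiment of this shape, with one group exposed to a shock (anime-style work) and a comparison group that is not (other styles), is often analysed with a difference-in-differences estimate. The abstract does not state which estimator the authors use, so the following is only a minimal sketch of that general technique; the function name and every number in it are synthetic placeholders, not data from the study.

```python
from statistics import mean
from typing import Sequence


def diff_in_diff(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Synthetic placeholder order counts (NOT from the paper), per period.
anime_pre = [10, 12, 11]
anime_post = [18, 20, 19]
other_pre = [9, 10, 11]
other_post = [10, 11, 12]

effect = diff_in_diff(anime_pre, anime_post, other_pre, other_post)
print(effect)
```

The estimate is only meaningful if the two groups would have followed parallel trends without the shock, which is the key assumption behind this design.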


BeautifulPrompt: Towards Automatic Prompt Engineering for Text-to-Image Synthesis

arXiv.org Artificial Intelligence

Recently, diffusion-based deep generative models (e.g., Stable Diffusion) have shown impressive results in text-to-image synthesis. However, current text-to-image models often require multiple passes of prompt engineering by humans in order to produce satisfactory results for real-world applications. We propose BeautifulPrompt, a deep generative model to produce high-quality prompts from very simple raw descriptions, which enables diffusion-based models to generate more beautiful images. In our work, we first fine-tuned the BeautifulPrompt model over collected pairs of low-quality and high-quality prompts. Then, to ensure that our generated prompts can generate more beautiful images, we further propose a Reinforcement Learning with Visual AI Feedback technique to fine-tune our model to maximize the reward values of the generated prompts, where the reward values are calculated based on the PickScore and the Aesthetic Scores. Our results demonstrate that learning from visual AI feedback has the potential to improve the quality of generated prompts and images significantly. We further showcase the integration of BeautifulPrompt into a cloud-native AI platform to provide better text-to-image generation service in the cloud.
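
The abstract says the reward is "calculated based on" two scores but does not say how they are combined. One simple possibility, shown here purely as an assumption, is a convex combination of the two after each has been normalised to a comparable range. The function name `combined_reward` and the weight `alpha` are hypothetical.

```python
def combined_reward(pick_score: float, aesthetic_score: float, alpha: float = 0.5) -> float:
    """Blend two already-normalised scores into one scalar reward.

    alpha weights the preference-style score; (1 - alpha) weights the
    aesthetic score. Both inputs are assumed to lie on the same scale.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * pick_score + (1.0 - alpha) * aesthetic_score


# Placeholder scores for one generated prompt (illustrative values only).
reward = combined_reward(pick_score=0.8, aesthetic_score=0.4, alpha=0.5)
print(reward)
```

A scalar of this kind is what a reinforcement learning fine-tuning loop would then try to maximise for each generated prompt.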


Signal Is Finally Testing Usernames

WIRED

Drones, hidden cameras, thermal vision scopes--these are just a few examples of the high-tech equipment recommended by the animal liberation group Direct Action Everywhere, according to a manual released by the organization this week. The document, which was reviewed by WIRED, is a rare glimpse into how the organization is using tech to target factory farms in often brazen operations that have rescued pigs, goats, ducks, and chickens. Extremist groups are experimenting with generative AI to flood social media with propaganda and misinformation, researchers at Tech Against Terrorism have told WIRED. A new report from the group details how, in recent months, terrorists and other extremist organizations have been using artificial intelligence to manipulate imagery and thwart content moderation. As platforms have struggled to keep up with this flood of extremist content, a new tool called Altitude, built in collaboration between Tech Against Terrorism and Google, is seeking to address the problem.


Is Anything Still True? On the Internet, No One Knows Anymore

WSJ.com: WSJD - Technology

Creating and disseminating convincing propaganda used to require the resources of a state. Now all it takes is a smartphone. Generative artificial intelligence is now capable of creating fake pictures, clones of our voices, and even videos depicting and distorting world events. The result: From our personal circles to the political circuses, everyone must now question whether what they see and hear is true.


Microsoft briefly blocked employees from using ChatGPT over security concerns

Engadget

Microsoft temporarily prohibited its employees from using ChatGPT "due to security and data concerns," according to CNBC. The company announced the rule on an internal website and even blocked corporate devices from being able to access the AI chatbot. While several tech companies had prohibited -- or had at least discouraged -- the internal use of ChatGPT in the past, Microsoft doing the same thing was certainly curious, seeing as it's OpenAI's biggest and most prominent investor. In January, Microsoft pledged to invest $10 billion in ChatGPT's developer over the next few years after pouring $3 billion into the company in the past. The AI-powered tools it rolled out for its products, such as Bing's chatbot, also use OpenAI's large language model.