 Generative AI


The Near Future of Deepfakes Just Got Way Clearer

The Atlantic - Technology

Before the start of India's general election in April, a top candidate looking to unseat Prime Minister Narendra Modi was not out wooing voters on the campaign trail. Arvind Kejriwal, the chief minister of Delhi and the head of a political party known for its anti-corruption platform, was arrested in late March for, yes, alleged corruption. His supporters hit the streets in protest, decrying the arrest as a politically motivated move by Modi aimed at weakening a rival. Soon after the arrest, Kejriwal implored his supporters to stay strong. "There are some forces who are trying to weaken our country and its democracy," he said in a 34-second audio clip posted to social media by a fellow party member.


AI 'healthcare revolution' already under way, Nvidia says

Al Jazeera

Taipei, Taiwan – Generative artificial intelligence (AI) has already brought about a "healthcare revolution" and is set to transform everything from pharmaceutical research to patient diagnostics and post-operative treatment, a top executive at chip giant Nvidia has said. Kimberly Powell, vice president of healthcare at Nvidia, said on Wednesday while it is still "early days", healthcare will probably be more affected by AI than any other area of life. "Healthcare is probably the most impactful utility of generative AI that there will be," Powell said during Nvidia's AI Summit, held on the sidelines of the Computex expo in Taipei. Powell said AI is already making its mark in the field of developing and testing new drugs, which can take up to 15 years and cost up to 2bn under current timeframes. "We care about fast and fast means in this industry, that we'll be able to do more, and we know that drug discovery is essentially an infinite problem. You're looking at a chemical space and 10 to the 60th power potential chemical compounds," Powell said.


OpenAI insiders warn of a 'reckless' race for dominance

The Japan Times

A group of OpenAI insiders is blowing the whistle on what they say is a culture of recklessness and secrecy at the San Francisco artificial intelligence company, which is racing to build the most powerful AI systems ever created. The group, which includes nine current and former OpenAI employees, has rallied in recent days around shared concerns that the company has not done enough to prevent its AI systems from becoming dangerous. The members say OpenAI, which started as a nonprofit research lab and burst into public view with the 2022 release of ChatGPT, is putting a priority on profits and growth as it tries to build artificial general intelligence, or AGI, the industry term for a computer program capable of doing anything a human can.


Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT

arXiv.org Artificial Intelligence

Lumina-T2X is a nascent family of Flow-based Large Diffusion Transformers that establishes a unified framework for transforming noise into various modalities, such as images and videos, conditioned on text instructions. Despite its promising capabilities, Lumina-T2X still encounters challenges including training instability, slow inference, and extrapolation artifacts. In this paper, we present Lumina-Next, an improved version of Lumina-T2X, showcasing stronger generation performance with increased training and inference efficiency. We begin with a comprehensive analysis of the Flag-DiT architecture and identify several suboptimal components, which we address by introducing the Next-DiT architecture with 3D RoPE and sandwich normalizations. To enable better resolution extrapolation, we thoroughly compare different context extrapolation methods applied to text-to-image generation with 3D RoPE, and propose Frequency- and Time-Aware Scaled RoPE tailored for diffusion transformers. Additionally, we introduce a sigmoid time discretization schedule to reduce sampling steps in solving the Flow ODE and the Context Drop method to merge redundant visual tokens for faster network evaluation, effectively boosting the overall sampling speed. Thanks to these improvements, Lumina-Next not only improves the quality and efficiency of basic text-to-image generation but also demonstrates superior resolution extrapolation capabilities and multilingual generation using decoder-based LLMs as the text encoder, all in a zero-shot manner. To further validate Lumina-Next as a versatile generative framework, we instantiate it on diverse tasks including visual recognition, multi-view, audio, music, and point cloud generation, showcasing strong performance across these domains. By releasing all code and model weights, we aim to advance the development of next-generation generative AI capable of universal modeling.


Active ML for 6G: Towards Efficient Data Generation, Acquisition, and Annotation

arXiv.org Artificial Intelligence

This paper explores the integration of active machine learning (ML) for 6G networks, an area that remains under-explored yet holds potential. Unlike passive ML systems, active ML can be made to interact with the network environment. It actively selects informative and representative data points for training, thereby reducing the volume of data needed while accelerating the learning process. While active learning research mainly focuses on data annotation, we call for a network-centric active learning framework that considers both annotation (i.e., what is the label) and data acquisition (i.e., which and how many samples to collect). Moreover, we explore the synergy between generative artificial intelligence (AI) and active learning to overcome existing limitations in both active learning and generative AI. This paper also features a case study on a mmWave throughput prediction problem to demonstrate the practical benefits and improved performance of active learning for 6G networks. Furthermore, we discuss how the implications of active learning extend to numerous 6G network use cases. We highlight the potential of active learning based 6G networks to enhance computational efficiency, data annotation and acquisition efficiency, adaptability, and overall network intelligence. We conclude with a discussion on challenges and future research directions for active learning in 6G networks, including development of novel query strategies, distributed learning integration, and inclusion of human- and machine-in-the-loop learning.
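
The idea of selecting informative data points can be illustrated with uncertainty sampling, a common active learning query strategy. This is a minimal sketch and not the method used in the paper: the function names, the entropy criterion, and the toy `pool_probs` values are all assumptions made for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(pool_probs, k):
    """Return indices of the k pool samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical model outputs for four unlabeled samples (two classes each).
pool_probs = [
    [0.98, 0.02],  # confident prediction, low entropy
    [0.55, 0.45],  # uncertain
    [0.80, 0.20],  # moderately confident
    [0.50, 0.50],  # maximally uncertain
]

# Indices of the two samples the model is least sure about.
chosen = select_most_uncertain(pool_probs, k=2)
print(chosen)
```

Only the selected samples would be sent for annotation, which is how a strategy of this kind reduces the volume of labeled data needed.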


Deep Generative Models for Proton Zero Degree Calorimeter Simulations in ALICE, CERN

arXiv.org Artificial Intelligence

Simulating detector responses is a crucial part of understanding the inner workings of particle collisions in the Large Hadron Collider at CERN. The current reliance on statistical Monte Carlo simulations strains CERN's computational grid, underscoring the urgency for more efficient alternatives. Addressing these challenges, recent proposals advocate for generative machine learning methods. In this study, we present an innovative deep learning simulation approach tailored for the proton Zero Degree Calorimeter in the ALICE experiment. Leveraging a Generative Adversarial Network model with Selective Diversity Increase loss, we directly simulate calorimeter responses. To enhance its capabilities in modeling a broad range of calorimeter response intensities, we expand the SDI-GAN architecture with additional regularization. Moreover, to improve the spatial fidelity of the generated data, we introduce an auxiliary regressor network. Our method offers a significant speedup compared with traditional Monte Carlo-based approaches.


Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding

arXiv.org Artificial Intelligence

The vast applications of deep generative models are anchored in three core capabilities -- generating new instances, reconstructing inputs, and learning compact representations -- across various data types, such as discrete text/protein sequences and continuous images. Existing model families, like variational autoencoders (VAEs), generative adversarial networks (GANs), autoregressive models, and (latent) diffusion models, generally excel in specific capabilities and data types but fall short in others. We introduce Generalized Encoding-Decoding Diffusion Probabilistic Models (EDDPMs) which integrate the core capabilities for broad applicability and enhanced performance. EDDPMs generalize the Gaussian noising-denoising in standard diffusion by introducing parameterized encoding-decoding. Crucially, EDDPMs are compatible with the well-established diffusion model objective and training recipes, allowing effective learning of the encoder-decoder parameters jointly with diffusion. By choosing appropriate encoder/decoder (e.g., large language models), EDDPMs naturally apply to different data types. Extensive experiments on text, proteins, and images demonstrate the flexibility to handle diverse data and tasks and the strong improvement over various existing models.
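
For context, the Gaussian noising step of a standard diffusion model, which the abstract says EDDPMs generalize, is commonly written as below. This is the textbook DDPM formulation and not the EDDPM parameterization; the symbols $\beta_t$ (noise schedule) and $\bar{\alpha}_t$ follow the usual convention and are not taken from the abstract.

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\, I\right),
\qquad
\bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s).
```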


Synthetic Oversampling: Theory and A Practical Approach Using LLMs to Address Data Imbalance

arXiv.org Machine Learning

Imbalanced data and spurious correlations are common challenges in machine learning and data science. Oversampling, which artificially increases the number of instances in the underrepresented classes, has been widely adopted to tackle these challenges. In this article, we introduce OPAL (OversamPling with Artificial LLM-generated data), a systematic oversampling approach that leverages the capabilities of large language models (LLMs) to generate high-quality synthetic data for minority groups. Recent studies on synthetic data generation using deep generative models mostly target prediction tasks. Our proposal differs in that we focus on handling imbalanced data and spurious correlations. More importantly, we develop a novel theory that rigorously characterizes the benefits of using the synthetic data, and shows the capacity of transformers in generating high-quality synthetic data for both labels and covariates. We further conduct intensive numerical experiments to demonstrate the efficacy of our proposed approach compared to some representative alternative solutions.
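
As a baseline illustration of oversampling in general, the sketch below balances classes by duplicating minority-class samples at random. It is not OPAL: the paper generates new synthetic samples with an LLM, whereas this hypothetical `random_oversample` helper only resamples existing ones, and the toy data is invented for the example.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # Original members of this class, used as the resampling pool.
        members = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(members))
            out_labels.append(cls)
    return out_samples, out_labels

# Toy imbalanced dataset: four samples of class "A", one of class "B".
samples = ["a1", "a2", "a3", "a4", "b1"]
labels = ["A", "A", "A", "A", "B"]

new_samples, new_labels = random_oversample(samples, labels)
balanced_counts = Counter(new_labels)
print(balanced_counts)
```

A generative approach would replace the `rng.choice(members)` step with a call that produces a new synthetic sample for the minority class.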


AI video start-ups race ahead as Big Tech competition looms

Washington Post - Technology News

After the explosion of interest in text- and image-generators, using AI to generate videos is considered the next frontier, and start-ups and Big Tech companies alike are investing in the space. Video tools from OpenAI and Google aren't publicly available yet, so start-ups like Pika are moving fast to expand before the bigger companies put out their own commercial and consumer-focused tools. But creating a video with AI is much more technically difficult than making a still image, and requires a huge amount of computer processing power, making it an expensive and slow process.


Former OpenAI, Google and Anthropic workers are asking AI companies for more whistleblower protections

Engadget

A group of current and former employees from leading AI companies like OpenAI, Google DeepMind and Anthropic have signed an open letter asking for greater transparency and protection from retaliation for those who speak out about the potential concerns of AI. "So long as there is no effective government oversight of these corporations, current and former employees are among the few people who can hold them accountable to the public," the letter, which was published on Tuesday, says. "Yet broad confidentiality agreements block us from voicing our concerns, except to the very companies that may be failing to address these issues." The letter comes just a couple of weeks after a Vox investigation revealed OpenAI had attempted to muzzle recently departing employees by forcing them to choose between signing an aggressive non-disparagement agreement or risking the loss of their vested equity in the company. After the report, OpenAI CEO Sam Altman called the provision "genuinely embarrassing" and claimed it has been removed from recent exit documentation, though it's unclear if it remains in force for some employees. The 13 signatories include former OpenAI employees Jacob Hilton, William Saunders and Daniel Kokotajlo.