Generative AI
OpenAI rakes in over $6 billion in new funding
Now that OpenAI is becoming a for-profit company, it's making a tidy profit in the process. The Wall Street Journal reported that OpenAI has raised $6.6 billion in new funding from investors, nearly doubling its value to $157 billion. The new funding also makes it the largest venture capital deal in history. The new investors jumped on board after the artificial intelligence startup planned to switch from a charitable non-profit to a for-profit, product-focused company. If OpenAI fails to make the move to for-profit, investors have the right to pull their funding, according to Axios.
OpenAI raises $6.6bn in funding, is valued at $157bn
OpenAI has raised $6.6bn (£5bn) in a funding round that values the artificial intelligence business at $157bn, with chipmaker Nvidia and Japanese group SoftBank among its investors. The San Francisco-based startup, responsible for the ChatGPT chatbot, did not give details of a reported restructuring that will transform it into a for-profit business. The funding round was led by Thrive Capital, a US venture capital fund, and other backers include MGX, an Abu Dhabi-backed investment firm. OpenAI's post-fundraising valuation puts it on a par with Uber, although it remains far below the $3tn level of its biggest backer of recent years, Microsoft, which also joined the fundraising. Other investors included Nvidia, a dominant player in the market for the chips that train and operate AI models, and SoftBank, which counts the UK chip designer Arm among its investments.
ChatGPT added 50 million weekly users in just two months
It's little wonder that investors were clamoring to plow money into OpenAI. Alongside an announcement that the company had raised $6.6 billion in funding, OpenAI revealed that "every week, over 250 million people around the world use ChatGPT to enhance their work, creativity, and learning." That's a sharp rise since late August, when OpenAI said the chatbot had 200 million weekly users -- double the number it had last November. As of June, 350 million people were using OpenAI's tools each month, according to internal documents obtained by The New York Times. It's unclear how many people are paying for access versus those using the free tier.
Hacking Generative AI for Fun and Profit
You hardly need ChatGPT to generate a list of reasons why generative artificial intelligence is often less than awesome. Algorithms are often fed creative work without permission, harbor nasty biases, and require huge amounts of energy and water for training, and these are all serious issues. Putting all that aside for a moment, though, it is remarkable how powerful generative AI can be for prototyping potentially useful new tools. I got to witness this firsthand by visiting Sundai Club, a generative AI hackathon that takes place one Sunday each month near the MIT campus. A few months ago, the group kindly agreed to let me sit in and chose to spend that session exploring tools that might be useful to journalists.
Facebook and Instagram users are fuming over controversial Meta AI move - here's how YOU can opt-out
Meta has started notifying Instagram and Facebook users across the UK that it is training its AI with their posts – and people are not happy about it. In emails and notifications being sent to UK users, Meta says it's using posts, comments, photos and even captions to help develop its human-like 'generative AI', akin to ChatGPT. By being trained with UK user data, Meta told MailOnline that the AI will 'reflect and understand British language, geography and culture'. Social media users are fuming over the controversial move, with one person saying the tech giant can 'f*** right off'. If you don't want your personal data being handed over to Meta's AI training programme, here's how you can object.
The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot
Song, Fangchen, Agarwal, Ashish, Wen, Wen
Generative artificial intelligence (AI) has opened the possibility of automated content production, including coding in software development, which can significantly influence the participation and performance of software developers. To explore this impact, we investigate the role of GitHub Copilot, a generative AI pair programmer, on software development in the open-source community, where multiple developers voluntarily collaborate on software projects. Using GitHub's dataset for open-source repositories and a generalized synthetic control method, we find that Copilot significantly enhances project-level productivity by 6.5%. Delving deeper, we dissect the key mechanisms driving this improvement. Our findings reveal a 5.5% increase in individual productivity and a 5.4% increase in participation. However, this is accompanied by a 41.6% increase in integration time, potentially due to higher coordination costs. Interestingly, we also observe differential effects among developers. We discover that core developers achieve greater project-level productivity gains from using Copilot, benefiting more in terms of individual productivity and participation compared to peripheral developers, plausibly due to their deeper familiarity with software projects. We also find that the increase in project-level productivity is accompanied by no change in code quality. We conclude that AI pair programmers help developers automate and augment their code, but human developers' knowledge of software projects can enhance the benefits. In summary, our research underscores the role of AI pair programmers in impacting project-level productivity within the open-source community and suggests potential implications for the structure of open-source software projects.
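As a rough illustration of the kind of estimate described above, the sketch below uses the classic two-donor synthetic control idea, not the generalized synthetic control method the abstract names. All series are invented toy data, the 6.5% lift is built into the treated series by construction, and the function names `fit_weight` and `relative_effect` are hypothetical.

```python
def fit_weight(treated, donor_a, donor_b):
    """Least-squares weight w in [0, 1] so that w*a + (1-w)*b tracks the treated series."""
    num = sum((y - b) * (a - b) for y, a, b in zip(treated, donor_a, donor_b))
    den = sum((a - b) ** 2 for a, b in zip(donor_a, donor_b))
    return min(1.0, max(0.0, num / den))


def relative_effect(treated, donor_a, donor_b, n_pre):
    """Fit the weight on the pre-period, then compare post-period totals."""
    w = fit_weight(treated[:n_pre], donor_a[:n_pre], donor_b[:n_pre])
    synthetic = [w * a + (1 - w) * b for a, b in zip(donor_a, donor_b)]
    post_actual = sum(treated[n_pre:])
    post_synthetic = sum(synthetic[n_pre:])
    return w, post_actual / post_synthetic - 1.0


# Toy monthly output for two untreated projects (invented numbers).
donor_a = [100, 104, 108, 112, 116, 120]
donor_b = [80, 82, 86, 88, 92, 94]
n_pre = 3  # number of periods before the hypothetical treatment

# Treated project: an even blend of the donors, with a 6.5% lift
# injected after treatment purely so the estimator has something to recover.
baseline = [0.5 * a + 0.5 * b for a, b in zip(donor_a, donor_b)]
treated = baseline[:n_pre] + [1.065 * v for v in baseline[n_pre:]]

weight, effect = relative_effect(treated, donor_a, donor_b, n_pre)
print(f"donor weight: {weight:.2f}, estimated relative effect: {effect:.3%}")
```

The weight is fitted only on pre-treatment periods, so the post-treatment gap between the treated series and its synthetic counterpart is read as the effect.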
Zodiac: A Cardiologist-Level LLM Framework for Multi-Agent Diagnostics
Zhou, Yuan, Zhang, Peng, Song, Mengya, Zheng, Alice, Lu, Yiwen, Liu, Zhiheng, Chen, Yong, Xi, Zhaohan
Large language models (LLMs) have demonstrated remarkable progress in healthcare. However, a significant gap remains regarding LLMs' professionalism in domain-specific clinical practices, limiting their application in real-world diagnostics. In this work, we introduce ZODIAC, an LLM-powered framework with cardiologist-level professionalism designed to engage LLMs in cardiological diagnostics. ZODIAC assists cardiologists by extracting clinically relevant characteristics from patient data, detecting significant arrhythmias, and generating preliminary reports for review and refinement by cardiologists. To achieve cardiologist-level professionalism, ZODIAC is built on a multi-agent collaboration framework, enabling the processing of patient data across multiple modalities. Each LLM agent is fine-tuned using real-world patient data adjudicated by cardiologists, reinforcing the model's professionalism. ZODIAC undergoes rigorous clinical validation with independent cardiologists, evaluated across eight metrics that measure clinical effectiveness and address security concerns. Results show that ZODIAC outperforms industry-leading models, including OpenAI's GPT-4o, Meta's Llama-3.1-405B, and Google's Gemini-pro, as well as medical-specialist LLMs like Microsoft's BioGPT. ZODIAC demonstrates the transformative potential of specialized LLMs in healthcare by delivering domain-specific solutions that meet the stringent demands of medical practice. Notably, ZODIAC has been successfully integrated into electrocardiography (ECG) devices, exemplifying the growing trend of embedding LLMs into Software-as-Medical-Device (SaMD).
Discrete Copula Diffusion
Liu, Anji, Broadrick, Oliver, Niepert, Mathias, Broeck, Guy Van den
Discrete diffusion models have recently shown significant progress in modeling complex data, such as natural languages and DNA sequences. However, unlike diffusion models for continuous data, which can generate high-quality samples in just a few denoising steps, modern discrete diffusion models still require hundreds or even thousands of denoising steps to perform well. In this paper, we identify a fundamental limitation that prevents discrete diffusion models from achieving strong performance with fewer steps -- they fail to capture dependencies between output variables at each denoising step. To address this issue, we provide a formal explanation and introduce a general approach to supplement the missing dependency information by incorporating another deep generative model, termed the copula model. Our method does not require fine-tuning either the diffusion model or the copula model, yet it enables high-quality sample generation with significantly fewer denoising steps. When we apply this approach to autoregressive copula models, the combined model outperforms both models individually in unconditional and conditional text generation. Specifically, the hybrid model achieves better (un)conditional text generation using 8 to 32 times fewer denoising steps than the diffusion model alone. In addition to presenting an effective discrete diffusion generation algorithm, this paper emphasizes the importance of modeling inter-variable dependencies in discrete diffusion.
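The limitation named in the abstract, that a denoising step misses dependencies between output variables, can be seen in a toy case. The sketch below is not the paper's method or its copula model. It assumes a made-up two-token distribution and hypothetical helper names, and it shows that sampling each position independently matches every marginal while putting half the probability mass on sequences the true joint never produces.

```python
from itertools import product

# Toy "true" joint over two binary tokens: they always agree.
joint = {(0, 0): 0.5, (1, 1): 0.5}


def marginal(joint, position):
    """Marginal distribution of the token at one position."""
    probs = {}
    for tokens, p in joint.items():
        probs[tokens[position]] = probs.get(tokens[position], 0.0) + p
    return probs


def product_of_marginals(joint, length=2):
    """Distribution obtained when every position is sampled independently."""
    margs = [marginal(joint, i) for i in range(length)]
    result = {}
    for tokens in product(*(m.keys() for m in margs)):
        p = 1.0
        for i, token in enumerate(tokens):
            p *= margs[i][token]
        result[tokens] = p
    return result


independent = product_of_marginals(joint)
invalid_mass = sum(p for tokens, p in independent.items() if tokens not in joint)
print(independent)
print("mass on sequences the true joint never produces:", invalid_mass)
```

With many steps, each step only needs to resolve a few tokens, so the missing dependencies matter less. With few steps, many tokens are resolved at once and the factorized prediction is exposed.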
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
Chen, Liang, Tan, Sinan, Cai, Zefan, Xie, Weichu, Zhao, Haozhe, Zhang, Yichi, Lin, Junyang, Bai, Jinze, Liu, Tianyu, Chang, Baobao
Figure 1: Generations from DnD-Transformers trained on class-conditional ImageNet 256×256 (a.top) and unconditional arXiv images (a.bottom). Unconditional rich-text image generations by trained diffusion (b.1) and autoregressive model (b.2).
This work tackles the information loss bottleneck of vector-quantization (VQ) autoregressive image generation by introducing a novel model architecture called the 2-Dimensional Autoregression (DnD) Transformer. The DnD-Transformer predicts more codes for an image by introducing a new autoregression direction, model depth, along with the sequence length direction. Compared to traditional 1D autoregression and previous work utilizing similar 2D image decomposition such as RQ-Transformer, the DnD-Transformer is an end-to-end model that can generate higher quality images with the same backbone model size and sequence length, opening a new optimization perspective for autoregressive image generation. Furthermore, our experiments reveal that the DnD-Transformer's potential extends beyond generating natural images. It can even generate images with rich text and graphical elements in a self-supervised manner, demonstrating an understanding of these combined modalities. This has not been previously demonstrated for popular vision generative models such as diffusion models, showing a spark of vision-language intelligence when trained solely on images.
The field of autoregressive (AR) image generation is experiencing a resurgence of interest, largely driven by groundbreaking advancements in large language models (LLMs), exemplified by the release of ChatGPT (OpenAI, 2022).
Because typical AR image generation methods also predict output in a next-token prediction manner, this resemblance has sparked significant efforts in two main areas: 1) transferring advanced, large-scale training techniques and expertise from LLMs to AR image generation models (Bai et al., 2023; Tian et al., 2024; Sun et al., 2024), and 2) developing truly multimodal foundation models capable of both understanding and generating multimodal information within a unified training framework (Lu et al., 2022; 2023; Team, 2024).
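To make the "more codes per image" idea concrete, here is a minimal sketch of residual quantization, the general multi-depth decomposition the abstract associates with RQ-Transformer-style approaches. It is not the DnD-Transformer or its tokenizer. It works on a single scalar rather than image features, and the codebooks and function names are invented for illustration.

```python
def nearest(codebook, x):
    """Index of the codebook entry closest to x."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))


def residual_quantize(x, codebooks):
    """Emit one code per depth level; each level encodes what the previous ones missed."""
    codes, approx = [], 0.0
    for codebook in codebooks:
        idx = nearest(codebook, x - approx)
        codes.append(idx)
        approx += codebook[idx]
    return codes, approx


# Hypothetical codebooks, coarse to fine.
codebooks = [
    [-1.0, 0.0, 1.0],
    [-0.25, 0.0, 0.25],
    [-0.0625, 0.0, 0.0625],
]

x = 0.8
errors = []
for depth in range(1, len(codebooks) + 1):
    codes, approx = residual_quantize(x, codebooks[:depth])
    errors.append(abs(x - approx))
    print(f"depth {depth}: codes={codes}, reconstruction error={errors[-1]:.4f}")
```

Each extra depth level adds one more code for the same position and shrinks the reconstruction error, which is the information-loss trade-off that a model predicting codes along a depth direction has to handle.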
Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space
Li, Yangming, Lai, Chieh-Hsin, Schönlieb, Carola-Bibiane, Mitsufuji, Yuki, Ermon, Stefano
Deep Generative Models (DGMs), including Energy-Based Models (EBMs) and Score-based Generative Models (SGMs), have advanced high-fidelity data generation and complex continuous distribution approximation. However, their application in Markov Decision Processes (MDPs), particularly in distributional Reinforcement Learning (RL), remains underexplored, with conventional histogram-based methods dominating the field. This paper rigorously highlights that this application gap is caused by the nonlinearity of modern DGMs, which conflicts with the linearity required by the Bellman equation in MDPs. For instance, EBMs involve nonlinear operations such as exponentiating energy functions and normalizing constants. To address this, we introduce Bellman Diffusion, a novel DGM framework that maintains linearity in MDPs through gradient and scalar field modeling. With divergence-based training techniques to optimize neural network proxies and a new type of stochastic differential equation (SDE) for sampling, Bellman Diffusion is guaranteed to converge to the target distribution. Our empirical results show that Bellman Diffusion achieves accurate field estimations and is a capable image generator, converging 1.5x faster than the traditional histogram-based baseline in distributional RL tasks. This work enables the effective integration of DGMs into MDP applications, unlocking new avenues for advanced decision-making frameworks.
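The linearity issue can be illustrated with a small numeric check. This is a sketch under simplifying assumptions, not the paper's framework: rewards and discounting are omitted, the return distributions are invented three-bin histograms, and the helper names are hypothetical. Mixing successor-state histograms with transition probabilities yields a valid distribution, while applying the same linear mix to energies (negative log-probabilities) does not.

```python
import math


def mix(weights, vectors):
    """Weighted sum of equal-length vectors, component by component."""
    return [
        sum(w * v[i] for w, v in zip(weights, vectors))
        for i in range(len(vectors[0]))
    ]


def energy(dist):
    """Energy representation E = -log p (normalizing constant taken as 1)."""
    return [-math.log(p) for p in dist]


# Hypothetical return histograms of two successor states over three return bins.
next_a = [0.7, 0.2, 0.1]
next_b = [0.1, 0.3, 0.6]
weights = [0.5, 0.5]  # transition probabilities to each successor

# Linear in distribution space: the mixture is again a probability vector.
target = mix(weights, [next_a, next_b])

# The same linear mix applied to energies, then mapped back with exp(-E).
naive = [math.exp(-e) for e in mix(weights, [energy(next_a), energy(next_b)])]

print("mixture of histograms:", target, "sum =", sum(target))
print("mixture of energies  :", naive, "sum =", sum(naive))
```

The energy-space result is a geometric mean that no longer sums to one, so it differs from the target mixture even before renormalization.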