Generative AI
US chip designer Nvidia forecasts Q3 rev above target, shares soar
Chip designer Nvidia has forecast third-quarter revenue above Wall Street targets and said it will buy back $25bn more of its shares as sales benefit from soaring demand for its chips that power nearly all the world's major artificial intelligence apps. Shares of the Santa Clara, California-based company rose 8 percent in trading after the bell, hitting an all-time high. Nvidia's forecast on Wednesday beat expectations by billions of dollars, demonstrating that a boom in generative AI technologies that can read and write in human-like ways, and powered almost exclusively by Nvidia's chips, shows no signs of slowing down. Nvidia's additional $25bn in share repurchases come as shares have already tripled this year, making the company the first-ever trillion-dollar chip business as investors bet Nvidia will be the key beneficiary of the AI boom. Analysts have estimated that demand for Nvidia's prized AI chips is exceeding supply by at least 50 percent, adding that the imbalance will stay in place for the next several quarters.
Nvidia stock surges to highest ever as AI boom rolls on
Nvidia has been one of the primary beneficiaries of the wave of interest in "generative" AI tools spurred by the launch of OpenAI's chatbot ChatGPT last November. Nvidia's computer chips are specially suited to help AI programs crunch the huge amounts of data necessary to give them the capability to have complex conversations, translate language and generate images based on simple prompts. Though there are signs that interest from consumers in chatbots is waning, large companies are busy buying as many of Nvidia's chips as they can to give themselves the ability to train and run AI software of their own.
Kids Are Going Back to School. So Is ChatGPT
Last winter, the unveiling of OpenAI's alarmingly sophisticated chatbot sent educators into a tailspin. Generative AI, it was feared, would enable rampant cheating and plagiarism, and even make high school English obsolete. Universities debated updating plagiarism policies. Some school districts outright banned ChatGPT from their networks. Now, a new school year presents new challenges--and, for some, new opportunities.
Back to school with AI: How parents and educators can ensure its ethical use in the classroom
AI technology is quickly creeping into every industry, prompting new questions about whether online content comes from a human or a computer. The presence of advanced technology in the classroom may require conversations with students during this new school year. As artificial intelligence finds its way into more families' day-to-day routines, parents and teachers alike should be wary of how their kids are interacting with generative AI. This is according to SmartNews' head of trust and safety Arjun Narayan, who shared concerns during an interview with Fox News Digital. "As with any new technology, when it is very new, it's important to understand how you're engaging with that tech," said Narayan, who is based in Japan.
How to Protect Copyright Data in Optimization of Large Language Models?
Chu, Timothy, Song, Zhao, Yang, Chiwun
Large language models (LLMs) and generative AI have played a transformative role in computer research and applications. Controversy has arisen as to whether these models output copyrighted data, which can occur if the data the models are trained on is copyrighted. LLMs are built on the transformer neural network architecture, which in turn relies on a mathematical computation called Attention that uses the softmax function. In this paper, we show that large language model training and optimization can be seen as a softmax regression problem. We then establish a method of efficiently performing softmax regression, in a way that prevents the regression function from generating copyright data. This establishes a theoretical method of training large language models in a way that avoids generating copyright data.
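The abstract does not state the regression objective, so the following is only a minimal sketch of one common way a softmax regression loss is written: the squared distance between softmax(Ax) and a target vector b. The names `A`, `x`, `b`, and `softmax_regression_loss` are illustrative assumptions, and the copyright-protecting modification described by the authors is not shown.

```python
import math


def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_regression_loss(A, x, b):
    """Squared L2 distance between softmax(A @ x) and the target b.

    A is a list of rows, x is the parameter vector, b is the target
    distribution. This formulation is an assumption for illustration,
    not the paper's stated objective.
    """
    z = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    p = softmax(z)
    return sum((p_i - b_i) ** 2 for p_i, b_i in zip(p, b))
```

With an identity matrix for `A`, a zero vector for `x`, and a uniform target `b`, the softmax output matches the target and the loss is zero.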
Generative AI for End-to-End Limit Order Book Modelling: A Token-Level Autoregressive Generative Model of Message Flow Using a Deep State Space Network
Nagy, Peer, Frey, Sascha, Sapora, Silvia, Li, Kang, Calinescu, Anisoara, Zohren, Stefan, Foerster, Jakob
Developing a generative model of realistic order flow in financial markets is a challenging open problem, with numerous applications for market participants. Addressing this, we propose the first end-to-end autoregressive generative model that generates tokenized limit order book (LOB) messages. These messages are interpreted by a Jax-LOB simulator, which updates the LOB state. To handle long sequences efficiently, the model employs simplified structured state-space layers to process sequences of order book states and tokenized messages. Using LOBSTER data of NASDAQ equity LOBs, we develop a custom tokenizer for message data, converting groups of successive digits to tokens, similar to tokenization in large language models. Out-of-sample results show promising performance in approximating the data distribution, as evidenced by low model perplexity. Furthermore, the mid-price returns calculated from the generated order flow exhibit a significant correlation with the data, indicating impressive conditional forecast performance. Due to the granularity of generated data, and the accuracy of the model, it offers new application areas for future work beyond forecasting, e.g. acting as a world model in high-frequency financial reinforcement learning applications. Overall, our results invite the use and extension of the model in the direction of autoregressive large financial models for the generation of high-frequency financial data and we commit to open-sourcing our code to facilitate future research.
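The digit-group tokenization mentioned in the abstract can be illustrated with a small sketch. The field width of 9 digits, the group size of 3, and the function names are assumptions chosen for illustration; the authors' actual tokenizer and vocabulary layout may differ.

```python
def tokenize_number(value, width=9, group=3):
    """Split a non-negative integer field into fixed-size digit groups.

    The value is zero-padded to `width` digits, then each run of `group`
    successive digits becomes one integer token in [0, 10**group).
    """
    if width % group != 0:
        raise ValueError("width must be a multiple of group")
    if value < 0 or value >= 10 ** width:
        raise ValueError("value does not fit in the given width")
    digits = str(value).zfill(width)
    return [int(digits[i:i + group]) for i in range(0, width, group)]


def detokenize_number(tokens, group=3):
    """Invert tokenize_number by re-padding each token and joining."""
    return int("".join(str(t).zfill(group) for t in tokens))
```

Under these assumptions a value such as 1234567 maps to three tokens, and the vocabulary for one numeric field needs only 1,000 entries regardless of the magnitude of the value.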
Augmenting medical image classifiers with synthetic data from latent diffusion models
Sagers, Luke W., Diao, James A., Melas-Kyriazi, Luke, Groh, Matthew, Rajpurkar, Pranav, Adamson, Adewole S., Rotemberg, Veronica, Daneshjou, Roxana, Manrai, Arjun K.
While hundreds of artificial intelligence (AI) algorithms are now approved or cleared by the US Food and Drug Administration (FDA), many studies have shown inconsistent generalization or latent bias, particularly for underrepresented populations. Some have proposed that generative AI could reduce the need for real data, but its utility in model development remains unclear. Skin disease serves as a useful case study in synthetic image generation due to the diversity of disease appearance, particularly across the protected attribute of skin tone. Here we show that latent diffusion models can scalably generate images of skin disease and that augmenting model training with these data improves performance in data-limited settings. These performance gains saturate at synthetic-to-real image ratios above 10:1 and are substantially smaller than the gains obtained from adding real images. As part of our analysis, we generate and analyze a new dataset of 458,920 synthetic images produced using several generation strategies. Our results suggest that synthetic data could serve as a force-multiplier for model development, but the collection of diverse real-world data remains the most important step to improve medical AI algorithms.
Quantized Radio Map Estimation Using Tensor and Deep Generative Models
Timilsina, Subash, Shrestha, Sagar, Fu, Xiao
Spectrum cartography (SC), also known as radio map estimation (RME), aims at crafting multi-domain (e.g., frequency and space) radio power propagation maps from limited sensor measurements. While early methods often lacked theoretical support, recent works have demonstrated that radio maps can be provably recovered using low-dimensional models -- such as the block-term tensor decomposition (BTD) model and certain deep generative models (DGMs) -- of the high-dimensional multi-domain radio signals. However, these existing provable SC approaches assume that sensors send real-valued (full-resolution) measurements to the fusion center, which is unrealistic. This work puts forth a quantized SC framework that generalizes the BTD and DGM-based SC to scenarios where heavily quantized sensor measurements are used. A maximum likelihood estimation (MLE)-based SC framework under a Gaussian quantizer is proposed. Recoverability of the radio map using the MLE criterion is characterized under realistic conditions, e.g., imperfect radio map modeling and noisy measurements. Simulations and real-data experiments are used to showcase the effectiveness of the proposed approach.
AI more likely to change work than destroy jobs: U.N. study
Artificial Intelligence is more likely to augment jobs than to destroy them, a U.N. study indicated on Monday, at a time of growing anxiety over the potential impact of the technology. The launch in November of the generative AI platform ChatGPT, which is capable of handling complex tasks on command, was seen as a technology landmark foreshadowing a potentially dramatic transformation of the workplace. But a fresh study from the United Nations' International Labour Organization (ILO) examining the potential effect of that and other platforms on job quantity and quality suggests that most jobs and industries are only partially exposed to automation.
Variational Autoencoding Molecular Graphs with Denoising Diffusion Probabilistic Model
Koge, Daiki, Ono, Naoaki, Kanaya, Shigehiko
In data-driven drug discovery, designing molecular descriptors is a very important task. Deep generative models such as variational autoencoders (VAEs) offer a potential solution by designing descriptors as probabilistic latent vectors derived from molecular structures. These models can be trained on large datasets, which have only molecular structures, and applied to transfer learning. Nevertheless, the approximate posterior distribution of the latent vectors of the usual VAE assumes a simple multivariate Gaussian distribution with zero covariance, which may limit the performance of representing the latent features. To overcome this limitation, we propose a novel molecular deep generative model that incorporates a hierarchical structure into the probabilistic latent vectors. We achieve this with a denoising diffusion probabilistic model (DDPM). In experiments on small datasets of physical properties and activity, we demonstrate that our model can design effective molecular latent vectors for molecular property prediction. The results highlight the superior prediction performance and robustness of our model compared to existing approaches.
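The "usual VAE" posterior that the abstract contrasts against, a Gaussian with zero covariance between latent dimensions, can be sketched as follows. This is a minimal illustration of the standard diagonal-Gaussian reparameterization and its closed-form KL divergence to a standard normal prior, not the authors' DDPM-based model; the function names are assumptions.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) per dimension.

    Each latent dimension is sampled independently, which is exactly
    the zero-covariance assumption of the usual VAE posterior.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))
```

Because the posterior factorizes over dimensions, the KL term is a simple sum; a richer posterior such as the hierarchical one the abstract proposes would not reduce to this per-dimension form.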