Generative AI
Scaling laws for decoding images from brain activity
Banville, Hubert, Benchetrit, Yohann, d'Ascoli, Stéphane, Rapin, Jérémy, King, Jean-Rémi
Generative AI has recently propelled the decoding of images from brain activity. How do these approaches scale with the amount and type of neural recordings? Here, we systematically compare image decoding from four types of non-invasive devices: electroencephalography (EEG), magnetoencephalography (MEG), high-field functional Magnetic Resonance Imaging (3T fMRI) and ultra-high field (7T) fMRI. For this, we evaluate decoding models on the largest benchmark to date, encompassing 8 public datasets, 84 volunteers, 498 hours of brain recording and 2.3 million brain responses to natural images. Unlike previous work, we focus on single-trial decoding performance to simulate real-time settings. This systematic comparison reveals three main findings. First, the most precise neuroimaging devices tend to yield the best decoding performance when the training sets are of similar size. However, the gain enabled by deep learning - in comparison to linear models - is obtained with the noisiest devices. Second, we do not observe any plateau of decoding performance as the amount of training data increases. Rather, decoding performance scales log-linearly with the amount of brain recording. Third, this scaling law primarily depends on the amount of data per subject. However, little decoding gain is observed by increasing the number of subjects. Overall, these findings delineate the path most suitable to scale the decoding of images from non-invasive brain recordings.
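As a rough illustration of what "scales log-linearly" means, the sketch below fits score = intercept + slope * log10(hours) by least squares. The function names and every number in it are made up for illustration; they are not taken from the paper, and the paper's actual fitting procedure may differ.

```python
import numpy as np

def fit_log_linear(hours, scores):
    """Least-squares fit of score = intercept + slope * log10(hours)."""
    x = np.log10(np.asarray(hours, dtype=float))
    y = np.asarray(scores, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

def predict(hours, slope, intercept):
    """Extrapolate the fitted law to a new amount of recording."""
    return intercept + slope * np.log10(np.asarray(hours, dtype=float))

# Synthetic data (NOT from the paper): a perfectly log-linear toy curve.
hours = [1, 2, 4, 8, 16, 32]
scores = [0.10 + 0.05 * np.log10(h) for h in hours]

slope, intercept = fit_log_linear(hours, scores)
print(f"slope per decade of data: {slope:.3f}, intercept: {intercept:.3f}")
print(f"extrapolated score at 64 hours: {predict(64, slope, intercept):.3f}")
```

Under such a law, each tenfold increase in recording time adds the same fixed increment (the slope) to the score, which is why no plateau appears on a logarithmic axis.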
Text-to-Image Generation for Vocabulary Learning Using the Keyword Method
Attygalle, Nuwan T., Kljun, Matjaž, Quigley, Aaron, Pucihar, Klen Čopič, Grubert, Jens, Biener, Verena, Leiva, Luis A., Yoneyama, Juri, Toniolo, Alice, Miguel, Angela, Kato, Hirokazu, Weerasinghe, Maheshya
The 'keyword method' is an effective technique for learning vocabulary of a foreign language. It involves creating a memorable visual link between what a word means and what its pronunciation in a foreign language sounds like in the learner's native language. However, these memorable visual links remain implicit in people's minds and are not easy to remember for a large set of words. To enhance the memorisation and recall of the vocabulary, we developed an application that combines the keyword method with text-to-image generators to externalise the memorable visual links into visuals. These visuals represent additional stimuli during the memorisation process. To explore the effectiveness of this approach, we first ran a pilot study to investigate how difficult it is to externalise the descriptions of mental visualisations of memorable links, by asking participants to write them down. We used these descriptions as prompts for a text-to-image generator (DALL-E2) to convert them into images and asked participants to select their favourites. Next, we compared different text-to-image generators (DALL-E2, Midjourney, Stable and Latent Diffusion) to evaluate the perceived quality of the images generated by each. Despite heterogeneous results, participants mostly preferred images generated by DALL-E2, which was also used for the final study. In this study, we investigated whether providing such images enhances the retention of vocabulary being learned, compared to the keyword method only. Our results indicate that people did not encounter difficulties describing their visualisations of memorable links and that providing corresponding images significantly improves memory retention.
Implementation of a Generative AI Assistant in K-12 Education: The CGScholar AI Helper Initiative
Castro, Vania, Nascimento, Ana Karina de Oliveira, Zheldibayeva, Raigul, Searsmith, Duane, Saini, Akash, Cope, Bill, Kalantzis, Mary
This paper focuses on the piloting of the CGScholar AI Helper, a Generative AI (GenAI) assistant tool that aims to provide feedback on writing in high school contexts. The aim was to use GenAI to provide formative and summative feedback on students' texts in English Language Arts (ELA) and History. The trials discussed in this paper relate to Grade 11, a crucial learning phase when students are working towards college readiness. These trials took place in two very different schools in the Midwest of the United States, one in a low socio-economic background with low-performance outcomes and the other in a high socio-economic background with high-performance outcomes. The assistant tool used two main mechanisms: "prompt engineering" based on participant teachers' assessment rubric, and "fine-tuning" a Large Language Model (LLM) from a customized corpus of teaching materials using Retrieval Augmented Generation (RAG). This paper focuses on the CGScholar AI Helper's potential to enhance students' writing abilities and support teachers in ELA and other subject areas requiring written assignments.
DebiasPI: Inference-time Debiasing by Prompt Iteration of a Text-to-Image Generative Model
Bonna, Sarah, Huang, Yu-Cheng, Novozhilova, Ekaterina, Paik, Sejin, Shan, Zhengyang, Feng, Michelle Yilin, Gao, Ge, Tayal, Yonish, Kulkarni, Rushil, Yu, Jialin, Divekar, Nupur, Ghadiyaram, Deepti, Wijaya, Derry, Betke, Margrit
Ethical intervention prompting has emerged as a tool to counter demographic biases of text-to-image generative AI models. Existing solutions either require retraining the model or struggle to generate images that reflect desired distributions on gender and race. We propose an inference-time process called DebiasPI for Debiasing-by-Prompt-Iteration that provides prompt intervention by enabling the user to control the distributions of individuals' demographic attributes in image generation. DebiasPI keeps track of which attributes have been generated either by probing the internal state of the model or by using external attribute classifiers. Its control loop guides the text-to-image model to select not yet sufficiently represented attributes. With DebiasPI, we were able to create images with equal representations of race and gender that visualize challenging concepts of news headlines. We also experimented with the attributes age, body type, profession, and skin tone, and measured how attributes change when our intervention prompt targets the distribution of an unrelated attribute type. We found, for example, that if the text-to-image model is asked to balance racial representation, gender representation improves but the skin tone becomes less diverse. Attempts to cover a wide range of skin colors with various intervention prompts showed that the model struggles to generate the palest skin tones. We conducted various ablation studies, in which we removed DebiasPI's attribute control, that reveal the model's propensity to generate young, male characters.
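The control loop described above might be sketched roughly as follows. This is only an illustration under assumptions, not the authors' implementation: `next_attribute`, `debias_loop`, the prompt template, and the stand-in `fake_generate` and `fake_classify` functions are all hypothetical, and a real system would call a text-to-image model and an attribute classifier instead.

```python
from collections import Counter

def next_attribute(counts, target):
    """Pick the attribute value whose observed share lags its target share the most."""
    total = sum(counts.values())

    def deficit(value):
        observed = counts[value] / total if total else 0.0
        return target[value] - observed

    return max(target, key=deficit)

def debias_loop(generate, classify, base_prompt, target, n_images):
    """Generate images one at a time, steering each prompt toward the
    attribute value that is currently under-represented."""
    counts = Counter()
    images = []
    for _ in range(n_images):
        wanted = next_attribute(counts, target)
        image = generate(f"{base_prompt}, {wanted}")  # hypothetical prompt template
        counts[classify(image)] += 1  # track what was actually produced
        images.append(image)
    return images, counts

# Hypothetical stand-ins: the "image" is just the prompt string, and the
# "classifier" reads the attribute back out of it.
def fake_generate(prompt):
    return prompt

def fake_classify(image):
    return image.rsplit(", ", 1)[-1]

target = {"woman": 0.5, "man": 0.5}
images, counts = debias_loop(fake_generate, fake_classify,
                             "a news photo of a scientist", target, 10)
print(dict(counts))
```

Because the loop counts what the classifier reports rather than what the prompt requested, a model that ignores an intervention prompt keeps being steered toward the missing attribute on later iterations.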
DeepSeek's Popular AI App Is Explicitly Sending US Data to China
The United States' recent regulatory action against the Chinese-owned social video platform TikTok prompted mass migration to another Chinese app, the social platform "Rednote." Now, a generative artificial intelligence platform from the Chinese developer DeepSeek is exploding in popularity, posing a potential threat to US AI dominance and offering the latest evidence that moratoriums like the TikTok ban will not stop Americans from using Chinese-owned digital services. DeepSeek, an AI research lab created by a prominent Chinese hedge fund, recently gained popularity after releasing its latest open source generative AI model that easily competes with top US platforms like those developed by OpenAI. However, to help avoid US sanctions on hardware and software, DeepSeek created some clever workarounds when building its models. On Monday, DeepSeek's creators limited new sign-ups after claiming the app had been overrun with a "large-scale malicious attack."
China's DeepSeek Surprise
One week ago, a new and formidable challenger for OpenAI's throne emerged. A Chinese AI start-up, DeepSeek, launched a model that appeared to match the most powerful version of ChatGPT but, at least according to its creator, was a fraction of the cost to build. The program, called DeepSeek-R1, has incited plenty of concern: Ultrapowerful Chinese AI models are exactly what many leaders of American AI companies feared when they, and more recently President Donald Trump, have sounded alarms about a technological race between the United States and the People's Republic of China. This is a "wake up call for America," Alexandr Wang, the CEO of Scale AI, commented on social media. But at the same time, many Americans--including much of the tech industry--appear to be lauding this Chinese AI.
What to Know About DeepSeek, the Chinese AI Company Causing Stock Market Chaos
A new Chinese AI model, created by the Hangzhou-based startup DeepSeek, has stunned the American AI industry by outperforming some of OpenAI's leading models, displacing ChatGPT at the top of the iOS app store, and usurping Meta as the leading purveyor of so-called open source AI tools. All of which has raised a critical question: despite American sanctions on Beijing's ability to access advanced semiconductors, is China catching up with the U.S. in the global AI race? At a supposed cost of just $6 million to train, DeepSeek's new R1 model, released last week, was able to match the performance of OpenAI's o1 model on several math and reasoning metrics, even though o1 is the outcome of tens of billions of dollars in investment by OpenAI and its patron Microsoft. The Chinese model is also cheaper for users. The upshot: the U.S. tech industry is suddenly faced with a potentially cheaper and more powerful challenger, unnerving investors, who sold off American tech stocks on Monday morning.
Chinese AI App DeepSeek Soars in Popularity, Startling Rivals
An AI assistant created by Chinese startup DeepSeek became the number one most-downloaded app in Apple's US App Store over the weekend, sending shockwaves through Silicon Valley and causing the price of major tech stocks to plummet. Nvidia saw more than $460 billion erased from its market capitalization on Monday, a drop Bloomberg characterized as the "biggest in US stock market history." The shakeup stems from an open source model developed by DeepSeek called R1, which debuted earlier this month. The company said that it rivals the current industry leader: OpenAI's o1. But what stunned the tech industry the most was that DeepSeek claimed to have built its model using only a small fraction of the specialized computer chips that AI companies typically need to develop cutting-edge systems. On Monday, DeepSeek said it was temporarily limiting new registrations, citing "large-scale malicious attacks" on the company's services, according to a message on its website.
Reviews: Surfing: Iterative Optimization Over Incrementally Trained Deep Networks
The paper proposes a new method for provably fitting deep generative models to observations, a highly non-convex optimization problem. Instead of trying to find the latent code that explains the measurements directly, as proposed by Bora et al., this paper starts with a different deep generative model that has random weights, for which Hand et al. showed that gradient descent provably works. The authors then incrementally modify the weights of the generator to approach the true generator, using the previous optimum as the starting point at each step. This sequence of models can be snapshots of the model during the training process. The main result is a theory showing that warm-started non-convex optimization in expansive Gaussian networks yields successful recovery.
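The warm-start schedule summarized in this review can be sketched on a toy problem. As a loud simplification, the generator below is linear, G_t(z) = W_t z, so each stage is convex; the paper concerns deep non-linear networks, and this sketch only illustrates the "optimize, then reuse the optimum on the next model" loop. The function `surf`, the interpolated weight sequence, and all sizes are assumptions made for illustration.

```python
import numpy as np

def surf(weights_sequence, y, z0, steps=2000, lr=0.1):
    """For each W_t in turn, run gradient descent on 0.5 * ||W_t z - y||^2,
    starting from the latent code found for the previous W_t."""
    z = np.array(z0, dtype=float)
    for W in weights_sequence:
        for _ in range(steps):
            z -= lr * W.T @ (W @ z - y)  # gradient of the squared error
    return z

rng = np.random.default_rng(0)
n, k = 20, 5  # expansive: output dimension larger than latent dimension

# Toy stand-in for training snapshots: interpolate from random initial
# weights to "trained" weights (both are synthetic here).
W_init = rng.standard_normal((n, k)) / np.sqrt(n)
W_final = W_init + 0.5 * rng.standard_normal((n, k)) / np.sqrt(n)
sequence = [(1 - a) * W_init + a * W_final for a in np.linspace(0, 1, 6)]

# Observations come from the final generator and an unknown latent code.
z_true = rng.standard_normal(k)
y = W_final @ z_true

z_hat = surf(sequence, y, np.zeros(k))
residual = np.linalg.norm(W_final @ z_hat - y)
print(f"relative residual: {residual / np.linalg.norm(y):.2e}")
```

In the non-convex setting the review describes, the point of the schedule is that each warm start is meant to lie close enough to the next model's optimum for gradient descent to succeed; the linear toy cannot demonstrate that property, only the bookkeeping.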