The surprising benefits of video games
There are plenty of negative stereotypes about games and gamers. And it's true that focusing on gaming to the detriment of all else has negative effects; there's a reason the World Health Organization recognizes video game addiction as a mental health condition. But in the 50 years since Atari unleashed Pong on the world, there has been plenty of research on the effects of video games on our brains, and it's not all bad. Here are a few of the potential benefits of gaming, according to research. A research review published in American Psychologist in 2013 by Isabela Granic, Adam Lobel, and Rutger C. M. E. Engels at Radboud University in Nijmegen, the Netherlands, looked at decades of research and highlighted the various benefits found in gaming.
Reading Miscue Detection in Primary School through Automatic Speech Recognition
Gao, Lingyun, Tejedor-Garcia, Cristian, Strik, Helmer, Cucchiarini, Catia
Automatic reading diagnosis systems can benefit teachers, by making the scoring of reading exercises more efficient, and students, by making reading exercises with feedback more accessible. However, there are few studies on Automatic Speech Recognition (ASR) for child speech in languages other than English, and little research on ASR-based reading diagnosis systems. This study investigates how well state-of-the-art (SOTA) pretrained ASR models recognize Dutch native children's speech and detect reading miscues. We found that HuBERT Large fine-tuned on Dutch speech achieves SOTA phoneme-level child speech recognition (PER of 23.1%), while Whisper (Faster Whisper Large-v2) achieves SOTA word-level performance (WER of 9.8%). Our findings suggest that Wav2Vec2 Large and Whisper are the two best ASR models for reading miscue detection. Specifically, Wav2Vec2 Large shows the highest recall at 0.83, whereas Whisper exhibits the highest precision at 0.52 and an F1 score of 0.52.
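The miscue-detection step in such a system can be pictured as aligning the ASR transcript against the reference text the child was asked to read. The sketch below illustrates that general idea only (word-level Levenshtein alignment, flagging substitutions and deletions as miscues); the function names, the word-level granularity, and the Dutch example sentences are invented for illustration, not taken from the authors' pipeline.

```python
def align(ref, hyp):
    """Word-level edit-distance alignment; returns a list of (op, ref_w, hyp_w)."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion (word skipped)
                           dp[i][j - 1] + 1,        # insertion (extra word)
                           dp[i - 1][j - 1] + cost) # match / substitution
    # Backtrack to recover the alignment operations.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            ops.append(("match" if ref[i - 1] == hyp[j - 1] else "sub",
                        ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    return list(reversed(ops))

def miscues(ref_text, asr_text):
    """Flag substituted and skipped reference words as reading miscues."""
    ops = align(ref_text.lower().split(), asr_text.lower().split())
    return [(op, r, h) for op, r, h in ops if op in ("sub", "del")]

# Example: the reader skips the word "grote".
print(miscues("de grote boom staat daar", "de boom staat daar"))
# → [('del', 'grote', None)]
```

In a real system the hypothesis side would be the phoneme- or word-level ASR output, and insertion handling and confidence thresholds would matter; this sketch only shows the alignment core.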
Composite Quantile Regression With XGBoost Using the Novel Arctan Pinball Loss
Sluijterman, Laurens, Kreuwel, Frank, Cator, Eric, Heskes, Tom
This paper explores the use of XGBoost for composite quantile regression. XGBoost is a highly popular model renowned for its flexibility, efficiency, and capability to deal with missing data. Its optimization procedure, however, uses a second-order approximation of the loss function, which complicates the use of loss functions with a zero or vanishing second derivative. Quantile regression, a popular approach to obtain conditional quantiles when point estimates alone are insufficient, unfortunately uses such a loss function: the pinball loss. Existing workarounds are typically inefficient and can result in severe quantile crossings. In this paper, we present a smooth approximation of the pinball loss, the arctan pinball loss, that is tailored to the needs of XGBoost. Specifically, contrary to other smooth approximations, the arctan pinball loss has a relatively large second derivative, which makes it more suitable for use in the second-order approximation. Using this loss function enables the simultaneous prediction of multiple quantiles, which is more efficient and results in far fewer quantile crossings.
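The abstract does not spell out the loss itself; the sketch below shows one standard way to smooth the pinball loss with an arctan, replacing the indicator in rho_tau(u) = u * (tau - 1[u < 0]) by the smooth surrogate 1/2 - arctan(u/s)/pi. The resulting second derivative, 2*s**3 / (pi * (s**2 + u**2)**2), is strictly positive, which is exactly the property the abstract requires. This exact form and the smoothing parameter s = 0.1 are assumptions for illustration, not necessarily the paper's definition.

```python
import math

def pinball(u, tau):
    """Classic pinball loss for residual u = y - y_hat at quantile level tau."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def arctan_pinball(u, tau, s=0.1):
    """One possible arctan-smoothed pinball loss; s controls the smoothing."""
    return u * (tau - 0.5 + math.atan(u / s) / math.pi)

def arctan_pinball_grad_hess(u, tau, s=0.1):
    """Analytic first and second derivatives, as an XGBoost custom
    objective would need to supply them per sample."""
    grad = (tau - 0.5) + math.atan(u / s) / math.pi \
        + u * s / (math.pi * (s**2 + u**2))
    hess = 2 * s**3 / (math.pi * (s**2 + u**2) ** 2)  # strictly positive
    return grad, hess

# Away from zero the smooth loss tracks the pinball loss up to a small
# constant offset of roughly s/pi:
for u in (-2.0, -0.5, 0.5, 2.0):
    print(u, pinball(u, 0.9), round(arctan_pinball(u, 0.9), 4))
```

In XGBoost terms, `arctan_pinball_grad_hess` is what a custom objective callback would return, with one (grad, hess) pair per quantile when predicting several quantiles at once.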
NeuSpeech: Decode Neural signal as Speech
Yang, Yiqian, Duan, Yiqun, Zhang, Qiang, Jo, Hyejeong, Zhou, Jinni, Lee, Won Hee, Xu, Renjing, Xiong, Hui
Decoding language from brain dynamics is an important open direction in the realm of brain-computer interfaces (BCI), especially considering the rapid growth of large language models. Compared to invasive signals, which require electrode implantation surgery, non-invasive neural signals (e.g. EEG, MEG) have attracted increasing attention owing to their safety and generality. However, exploration in this area is inadequate in three respects: 1) previous methods mainly focus on EEG, and none of the previous works address this problem on MEG, which has better signal quality; 2) prior works have predominantly used "teacher forcing" during generative decoding, which is impractical; 3) prior works are mostly BART-based rather than fully auto-regressive, even though fully auto-regressive models perform better on other sequence tasks. In this paper, we explore the brain-to-text translation of MEG signals in a speech-decoding formulation. We are the first to investigate a cross-attention-based "whisper" model for generating text directly from MEG signals without teacher forcing. Our model achieves impressive BLEU-1 scores of 60.30 and 52.89 without pretraining and teacher forcing on two major datasets (GWilliams and Schoffelen). This paper also conducts a comprehensive review of how the speech-decoding formulation performs on neural decoding tasks, including pretraining initialization, training and evaluation set splitting, augmentation, and scaling laws. Code is available at https://github.com/NeuSpeech/NeuSpeech1.
Acquiring Better Load Estimates by Combining Anomaly and Change-point Detection in Power Grid Time-series Measurements
Bouman, Roel, Schmeitz, Linda, Buise, Luco, Heres, Jacco, Shapovalova, Yuliya, Heskes, Tom
In this paper we present a novel methodology for automatic anomaly and switch-event filtering to improve load estimation in power grid systems. By leveraging unsupervised methods with supervised optimization, our approach prioritizes interpretability while ensuring robust and generalizable performance on unseen data. In our experiments, a combination of binary segmentation for change-point detection and statistical process control for anomaly detection emerges as the most effective strategy, specifically when ensembled in a novel sequential manner. The results make clear how much potential is wasted when filtering is not applied. The automatic load estimation is also fairly accurate: approximately 90% of estimates fall within a 10% error margin, with only a single significant failure in both the minimum and maximum load estimates across the 60 measurements in the test set. Our methodology's interpretability makes it particularly suitable for critical infrastructure planning, thereby enhancing decision-making processes.
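The sequential ensemble described above (change-point detection first, then anomaly detection within the resulting segments) can be sketched in a few dozen lines. The binary segmentation on segment means, the 3-sigma control limits, and the `min_gain` stopping threshold below are generic textbook choices for illustration, not the authors' implementation.

```python
import statistics

def sse(x):
    """Sum of squared deviations from the segment mean."""
    m = statistics.fmean(x)
    return sum((v - m) ** 2 for v in x)

def best_split(x):
    """Index and cost reduction of the best single mean-shift split."""
    best_i, best_gain = None, 0.0
    total = sse(x)
    for i in range(2, len(x) - 1):
        gain = total - sse(x[:i]) - sse(x[i:])
        if gain > best_gain:
            best_i, best_gain = i, gain
    return best_i, best_gain

def binseg(x, min_gain):
    """Recursive binary segmentation; returns sorted change-point indices."""
    i, gain = best_split(x)
    if i is None or gain < min_gain:
        return []
    return binseg(x[:i], min_gain) + [i] + [i + j for j in binseg(x[i:], min_gain)]

def spc_anomalies(x, k=3.0):
    """Statistical process control: indices outside mean +/- k * sigma."""
    mu, sd = statistics.fmean(x), statistics.pstdev(x)
    return [i for i, v in enumerate(x) if sd > 0 and abs(v - mu) > k * sd]

def filter_series(x, min_gain=10.0):
    """Sequential ensemble: find change points (e.g. switch events) first,
    then flag anomalies within each resulting segment."""
    cps = binseg(x, min_gain)
    bounds = [0] + cps + [len(x)]
    anomalies = []
    for a, b in zip(bounds, bounds[1:]):
        anomalies += [a + i for i in spc_anomalies(x[a:b])]
    return cps, anomalies

# A level shift at index 20 (a switch event) and a spike at index 5:
series = [1.0] * 20
series[5] = 5.0
series += [10.0] * 20
print(filter_series(series))  # → ([20], [5])
```

Running the SPC step per segment is the point of the sequential order: applied globally, the level shift would inflate the standard deviation and mask the spike.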
Synthesizing EEG Signals from Event-Related Potential Paradigms with Conditional Diffusion Models
Klein, Guido, Guetschel, Pierre, Silvestri, Gianluigi, Tangermann, Michael
Data scarcity in the brain-computer interface field can be alleviated through the use of generative models, specifically diffusion models. While diffusion models have previously been successfully applied to electroencephalogram (EEG) data, existing models lack flexibility with respect to sampling or require alternative representations of the EEG data. To overcome these limitations, we introduce a novel approach to conditional diffusion models that utilizes classifier-free guidance to directly generate subject-, session-, and class-specific EEG data. In addition to commonly used metrics, domain-specific metrics are employed to evaluate the specificity of the generated samples. The results indicate that the proposed model can generate EEG data that resembles real data for each subject, session, and class.
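The classifier-free guidance mentioned above boils down to blending the model's conditional and unconditional noise predictions at each denoising step. A toy sketch of that rule, with a made-up stand-in for the trained EEG diffusion model (the linear `toy_eps` and its conditioning shift are purely illustrative):

```python
def toy_eps(x, t, cond=None):
    """Stand-in noise predictor. A real model would be a neural network
    trained on EEG epochs; the conditioning shift here is invented so the
    guidance arithmetic below has something to act on."""
    bias = 0.0 if cond is None else 0.1 * cond
    return 0.5 * x - bias

def cfg_eps(x, t, cond, w):
    """Classifier-free guidance: blend the unconditional and conditional
    predictions as eps_u + w * (eps_c - eps_u)."""
    eps_u = toy_eps(x, t, cond=None)        # condition dropped
    eps_c = toy_eps(x, t, cond=cond)        # condition provided
    return eps_u + w * (eps_c - eps_u)

# w = 0 ignores the condition, w = 1 is plain conditional sampling, and
# w > 1 over-emphasizes the condition (here, a subject/session/class label):
print(cfg_eps(1.0, t=0, cond=2, w=0.0))
print(cfg_eps(1.0, t=0, cond=2, w=2.0))
```

Training such a model only requires randomly dropping the condition for a fraction of training examples, so one network provides both predictions at sampling time.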
Deep Learning TB Detection Shows Potential for Low-Resource Countries
Researchers have found that an artificial intelligence system is at least as good as human radiologists at identifying tuberculosis from chest X-rays, opening up its use for low-resource countries. Indeed, the deep learning program was superior in sensitivity and noninferior in specificity in identifying active pulmonary TB in frontal chest radiographs when compared with nine radiologists from India. The system could have particular value in low-income countries where large-scale screening programs are not always feasible due to cost and radiologist availability. Simulations revealed that using the deep learning system to identify likely TB-positive chest radiographs for confirmation using nucleic acid amplification testing (NAAT) reduced costs by between 40 and 80 percent per positive patient detected. "We hope this can be a tool used by non-expert physicians and healthcare workers to screen people en masse and get them to treatment where required without getting specialist doctors, who are in short supply," said researcher Rory Pilgrim, a product manager at Google Health AI in Mountain View, California. "We believe we can do this with the people on the ground in a low-cost, high-volume way." The research is published in Radiology, a journal of the Radiological Society of North America. The deep-learning system was trained using 165,754 images from 22,284 individuals, nearly all from South Africa, and then tested using data from five countries. The total test set had 1,236 images, of which 212 were identified as positive for TB based on microbiological tests or NAAT. These were binary-scored by 10 radiologists from India and five from the USA, although one of the Indian radiologists was removed due to their much lower specificity than the others.
Among the 1,236 test individuals assessed, the deep learning system achieved superior sensitivity compared with a prespecified analysis involving the nine radiologists from India, at 88% versus 75%, with noninferior specificity at 79% versus 84%. "What's especially promising in this study is that we looked at a range of different datasets that reflected the breadth of TB presentation, different equipment and different clinical workflows," said co-study author Sahar Kazemzadeh, software engineer at Google Health. The AI system achieved thresholds set by the World Health Organization in 2014 as a reasonable requirement for any TB screening test in most of the data sets, noted Bram van Ginneken, a professor of medical image analysis at Radboud University Medical Center in Nijmegen, The Netherlands, in an editorial accompanying the study. Yet, he added: "It is shown that for difficult data sets, such as a mining population, whose radiographs may contain other signs of lung disease, and a subset of subjects who are HIV positive, where TB may occur without typical radiographic abnormalities, both the AI software and the human readers performed much lower."
Promise and problems: AI put patients at risk but that shouldn't prevent us developing it. How do we implement artificial intelligence in clinical settings?
In a classic case of weighing the costs and benefits of science, researchers are grappling with the question of how artificial intelligence in medicine can and should be applied to clinical patient care, despite knowing that there are examples where it puts patients' lives at risk. The question was central to a recent University of Adelaide seminar, part of the Research Tuesdays lecture series, titled "Antidote AI." As artificial intelligence grows in sophistication and usefulness, we have begun to see it appearing more and more in everyday life. From AI traffic control and ecological studies, to machine learning finding the origins of a Martian meteorite and reading Arnhem Land rock art, the possibilities for AI research seem endless. The genuine excitement clinicians and artificial intelligence researchers feel at the prospect of AI assisting in patient care is palpable and honourable. Medicine is, after all, about helping people, and its ethical foundation is "do no harm."
Harbour seals can learn how to change their voices to seem bigger
Consider the squeak of a mouse and the low rumble of a lion's roar. In the animal kingdom, bigger animals usually produce lower-pitched sounds as a result of their larger larynges and longer vocal tracts. But harbour seals seem to break that rule: they can learn how to change their calls. That means they can deliberately move between lower- and higher-pitched sounds and make themselves sound bigger than they really are. "The information that is in their calls is not necessarily honest," says Koen de Reus at the Max Planck Institute for Psycholinguistics in Nijmegen, Netherlands.
Embedded-model flows: Combining the inductive biases of model-free deep learning and explicit probabilistic modeling
Silvestri, Gianluigi, Fertig, Emily, Moore, Dave, Ambrogioni, Luca
Normalizing flows have shown great success as general-purpose density estimators. However, many real world applications require the use of domain-specific knowledge, which normalizing flows cannot readily incorporate. We propose embedded-model flows (EMF), which alternate general-purpose transformations with structured layers that embed domain-specific inductive biases. These layers are automatically constructed by converting user-specified differentiable probabilistic models into equivalent bijective transformations. We also introduce gated structured layers, which allow bypassing the parts of the models that fail to capture the statistics of the data. We demonstrate that EMFs can be used to induce desirable properties such as multimodality, hierarchical coupling and continuity. Furthermore, we show that EMFs enable a high performance form of variational inference where the structure of the prior model is embedded in the variational architecture. In our experiments, we show that this approach outperforms state-of-the-art methods in common structured inference problems.
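The core idea, converting a probabilistic model into a bijective layer and composing it with general-purpose transformations, can be shown in one dimension. The toy sketch below is an illustration of the flavor, not the paper's construction: a Normal(mu, sigma) model induces the affine bijection x = mu + sigma * z, which is chained with a generic invertible nonlinearity while tracking log-determinants, exactly as a flow must.

```python
import math

class AffineFromNormal:
    """Structured layer: the bijection induced by a Normal(mu, sigma) model,
    x = mu + sigma * z, with log |dx/dz| = log(sigma)."""
    def __init__(self, mu, sigma):
        self.mu, self.sigma = mu, sigma
    def forward(self, z):
        return self.mu + self.sigma * z, math.log(self.sigma)
    def inverse(self, x):
        return (x - self.mu) / self.sigma, -math.log(self.sigma)

class PowerLayer:
    """General-purpose layer: y = sign(x) * |x|**p for p > 0 (x != 0),
    a simple invertible nonlinearity standing in for a learned transform."""
    def __init__(self, p):
        self.p = p
    def forward(self, x):
        y = math.copysign(abs(x) ** self.p, x)
        return y, math.log(self.p) + (self.p - 1) * math.log(abs(x))
    def inverse(self, y):
        x = math.copysign(abs(y) ** (1 / self.p), y)
        return x, -(math.log(self.p) + (self.p - 1) * math.log(abs(x)))

def flow_forward(layers, z):
    """Push a base sample through the flow, accumulating the log-det."""
    logdet = 0.0
    for layer in layers:
        z, ld = layer.forward(z)
        logdet += ld
    return z, logdet

def flow_inverse(layers, x):
    """Invert the flow layer by layer, in reverse order."""
    logdet = 0.0
    for layer in reversed(layers):
        x, ld = layer.inverse(x)
        logdet += ld
    return x, logdet

# Alternate a structured (model-derived) layer with a general-purpose one:
flow = [AffineFromNormal(2.0, 0.5), PowerLayer(3.0)]
x, ld = flow_forward(flow, 1.0)
print(x, ld)
```

The gated variant described in the abstract would amount to interpolating each structured layer with the identity via a learned gate, so the flow can bypass model parts that fit the data poorly.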