AITopics | recording session

Collaborating Authors

recording session

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

AScalable, Causal, and Energy Efficient Framework for Neural Decoding with Spiking Neural Networks

Neural Information Processing SystemsJun-16-2026, 23:32:27 GMT

artificial intelligence, machine learning, survey article, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.92)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education (1.00)
Energy (0.94)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

SPINT: Spatial Permutation-Invariant Neural Transformer for Consistent Intracortical Motor Decoding

Neural Information Processing SystemsJun-11-2026, 06:18:39 GMT

Intracortical Brain-Computer Interfaces (iBCI) decode behavior from neural population activity to restore motor functions and communication abilities in individuals with motor impairments. A central challenge for long-term iBCI deployment is the nonstationarity of neural recordings, where the composition and tuning profiles of the recorded populations are unstable across recording sessions. Existing approaches attempt to address this issue by explicit alignment techniques; however, they rely on fixed neural identities and require test-time labels or parameter updates, limiting their generalization across sessions and imposing additional computational burden during deployment. In this work, we address the problem of cross-session nonstationarity in long-term iBCI systems and introduce SPINT - a Spatial Permutation-Invariant Neural Transformer framework for behavioral decoding that operates directly on unordered sets of neural units. Central to our approach is a novel context-dependent positional embedding scheme that dynamically infers unit-specific identities, enabling flexible generalization across recording sessions. SPINT supports inference on variable-size populations and allows few-shot, gradient-free adaptation using a small amount of unlabeled data from the test session. We evaluate SPINT on three multi-session datasets from the FALCON Benchmark, covering continuous motor decoding tasks in human and non-human primates. SPINT demonstrates robust cross-session generalization, outperforming existing zero-shot and few-shot unsupervised baselines while eliminating the need for test-time alignment and fine-tuning. Our work contributes an initial step toward a robust and scalable neural decoding framework for long-term iBCI applications.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.75)

Add feedback

A Cocktail-Party Benchmark: Multi-Modal dataset and Comparative Evaluation Results

Nguyen, Thai-Binh, Zmolikova, Katerina, Ma, Pingchuan, Pham, Ngoc Quan, Fuegen, Christian, Waibel, Alexander

arXiv.org Artificial IntelligenceOct-28-2025

We introduce the task of Multi-Modal Context-Aware Recognition (MCoRec) in the ninth CHiME Challenge, which addresses the cocktail-party problem of overlapping conversations in a single-room setting using audio, visual, and contextual cues. MCoRec captures natural multi-party conversations where the recordings focus on unscripted, casual group chats, leading to extreme speech overlap of up to 100% and highly fragmented conversational turns. The task requires systems to answer the question "Who speaks when, what, and with whom?" by jointly transcribing each speaker's speech and clustering them into their respective conversations from audio-visual recordings. Audio-only baselines exceed 100% word error rate, whereas incorporating visual cues yields substantial 50% improvements, highlighting the importance of multi-modality. In this manuscript, we present the motivation behind the task, outline the data collection process, and report the baseline systems developed for the MCoRec.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.23276

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.50)

Add feedback

A Scalable, Causal, and Energy Efficient Framework for Neural Decoding with Spiking Neural Networks

Mentzelopoulos, Georgios, Asmanis, Ioannis, Kording, Konrad P., Dyer, Eva L., Daniilidis, Kostas, Vitale, Flavia

arXiv.org Artificial IntelligenceOct-24-2025

Brain-computer interfaces (BCIs) promise to enable vital functions, such as speech and prosthetic control, for individuals with neuromotor impairments. Central to their success are neural decoders, models that map neural activity to intended behavior. Current learning-based decoding approaches fall into two classes: simple, causal models that lack generalization, or complex, non-causal models that generalize and scale offline but struggle in real-time settings. Both face a common challenge, their reliance on power-hungry artificial neural network backbones, which makes integration into real-world, resource-limited systems difficult. Spiking neural networks (SNNs) offer a promising alternative. Because they operate causally these models are suitable for real-time use, and their low energy demands make them ideal for battery-constrained environments. To this end, we introduce Spikachu: a scalable, causal, and energy-efficient neural decoding framework based on SNNs. Our approach processes binned spikes directly by projecting them into a shared latent space, where spiking modules, adapted to the timing of the input, extract relevant features; these latent representations are then integrated and decoded to generate behavioral predictions. We evaluate our approach on 113 recording sessions from 6 non-human primates, totaling 43 hours of recordings. Our method outperforms causal baselines when trained on single sessions using between 2.26 and 418.81 times less energy. Furthermore, we demonstrate that scaling up training to multiple sessions and subjects improves performance and enables few-shot transfer to unseen sessions, subjects, and tasks. Overall, Spikachu introduces a scalable, online-compatible neural decoding framework based on SNNs, whose performance is competitive relative to state-of-the-art models while consuming orders of magnitude less energy.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.20683

Country: North America > United States (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Energy (1.00)
Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Modality-Order Matters! A Novel Hierarchical Feature Fusion Method for CoSAm: A Code-Switched Autism Corpus

Akhtar, Mohd Mujtaba, Girish, null, Singh, Muskaan, Phukan, Orchid Chetia

arXiv.org Artificial IntelligenceJul-23-2024

Autism Spectrum Disorder (ASD) is a complex neuro-developmental challenge, presenting a spectrum of difficulties in social interaction, communication, and the expression of repetitive behaviors in different situations. This increasing prevalence underscores the importance of ASD as a major public health concern and the need for comprehensive research initiatives to advance our understanding of the disorder and its early detection methods. This study introduces a novel hierarchical feature fusion method aimed at enhancing the early detection of ASD in children through the analysis of code-switched speech (English and Hindi). Employing advanced audio processing techniques, the research integrates acoustic, paralinguistic, and linguistic information using Transformer Encoders. This innovative fusion strategy is designed to improve classification robustness and accuracy, crucial for early and precise ASD identification. The methodology involves collecting a code-switched speech corpus, CoSAm, from children diagnosed with ASD and a matched control group. The dataset comprises 61 voice recordings from 30 children diagnosed with ASD and 31 from neurotypical children, aged between 3 and 13 years, resulting in a total of 159.75 minutes of voice recordings. The feature analysis focuses on MFCCs and extensive statistical attributes to capture speech pattern variability and complexity. The best model performance is achieved using a hierarchical fusion technique with an accuracy of 98.75% using a combination of acoustic and linguistic features first, followed by paralinguistic features in a hierarchical manner.

accuracy, asd, hierarchical feature fusion, (12 more...)

arXiv.org Artificial Intelligence

2407.14328

Country:

North America > United States (0.14)
Europe > United Kingdom (0.14)
Asia > India > Uttarakhand > Dehradun (0.04)
Asia > India > NCT > Delhi (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
(2 more...)

Add feedback

AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection

Gong, Rong, Xue, Hongfei, Wang, Lezhi, Xu, Xin, Li, Qisheng, Xie, Lei, Bu, Hui, Wu, Shaomei, Zhou, Jiaming, Qin, Yong, Zhang, Binbin, Du, Jun, Bin, Jia, Li, Ming

arXiv.org Artificial IntelligenceJun-11-2024

The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical speech, such as stuttering. This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset, which stands out as the largest dataset in its category. Encompassing conversational and voice command reading speech, AS-70 includes verbatim manual transcription, rendering it suitable for various speech-related tasks. Furthermore, baseline systems are established, and experimental results are presented for ASR and stuttering event detection (SED) tasks. By incorporating this dataset into the model fine-tuning, significant improvements in the state-of-the-art ASR models, e.g., Whisper and Hubert, are observed, enhancing their inclusivity in addressing stuttered speech.

as-70 dataset, dataset, speech, (13 more...)

arXiv.org Artificial Intelligence

2406.07256

Country:

Asia > China (0.04)
Europe (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.93)
Information Technology (0.69)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)

Add feedback

NeuSpeech: Decode Neural signal as Speech

Yang, Yiqian, Duan, Yiqun, Zhang, Qiang, Jo, Hyejeong, Zhou, Jinni, Lee, Won Hee, Xu, Renjing, Xiong, Hui

arXiv.org Artificial IntelligenceJun-3-2024

Decoding language from brain dynamics is an important open direction in the realm of brain-computer interface (BCI), especially considering the rapid growth of large language models. Compared to invasive-based signals which require electrode implantation surgery, non-invasive neural signals (e.g. EEG, MEG) have attracted increasing attention considering their safety and generality. However, the exploration is not adequate in three aspects: 1) previous methods mainly focus on EEG but none of the previous works address this problem on MEG with better signal quality; 2) prior works have predominantly used $``teacher-forcing"$ during generative decoding, which is impractical; 3) prior works are mostly $``BART-based"$ not fully auto-regressive, which performs better in other sequence tasks. In this paper, we explore the brain-to-text translation of MEG signals in a speech-decoding formation. Here we are the first to investigate a cross-attention-based ``whisper" model for generating text directly from MEG signals without teacher forcing. Our model achieves impressive BLEU-1 scores of 60.30 and 52.89 without pretraining $\&$ teacher-forcing on two major datasets ($\textit{GWilliams}$ and $\textit{Schoffelen}$). This paper conducts a comprehensive review to understand how speech decoding formation performs on the neural decoding tasks, including pretraining initialization, training $\&$ evaluation set splitting, augmentation, and scaling law. Code is available at https://github.com/NeuSpeech/NeuSpeech1$.

dataset, neuspeech, predicted, (16 more...)

arXiv.org Artificial Intelligence

2403.01748

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York (0.04)
(6 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(3 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)

Add feedback

Deep Learning for real-time neural decoding of grasp

Viviani, Paolo, Gesmundo, Ilaria, Ghinato, Elios, Agudelo-Toro, Andres, Vercellino, Chiara, Vitali, Giacomo, Bergamasco, Letizia, Scionti, Alberto, Ghislieri, Marco, Agostini, Valentina, Terzo, Olivier, Scherberger, Hansjörg

arXiv.org Artificial IntelligenceNov-2-2023

Neural decoding involves correlating signals acquired from the brain to variables in the physical world like limb movement or robot control in Brain Machine Interfaces. In this context, this work starts from a specific pre-existing dataset of neural recordings from monkey motor cortex and presents a Deep Learning-based approach to the decoding of neural signals for grasp type classification. Specifically, we propose here an approach that exploits LSTM networks to classify time series containing neural data (i.e., spike trains) into classes representing the object being grasped. The main goal of the presented approach is to improve over state-of-the-art decoding accuracy without relying on any prior neuroscience knowledge, and leveraging only the capability of deep learning models to extract correlations from data. The paper presents the results achieved for the considered dataset and compares them with previous works on the same dataset, showing a significant improvement in classification accuracy, even if considering simulated real-time decoding.

accuracy, dataset, recording session, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-43427-3_23

2311.01061

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Italy > Piedmont > Turin Province > Turin (0.06)
Europe > Germany > Lower Saxony > Gottingen (0.04)

Genre:

Research Report (0.82)
Overview (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

A Model of the Neural Basis of the Rat's Sense of Direction

Neural Information Processing SystemsApr-6-2023, 18:32:47 GMT

In the last decade the outlines of the neural structures subserving the sense of direction have begun to emerge. Several investigations have shed light on the effects of vestibular input and visual input on the head direction representation. In this paper, a model is formulated of the neural mechanisms underlying the head direction system. The model is built out of simple ingredients, depending on nothing more complicated than connectional specificity, attractor dynamics, Hebbian learning, and sigmoidal nonlinearities, but it behaves in a sophisticated way and is consistent with most of the observed properties ofreal head direction cells. In addition it makes a number of predictions that ought to be testable by reasonably straightforward experiments.

artificial intelligence, head direction cell, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.35)

Add feedback

Adaptive SpikeDeep-Classifier: Self-organizing and self-supervised machine learning algorithm for online spike sorting

Saif-ur-Rehman, Muhammad, Ali, Omair, Klaes, Christian, Iossifidis, Ioannis

arXiv.org Artificial IntelligenceMar-30-2023

Objective. Research on brain-computer interfaces (BCIs) is advancing towards rehabilitating severely disabled patients in the real world. Two key factors for successful decoding of user intentions are the size of implanted microelectrode arrays and a good online spike sorting algorithm. A small but dense microelectrode array with 3072 channels was recently developed for decoding user intentions. The process of spike sorting determines the spike activity (SA) of different sources (neurons) from recorded neural data. Unfortunately, current spike sorting algorithms are unable to handle the massively increasing amount of data from dense microelectrode arrays, making spike sorting a fragile component of the online BCI decoding framework. Approach. We proposed an adaptive and self-organized algorithm for online spike sorting, named Adaptive SpikeDeep-Classifier (Ada-SpikeDeepClassifier), which uses SpikeDeeptector for channel selection, an adaptive background activity rejector (Ada-BAR) for discarding background events, and an adaptive spike classifier (Ada-Spike classifier) for classifying the SA of different neural units. Results. Our algorithm outperformed our previously published SpikeDeep-Classifier and eight other spike sorting algorithms, as evaluated on a human dataset and a publicly available simulated dataset. Significance. The proposed algorithm is the first spike sorting algorithm that automatically learns the abrupt changes in the distribution of noise and SA. It is an artificial neural network-based algorithm that is well-suited for hardware implementation on neuromorphic chips that can be used for wearable invasive BCIs.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2304.01355

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
Europe > Germany (0.04)
(2 more...)

Genre: Research Report (0.65)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback