Collaborating Authors: Flinker, Adeen


AAD-LLM: Neural Attention-Driven Auditory Scene Understanding

arXiv.org Artificial Intelligence

Human auditory perception is inherently selective: listeners focus on specific speakers while ignoring others in complex auditory scenes. Existing models do not incorporate this selectivity, limiting their ability to generate perception-aligned responses. To address this, we introduce Intention-Informed Auditory Scene Understanding (II-ASU) and present Auditory Attention-Driven LLM (AAD-LLM), a prototype system that integrates brain signals to infer listener attention. AAD-LLM extends an auditory LLM by incorporating intracranial electroencephalography (iEEG) recordings to decode which speaker a listener is attending to and refine responses accordingly. The model first predicts the attended speaker from neural activity, then conditions response generation on this inferred attentional state. We evaluate AAD-LLM on speaker description, speech transcription and extraction, and question answering in multitalker scenarios, with both objective and subjective ratings showing improved alignment with listener intention, taking a first step toward intention-aware auditory AI.

Figure 1: AAD-LLM is a brain-computer interface (BCI) for auditory scene understanding. It decodes neural signals to identify the attended speaker and integrates this information into a language model, generating responses that align with the listener's perceptual focus.
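The core loop described here (decode the attended speaker from neural activity, then condition generation on that state) can be sketched with a classic envelope-reconstruction approach to auditory attention decoding. This is a minimal stand-in, not the paper's pipeline: the backward Ridge model, all array shapes, and the prompt-injection step below are illustrative assumptions.

```python
# Minimal sketch of envelope-based auditory attention decoding (AAD),
# a classic stand-in for the neural attention decoder in AAD-LLM.
# Shapes, names, and the Ridge backward model are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge

def decode_attended_speaker(ieeg, envelopes, model):
    """ieeg: (T, channels) neural features; envelopes: list of (T,) speech
    envelopes, one per speaker; model: fitted backward (neural -> audio) model."""
    recon = model.predict(ieeg)                      # reconstructed envelope, (T,)
    corrs = [np.corrcoef(recon, env)[0, 1] for env in envelopes]
    return int(np.argmax(corrs))                     # index of the attended speaker

# Training on a labeled session: map iEEG to the attended speaker's envelope.
rng = np.random.default_rng(0)
T, C = 5000, 64
ieeg_train = rng.standard_normal((T, C))             # placeholder neural data
attended_env = rng.standard_normal(T)                # placeholder attended envelope
backward_model = Ridge(alpha=1.0).fit(ieeg_train, attended_env)

# At inference, the decoded speaker index conditions the downstream LLM,
# e.g. by prepending "The listener is attending to speaker {k}." to the prompt.
speaker = decode_attended_speaker(rng.standard_normal((T, C)),
                                  [rng.standard_normal(T) for _ in range(2)],
                                  backward_model)
print(f"Attended speaker: {speaker}")
```

The correlate-and-argmax step is the standard AAD decision rule; how the attentional state is actually fused into the language model (prompt text, soft tokens, or cross-attention) is a design choice this sketch leaves open.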


The Temporal Structure of Language Processing in the Human Brain Corresponds to The Layered Hierarchy of Deep Language Models

arXiv.org Artificial Intelligence

Deep Language Models (DLMs) provide a novel computational paradigm for understanding the mechanisms of natural language processing in the human brain. Unlike traditional psycholinguistic models, DLMs use layered sequences of continuous numerical vectors to represent words and context, allowing a plethora of emerging applications such as human-like text generation. In this paper we show evidence that the layered hierarchy of DLMs may be used to model the temporal dynamics of language comprehension in the brain by demonstrating a strong correlation between DLM layer depth and the time at which layers are most predictive of the human brain. Our ability to temporally resolve individual layers benefits from our use of electrocorticography (ECoG) data, which has a much higher temporal resolution than noninvasive methods like fMRI. Using ECoG, we record neural activity from participants listening to a 30-minute narrative while also feeding the same narrative to a high-performing DLM (GPT2-XL). We then extract contextual embeddings from the different layers of the DLM and use linear encoding models to predict neural activity. We first focus on the Inferior Frontal Gyrus (IFG, or Broca's area) and then extend our model to track the increasing temporal receptive window along the linguistic processing hierarchy from auditory to syntactic and semantic areas. Our results reveal a connection between human language processing and DLMs, with the DLM's layer-by-layer accumulation of contextual information mirroring the timing of neural activity in high-order language areas.
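As a rough illustration of the analysis pipeline, layer-wise contextual embeddings can be pulled from a pretrained model and regressed against electrode activity at multiple lags. The sketch below is an assumption-laden approximation: it uses the Hugging Face transformers API, a small gpt2 checkpoint in place of GPT2-XL, synthetic neural data, and a generic cross-validated ridge encoding model rather than the paper's exact procedure.

```python
# Minimal sketch of layer-wise linear encoding analysis: for each DLM layer,
# fit a linear model from that layer's embeddings to neural activity at
# several lags, and find the lag where the layer is most predictive.
import numpy as np
import torch
from transformers import GPT2Tokenizer, GPT2Model
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

tok = GPT2Tokenizer.from_pretrained("gpt2")          # stand-in for gpt2-xl
model = GPT2Model.from_pretrained("gpt2", output_hidden_states=True).eval()

text = "the thirty minute narrative presented to the participants goes here"
ids = tok(text, return_tensors="pt")
with torch.no_grad():
    hidden = model(**ids).hidden_states              # (n_layers + 1) x (1, n_tokens, d)

# Synthetic stand-in for one electrode's high-gamma power per token, at 5 lags.
n_tokens = hidden[0].shape[1]
rng = np.random.default_rng(0)
neural = rng.standard_normal((n_tokens, 5))          # (tokens, lags)

# For each layer, score a cross-validated ridge encoding model at each lag;
# per the paper, deeper layers should peak at later lags in high-order areas.
for layer, h in enumerate(hidden):
    X = h[0].numpy()
    rs = []
    for lag in range(neural.shape[1]):
        pred = cross_val_predict(Ridge(alpha=10.0), X, neural[:, lag], cv=5)
        rs.append(np.corrcoef(pred, neural[:, lag])[0, 1])
    print(f"layer {layer}: peak encoding at lag index {int(np.argmax(rs))}")
```

With real ECoG data, the per-layer peak lags plotted against layer depth would give the correlation the abstract describes.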


Reconstructing Speech Stimuli From Human Auditory Cortex Activity Using a WaveNet Approach

arXiv.org Machine Learning

The superior temporal gyrus (STG) region of cortex critically contributes to speech recognition. In this work, we show that a proposed deep network inspired by WaveNet, trained with limited available data, is able to reconstruct speech stimuli from STG intracranial recordings. We further investigate the impulse response of the fitted model for each recording electrode and observe phoneme-level temporospectral tuning properties in some recorded areas. This finding is consistent with previous studies implicating the posterior STG (pSTG) in a phonetic representation of speech, and it details the acoustic features that certain electrode sites may extract during speech recognition. Research on the STG has shown that this area plays an important role in word and sentence recognition at a phonetic and prelexical stage [1]-[9].
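A WaveNet-inspired decoder here means stacked dilated causal convolutions with gated activations, mapping multichannel neural recordings to an audio target. The sketch below is an approximation under stated assumptions, with illustrative layer sizes, a simple regression head, and a toy impulse-response probe; it is not the paper's exact network, and for a nonlinear model the "impulse response" is only a local characterization.

```python
# Minimal sketch of a WaveNet-style neural-to-speech decoder plus the
# per-electrode impulse-response probe described in the abstract.
# All sizes and the regression output head are illustrative assumptions.
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    def __init__(self, ch, dilation):
        super().__init__()
        self.pad = (2 - 1) * dilation                # left-pad to keep the conv causal
        self.conv = nn.Conv1d(ch, 2 * ch, kernel_size=2, dilation=dilation)
        self.res = nn.Conv1d(ch, ch, kernel_size=1)

    def forward(self, x):
        h = self.conv(nn.functional.pad(x, (self.pad, 0)))
        filt, gate = h.chunk(2, dim=1)               # gated activation, as in WaveNet
        h = torch.tanh(filt) * torch.sigmoid(gate)
        return x + self.res(h)                       # residual connection

class NeuralToSpeech(nn.Module):
    def __init__(self, n_electrodes=64, ch=32, n_blocks=6):
        super().__init__()
        self.inp = nn.Conv1d(n_electrodes, ch, kernel_size=1)
        self.blocks = nn.Sequential(*[DilatedBlock(ch, 2 ** i) for i in range(n_blocks)])
        self.out = nn.Conv1d(ch, 1, kernel_size=1)   # regress a speech waveform/envelope

    def forward(self, x):                            # x: (batch, electrodes, time)
        return self.out(self.blocks(self.inp(x)))

# Impulse-response probe: drive one electrode with a unit impulse and read out
# the model's response, revealing that electrode's temporospectral tuning.
net = NeuralToSpeech().eval()
probe = torch.zeros(1, 64, 512)
probe[0, 10, 0] = 1.0                                # unit impulse on electrode 10
with torch.no_grad():
    ir = net(probe)[0, 0]                            # (time,) response to the impulse
```

Spectrally analyzing `ir` per electrode is one way to surface the phoneme-level tuning properties the abstract reports.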