 icn


Isolating Nonlinear Independent Sources in fMRI with $β$-TCVAE Models

arXiv.org Machine Learning

Learning meaningful latent representations from nonlinear fMRI data remains a fundamental challenge in neuroimaging analysis. Traditional independent component analysis, widely used due to its ability to estimate interpretable functional brain networks, relies on a linear mixing assumption for latent sources, limiting its ability to capture the inherently nonlinear and complex organization of brain dynamics. More recently, deep representation learning methods have emerged as promising alternatives for modeling nonlinear latent structure. However, many of these approaches have been evaluated primarily on simulated datasets or natural image benchmarks, with comparatively limited validation on real-world neuroimaging data such as fMRI. In this work, we are motivated by the $β$-TCVAE (Total Correlation Variational Autoencoder), a refinement of the $β$-VAE framework for learning latent representations without introducing additional hyperparameters during training. We adapt and modify this model to fMRI data for nonlinear source disentanglement, aiming to separate mixed spatial and temporal brain signals into interpretable components. We show that the $β$-TCVAE framework can recover meaningful nonlinear spatial components with biological relevance, including well-established intrinsic connectivity networks such as the default mode network. Furthermore, we evaluate the learned representations using functional network connectivity, showing that the latent structure captures coherent and interpretable brain organization patterns. This study provides a pilot investigation that bridges nonlinear representation learning and fMRI analysis.
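For readers unfamiliar with the model, the objective usually associated with $β$-TCVAE can be written as below. This is a sketch of the standard formulation from the general literature, not the adapted fMRI model of this abstract, whose modifications are not described here. The notation is an assumption: $x$ is an input sample, $z$ the latent vector with dimensions $z_j$, $q(z \mid x)$ the encoder, $p(x \mid z)$ the decoder, and $q(z)$ the aggregate posterior.

```latex
% KL term of the ELBO, averaged over the data, split into three parts:
% mutual information, total correlation, and dimension-wise KL.
\mathbb{E}_{p(x)}\!\left[ \mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big) \right]
  = I_q(x; z)
  + \mathrm{KL}\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big)
  + \sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)

% beta-TCVAE objective: only the total-correlation term is reweighted by beta.
\mathcal{L}_{\beta\text{-TC}}
  = \mathbb{E}_{p(x)}\,\mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z)\big]
  - I_q(x; z)
  - \beta\,\mathrm{KL}\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big)
  - \sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)
```

Setting $β = 1$ recovers the ordinary VAE objective. Larger $β$ penalizes statistical dependence between latent dimensions, which is the property that makes the model a candidate for source separation.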



Introspective Classification with Convolutional Nets

Neural Information Processing Systems

We propose introspective convolutional networks (ICN) that emphasize the importance of having convolutional neural networks empowered with generative capabilities. We employ a reclassification-by-synthesis algorithm to perform training using a formulation derived from Bayesian theory. Our ICN iteratively (1) synthesizes pseudo-negative samples and (2) enhances itself by improving the classification. The single learned CNN classifier is at the same time generative, able to directly synthesize new samples within its own discriminative model. We conduct experiments on benchmark datasets including MNIST, CIFAR-10, and SVHN using state-of-the-art CNN architectures, and observe improved classification results.


Correlation and Navigation in the Vocabulary Key Representation Space of Language Models

arXiv.org Artificial Intelligence

Language model (LM) decoding is based on the next-token prediction (NTP) probability distribution. For neural LMs (e.g., Transformer-based), the NTP distribution is essentially a softmax-regularized dot product between an encoded input context (query) and fixed vocabulary representations (keys). In this paper, we study the effect of the key distribution on the NTP distribution, with a focus on whether the similarity between keys will trigger spurious correlations in NTP. Through knowledge-probing tasks, we show that in the NTP distribution, the few top-ranked tokens are typically accurate. However, the middle-ranked predictions are highly biased towards the tokens that are distributionally (not necessarily semantically) similar to these top ones. For instance, if "P" is predicted as the top-1 token, "A"-"Z" will all be ranked high in NTP, regardless of whether they can lead to correct decoding results. This hurts the sampling diversity and makes the sampling of correct, long-tail results hopeless and noisy. We attempt to alleviate this issue via a novel in-context method that iteratively pushes the query representation away from explored regions. Specifically, we include the explored decoding results in the context and prompt the LM to generate something else, which encourages the LM to produce a query representation that has small dot products with explored keys. Experiments on knowledge-probing tasks show that our method leads to efficient navigation away from explored keys to correct new keys. We further extend our method to open-ended and chain-of-thought (for reasoning) generation. Experimental results show that this in-context navigation (ICN) method contributes to better generation diversity and improved self-consistency voting performance. Finally, we discuss potential training issues caused by the fixed key space, together with the challenges and possible ways to address them in future research.
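The query-key view of next-token prediction described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the five-token vocabulary, the 2-D key vectors, the query, and the function names `softmax` and `ntp_distribution` are made up for illustration and are not taken from the paper. It shows only that keys lying close to the top-1 key inherit high rank.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def ntp_distribution(query, keys):
    """Softmax over dot products between one query and every vocabulary key.

    `keys` maps each token to its (fixed) key vector.
    """
    tokens = list(keys)
    logits = [sum(q * k for q, k in zip(query, keys[t])) for t in tokens]
    return dict(zip(tokens, softmax(logits)))

# Hypothetical key space: "P", "Q", "R" are close together, "cat"/"dog" are not.
keys = {
    "P": [1.0, 0.1],
    "Q": [0.9, 0.2],
    "R": [0.8, 0.0],
    "cat": [-0.2, 1.0],
    "dog": [-0.1, 0.9],
}
query = [3.0, 0.5]  # hypothetical encoded context that points towards "P"

probs = ntp_distribution(query, keys)
ranking = sorted(probs, key=probs.get, reverse=True)
print(ranking)
```

In this toy setup the tokens whose keys neighbor "P" occupy the ranks directly below it, independent of whether they would be correct continuations.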


Multiscale Neuroimaging Features for the Identification of Medication Class and Non-Responders in Mood Disorder Treatment

arXiv.org Artificial Intelligence

In the clinical treatment of mood disorders, the complex behavioral symptoms presented by patients and the variability of patient response to particular medication classes can create difficulties in providing fast and reliable treatment when standard diagnostic and prescription methods are used. Increasingly, the incorporation of physiological information such as neuroimaging scans and derivatives into the clinical process promises to alleviate some of the uncertainty surrounding this process. In particular, if neural features can help to identify patients who may not respond to standard courses of anti-depressants or mood stabilizers, clinicians may elect to avoid lengthy and side-effect-laden treatments and seek out a different, more effective course that might otherwise not have been under consideration. Previous approaches for the derivation of relevant neuroimaging features work at only one scale in the data, potentially limiting the depth of information available for clinical decision support. In this work, we show that the utilization of multi-spatial-scale neuroimaging features, particularly resting state functional networks and functional network connectivity measures, provides a rich and robust basis for the identification of relevant medication class and non-responders in the treatment of mood disorders. We demonstrate that the generated features, along with a novel approach for fast and automated feature selection, can support high accuracy rates in the identification of medication class and non-responders as well as the identification of novel, multi-scale biomarkers.


A Novel Representation to Improve Team Problem Solving in Real-Time

arXiv.org Artificial Intelligence

This paper proposes a novel representation to support computing metrics that help understand and improve, in real time, a team's behavior during real-life problem solving. Even though teams are important in modern activities, there is little computational aid to improve their activity. The representation captures the different mental images developed, enhanced, and utilized during problem solving. A case study illustrates the representation.


ICNS

#artificialintelligence

This video was recorded with a drone flying in Telluride, CO. The raw events from the sensor are shown on an exponentially decaying time surface (bottom left), with ON events in yellow and OFF events in blue. An event-based Self Organising Map of feature detectors was trained on the data using our FEAST algorithm. The top left shows the feature detectors after training. When a new event is received from the camera, it is added to the time surface and a local region (11x11 pixels) of the time surface around the event is sent to the feature detectors.
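The time-surface step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the FEAST implementation: the class name `TimeSurface`, the decay constant `tau`, the sensor size, and the zero padding at the borders are all hypothetical. Event polarity (ON/OFF) is ignored, and the feature detectors themselves are not modeled.

```python
import math

PATCH_RADIUS = 5  # gives the 11x11 local region mentioned in the description

class TimeSurface:
    """Per-pixel memory of the most recent event, read out with exponential decay."""

    def __init__(self, width, height, tau):
        self.width = width
        self.height = height
        self.tau = tau  # assumed decay constant, same units as the timestamps
        # Timestamp of the last event at each pixel; None means no event yet.
        self.last_t = [[None] * width for _ in range(height)]

    def add_event(self, x, y, t):
        """Record an event at pixel (x, y) with timestamp t."""
        self.last_t[y][x] = t

    def value(self, x, y, t_now):
        """Surface value at (x, y): 1.0 right after an event, decaying towards 0."""
        t_last = self.last_t[y][x]
        if t_last is None:
            return 0.0
        return math.exp(-(t_now - t_last) / self.tau)

    def patch(self, x, y, t_now, radius=PATCH_RADIUS):
        """Local region around (x, y); pixels outside the sensor are zero-padded."""
        rows = []
        for yy in range(y - radius, y + radius + 1):
            row = []
            for xx in range(x - radius, x + radius + 1):
                inside = 0 <= xx < self.width and 0 <= yy < self.height
                row.append(self.value(xx, yy, t_now) if inside else 0.0)
            rows.append(row)
        return rows

# Usage: two events one decay constant apart, then read the patch at the newer one.
surface = TimeSurface(width=32, height=32, tau=0.05)
surface.add_event(10, 10, t=0.00)
surface.add_event(11, 10, t=0.05)
region = surface.patch(11, 10, t_now=0.05)
```

The patch `region` is what would be handed to the feature detectors. Storing only the last timestamp per pixel keeps each update constant-time, with the decay evaluated lazily when a patch is read.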

