 online network


Orthogonal Contrastive Learning for Multi-Representation fMRI Analysis

Neural Information Processing Systems

Task-based functional magnetic resonance imaging (fMRI) provides invaluable insights into human cognition but faces critical hurdles--low signal-to-noise ratio, high dimensionality, limited sample sizes, and costly data acquisition--that are amplified when integrating datasets across subjects or sites. This paper introduces orthogonal contrastive learning (OCL), a unified multi-representation framework for multi-subject fMRI analysis that aligns neural responses without requiring temporal preprocessing or uniform time-series lengths across subjects or sites. OCL employs two identical encoders: an online network trained with a contrastive loss that pulls together same-stimulus responses and pushes apart different-stimulus responses, and a target network whose weights track the online network via exponential moving average to stabilize learning. Each OCL network layer combines QR decomposition for orthogonal feature extraction, locality-sensitive hashing (LSH) to produce compact subject-specific signatures, positional encoding to embed temporal structure alongside spatial features, and a transformer encoder to generate discriminative, stimulus-aligned embeddings. We further enhance OCL with an unsupervised pretraining stage on fMRI-like synthetic data and demonstrate a transfer-learning workflow for multi-site studies. Across extensive experiments on multi-subject and multi-site fMRI benchmarks, OCL consistently outperforms state-of-the-art alignment and analysis methods in both representation quality and downstream classification accuracy.



Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning

Neural Information Processing Systems

We introduce Bootstrap Your Own Latent (BYOL), a new approach to self-supervised image representation learning. BYOL relies on two neural networks, referred to as online and target networks, that interact and learn from each other. From an augmented view of an image, we train the online network to predict the target network representation of the same image under a different augmented view. At the same time, we update the target network with a slow-moving average of the online network. While state-of-the-art methods intrinsically rely on negative pairs, BYOL achieves a new state of the art without them. BYOL reaches 74.3% top-1 classification accuracy on ImageNet using the standard linear evaluation protocol with a standard ResNet-50 architecture and 79.6% with a larger ResNet. We also show that BYOL performs on par with or better than the current state of the art on both transfer and semi-supervised benchmarks.
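The "slow-moving average" update of the target network can be illustrated with a minimal sketch. It assumes an exponential moving average with a decay rate `tau`, and it represents network weights as a flat list of floats; the function name `ema_update` and the toy values are illustrative assumptions, not taken from the paper's code.

```python
def ema_update(target, online, tau=0.99):
    """Return new target weights: tau * target + (1 - tau) * online.

    `target` and `online` are flat lists of floats standing in for
    network parameters (an assumption made for this sketch).
    """
    return [tau * t + (1.0 - tau) * o for t, o in zip(target, online)]


# Toy example: the target drifts slowly towards fixed online weights.
online = [1.0, -2.0, 0.5]
target = [0.0, 0.0, 0.0]
for _ in range(3):
    target = ema_update(target, online, tau=0.9)
```

With `tau` close to 1 the target changes only slightly per step, which is what makes it a slowly moving prediction target for the online network.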


Artificial Intelligence-Enabled Spirometry for Early Detection of Right Heart Failure

arXiv.org Artificial Intelligence

Right heart failure (RHF) is a disease characterized by abnormalities in the structure or function of the right ventricle (RV), which is associated with high morbidity and mortality. Lung disease often causes increased right ventricular load, leading to RHF. Therefore, it is very important to identify patients with cor pulmonale who develop RHF among people with underlying lung diseases. In this work, we propose a self-supervised representation learning method for the early detection of RHF in patients with cor pulmonale, which uses spirogram time series to predict RHF at an early stage. The proposed model is divided into two stages. The first stage is the self-supervised representation learning-based spirogram embedding (SLSE) network training process, where the encoder of the variational autoencoder (VAE-encoder) learns a robust low-dimensional representation of the spirogram time series from the data-augmented unlabeled data. In the second stage, this low-dimensional representation is fused with demographic information and fed into a CatBoost classifier for the downstream RHF prediction task. Trained and tested on a carefully selected subset of 26,617 individuals from the UK Biobank, our model achieved an AUROC of 0.7501 in detecting RHF, demonstrating strong population-level distinction ability. We further evaluated the model on high-risk clinical subgroups, achieving AUROC values of 0.8194 on a test set of 74 patients with chronic kidney disease (CKD) and 0.8413 on a set of 64 patients with valvular heart disease (VHD). These results highlight the model's potential utility in predicting RHF among clinically elevated-risk populations. In conclusion, this study presents a self-supervised representation learning approach combining spirogram time series and demographic data, demonstrating promising potential for early RHF detection in clinical practice.


Faster Deep Reinforcement Learning with Slower Online Network

Neural Information Processing Systems

Deep reinforcement learning algorithms often use two networks for value function optimization: an online network, and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge against issues that arise when performing bootstrapping.
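The two-network arrangement described above can be sketched as follows. This is a minimal illustration that assumes a periodic hard copy every `sync_every` steps as the source of the delay and a Q-learning style bootstrapped target; the names `DelayedTarget` and `td_target` are hypothetical and not taken from the paper.

```python
import copy


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Bootstrapped target: r + gamma * max_a Q_target(s', a)."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


class DelayedTarget:
    """Holds a lagged copy of the online parameters (hypothetical helper)."""

    def __init__(self, online_params, sync_every=100):
        self.params = copy.deepcopy(online_params)
        self.sync_every = sync_every
        self.steps = 0

    def step(self, online_params):
        # Copy the online parameters only every `sync_every` steps,
        # so the target tracks the online network with some delay.
        self.steps += 1
        if self.steps % self.sync_every == 0:
            self.params = copy.deepcopy(online_params)


online = {"w": [0.0]}
target = DelayedTarget(online, sync_every=3)
online["w"][0] = 1.0           # the online network has been updated
target.step(online)            # step 1: target still holds the old value
stale = target.params["w"][0]
target.step(online)
target.step(online)            # step 3: target is synchronised
fresh = target.params["w"][0]
```

Because the bootstrapped target is computed from the lagged copy, it does not shift with every gradient step on the online network.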


AFP developing AI tool to decode gen Z slang amid warning about 'crimefluencers' hunting girls

The Guardian

Federal police say they have identified 59 alleged offenders as being in these online networks and have made an unspecified number of arrests. Australian federal police will develop an AI tool to decode gen Z and Alpha slang and emojis in an effort to crack down on sadistic online exploitation and "crimefluencers". The AFP commissioner, Krissy Barrett, used a speech at the National Press Club on Wednesday to warn of the rise of online crime networks of young boys and men who are targeting vulnerable teen and preteen girls. The newly appointed chief outlined how the perpetrators, who are overwhelmingly from English-speaking backgrounds, were grooming victims and then forcing them to "perform serious acts of violence on themselves, their siblings, others or their pets".


Use the Online Network If You Can: Towards Fast and Stable Reinforcement Learning

arXiv.org Artificial Intelligence

The use of target networks is a popular approach for estimating value functions in deep Reinforcement Learning (RL). While effective, the target network remains a compromise solution that preserves stability at the cost of slowly moving targets, thus delaying learning. Conversely, using the online network as a bootstrapped target is intuitively appealing, albeit well-known to lead to unstable learning. In this work, we aim to obtain the best of both worlds by introducing a novel update rule that computes the target using the MINimum estimate between the Target and Online network, giving rise to our method, MINTO. Through this simple yet effective modification, we show that MINTO enables faster and stable value function learning by mitigating the potential overestimation bias of using the online network for bootstrapping. Notably, MINTO can be seamlessly integrated into a wide range of value-based and actor-critic algorithms at negligible cost. We evaluate MINTO extensively across diverse benchmarks, spanning online and offline RL, as well as discrete and continuous action spaces. Across all benchmarks, MINTO consistently improves performance, demonstrating its broad applicability and effectiveness. Reinforcement Learning (RL) has demonstrated exceptional performance and achieved major breakthroughs across a diverse spectrum of decision-making challenges. Noteworthy applications include learning complex locomotion skills (Haarnoja et al., 2018b; Rudin et al., 2022) and enabling sophisticated, real-world capabilities such as robotic manipulation (Andrychowicz et al., 2020; Lu et al., 2025). The foundation of this success lies primarily in Deep RL, initiated by the introduction of the Deep Q-Network (DQN) (Mnih et al., 2013), which marked the first successful application of deep neural networks in RL. To make that happen, Mnih et al. (2013) introduce various techniques that mainly mitigate the deadly triad issue (Van Hasselt et al., 2018), which arises from the combined use of function approximators, off-policy data, and target bootstrapping.
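The MINTO update rule from the abstract (taking the minimum estimate between the target and online networks) might be sketched as below. The abstract does not say exactly where the minimum is applied, so this sketch assumes it is taken between the two networks' greedy next-state values in a Q-learning style target; the function name and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
def min_bootstrap_target(reward, q_target_next, q_online_next,
                         gamma=0.99, done=False):
    """Bootstrapped target using the smaller of two next-state estimates.

    Assumption for this sketch: each network's estimate is its greedy
    value max_a Q(s', a), and the minimum is taken between the two.
    """
    if done:
        return reward
    est_target = max(q_target_next)   # estimate from the target network
    est_online = max(q_online_next)   # estimate from the online network
    return reward + gamma * min(est_target, est_online)


# Toy example: the online network is more optimistic (3.0) than the
# target network (2.0), so the target network's estimate is used.
y = min_bootstrap_target(1.0, [1.0, 2.0], [3.0, 0.5], gamma=0.5)
```

When the online estimate is the smaller one, the target follows the up-to-date network; when it is larger, the lagged estimate caps it, which is one way to read the overestimation argument in the abstract.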


Bridging the Performance Gap Between Target-Free and Target-Based Reinforcement Learning

arXiv.org Artificial Intelligence

The use of target networks in deep reinforcement learning is a widely popular solution to mitigate the brittleness of semi-gradient approaches and stabilize learning. However, target networks notoriously require additional memory and delay the propagation of Bellman updates compared to an ideal target-free approach. In this work, we step out of the binary choice between target-free and target-based algorithms. We introduce a new method that uses a copy of the last linear layer of the online network as a target network, while sharing the remaining parameters with the up-to-date online network. This simple modification enables us to keep the low memory footprint of target-free methods while leveraging the target-based literature. We find that combining our approach with the concept of iterated Q-learning, which consists of learning consecutive Bellman updates in parallel, helps improve the sample efficiency of target-free approaches. Our proposed method, iterated Shared Q-Learning (iS-QL), bridges the performance gap between target-free and target-based approaches across various problems while using a single Q-network, thus being a step towards resource-efficient reinforcement learning algorithms.
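The idea of keeping only a copy of the last linear layer as the target, while sharing all earlier parameters, can be illustrated with a small sketch. It assumes a single shared ReLU feature layer and one weight row per action in the last layer; the helper names `features` and `q_values` and all numbers are hypothetical, and the iterated Q-learning component is not shown.

```python
def features(state, shared_weights):
    """Shared feature extractor (hypothetical single ReLU layer)."""
    return [max(0.0, sum(w * x for w, x in zip(row, state)))
            for row in shared_weights]


def q_values(feats, head):
    """Linear last layer: one weight row per action."""
    return [sum(w * f for w, f in zip(row, feats)) for row in head]


shared = [[1.0, 0.0], [0.0, 1.0]]        # used by both online and target
online_head = [[1.0, 0.0], [0.0, 1.0]]   # trainable last layer
target_head = [row[:] for row in online_head]  # copy of the last layer only

online_head[0][0] = 5.0                  # simulate an update to the online head

feats = features([2.0, 3.0], shared)     # computed once, shared by both heads
q_online = q_values(feats, online_head)
q_target = q_values(feats, target_head)
```

Only the small last-layer copy is stored in addition to the online network, so the extra memory is one linear layer rather than a full second network.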