Goto

Collaborating Authors

 alignment score


Bridging Critical Gaps in Convergent Learning: How Representational Alignment Evolves Across Layers, Training, and Distribution Shifts

Neural Information Processing Systems

Understanding convergent learning--the degree to which independently trained neural systems--whether multiple artificial networks or brains and models--arrive at similar internal representations--is crucial for both neuroscience and AI. Yet, the literature remains narrow in scope--typically examining just a handful of models with one dataset, relying on one alignment metric, and evaluating networks at a single post-training checkpoint. We present a large-scale audit of convergent learning, spanning dozens of vision models and thousands of layer-pair comparisons, to close these long-standing gaps.


Adaptive Preference Arithmetic: Modeling Dynamic Preference Strengths for LLMAgent Personalization

Neural Information Processing Systems

As large language models (LLMs) are increasingly used as personalized user assistants, effectively adapting to users' evolving preferences is critical for delivering high-quality personalized responses. While user preferences are often stable in content, their relative strengths shift over time due to changing goals and contexts. Therefore, modeling these dynamic preference strengths can enable finer-grained personalization. However, current methods face two major challenges: (i) limited user feedback makes it difficult to estimate preference strengths accurately, and (ii) natural language ambiguity limits the controllability of preference-guided generation. To address these issues, we propose AdaPA-Agent, a LLM-agent personalization framework that models dynamic preference strengths via Adaptive Preference Arithmetic. First, instead of requiring additional user feedback, AdaPA-Agent employs an alignment-based strength estimation module to estimate the strength of user preferences from the existing user-agent interaction. Then, it guides controllable personalized generation by linearly combining next-token distributions, weighted by the estimated strengths of individual preferences. Experiments on two personalization tasks-conversational recommendation and personalized web interaction-demonstrate that AdaPA-Agent better aligning with users' changing intents, and has achieved over 18.9% and 14.2% improvements compared to ReAct, the widely-used agent framework.


Generative Model Inversion Through the Lens of the Manifold Hypothesis

Neural Information Processing Systems

Model inversion attacks (MIAs) aim to reconstruct class-representative samples from trained models. Recent generative MIAs utilize generative adversarial networks to learn image priors that guide the inversion process, yielding reconstructions with high visual quality and strong fidelity to the private training data. To explore the reason behind their effectiveness, we begin by examining the gradients of inversion loss w.r.t.


Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment

Neural Information Processing Systems

Despite Contrastive Language-Image Pre-training (CLIP)'s remarkable capability to retrieve content across modalities, a substantial modality gap persists in its feature space. Intriguingly, we discover that off-the-shelf MLLMs (Multimodal Large Language Models) demonstrate powerful inherent modality alignment properties. While recent MLLM-based retrievers with unified architectures partially mitigate this gap, their reliance on coarse modality alignment mechanisms fundamentally limits their potential. In this work, We introduce MAPLE (Modality-Aligned Preference Learning for Embeddings), a novel framework that leverages the finegrained alignment priors inherent in MLLM to guide cross-modal representation learning. MAPLE formulates the learning process as reinforcement learning with two key components: (1) Automatic preference data construction using off-theshelf MLLM, and (2) a new Relative Preference Alignment (RPA) loss, which adapts Direct Preference Optimization (DPO) to the embedding learning setting. Experimental results show that our preference-guided alignment achieves substantial gains in fine-grained cross-modal retrieval, underscoring its effectiveness in handling nuanced semantic distinctions.


Scaling and context steer LLMs along the same computational path as the human brain

Neural Information Processing Systems

Recent studies suggest that the representations learned by large language models (LLMs) are partially aligned to those of the human brain. However, whether and why this alignment score arises from a similar sequence of computations remains elusive. In this study, we explore this question by examining temporally-resolved brain signals of participants listening to 10 hours of an audiobook. We study these neural dynamics jointly with a benchmark encompassing 17 LLMs varying in size and architecture type. Our analyses confirm that LLMs and the brain generate representations in a similar order: specifically, activations in the initial layers of LLMs tend to best align with early brain responses, while the deeper layers of LLMs tend to best align with later brain responses. This brain-LLM alignment is consistent across transformers and recurrent architectures. However, its emergence depends on both model size and context length.


ConceptScope: Characterizing Dataset Bias via Disentangled Visual Concepts

Neural Information Processing Systems

Dataset bias, where data points are skewed to certain concepts, is ubiquitous in machine learning datasets. Yet, systematically identifying these biases is challenging without costly, fine-grained attribute annotations. We present ConceptScope, a scalable and automated framework for analyzing visual datasets by discovering and quantifying human-interpretable concepts using Sparse Autoencoders trained on representations from vision foundation models. ConceptScope categorizes concepts into target, context, and bias types based on their semantic relevance and statistical correlation to class labels, enabling class-level dataset characterization, bias identification, and robustness evaluation through concept-based subgrouping.


FlexPlanner: Flexible 3D Floorplanning via Deep Reinforcement Learning in Hybrid Action Space with Multi-Modality Representation

Neural Information Processing Systems

In the Integrated Circuit (IC) design flow, floorplanning (FP) determines the position and shape of each block. Serving as a prototype for downstream tasks, it is critical and establishes the upper bound of the final PPA (Power, Performance, Area). However, with the emergence of 3D IC with stacked layers, existing methods are not flexible enough to handle the versatile constraints. Besides, they typically face difficulties in aligning the cross-die modules in 3D ICs due to their heuristic representations, which could potentially result in severe data transfer failures. To address these issues, we propose FlexPlanner, a flexible learning-based method in hybrid action space with multi-modality representation to simultaneously handle position, aspect ratio, and alignment of blocks. To our best knowledge, FlexPlanner is the first learning-based approach to discard heuristic-based search in the 3D FP task. Thus, the solution space is not limited by the heuristic floorplanning representation, allowing for significant improvements in both wirelength and alignment scores. Specifically, FlexPlanner models 3D FP based on multi-modalities, including vision, graph, and sequence. To address the non-trivial heuristic-dependent issue, we design a sophisticated policy network with hybrid action space and asynchronous layer decision mechanism that allow for determining the versatile properties of each block.