Humans judge perceptual similarity according to diverse visual attributes, including scene layout, subject location, and camera pose. Existing vision models understand a wide range of semantic abstractions but improperly weigh these attributes and thus make inferences misaligned with human perception. While vision representations have previously benefited from alignment in contexts like image generation, the utility of perceptually aligned representations in general-purpose settings remains unclear. Here, we investigate how aligning vision representations to human perceptual judgments impacts their usability across diverse vision tasks. We finetune state-of-the-art models on human similarity judgments for image triplets and evaluate them across standard benchmarks. We find that perceptual alignment yields representations that improve upon the original backbones across many tasks, including counting, segmentation, depth estimation, instance retrieval, and retrieval-augmented generation, while degrading performance on natural classification. Performance is largely preserved on other tasks, including specialized out-of-distribution domains such as medical imaging and 3D environment frames. Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
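Finetuning on human similarity judgments for image triplets is typically done with a hinge-style triplet loss on embedding distances. The sketch below is illustrative only, not necessarily the loss used in the paper; the function name, margin value, and cosine-distance choice are assumptions:

```python
import numpy as np

def triplet_alignment_loss(f_ref, f_a, f_b, human_choice, margin=0.05):
    """Hinge loss encouraging an embedding to agree with a human judgment
    that image A (human_choice=0) or image B (human_choice=1) is more
    similar to the reference. f_ref, f_a, f_b are embedding vectors."""
    def cos_dist(u, v):
        return 1.0 - u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    d_a = cos_dist(f_ref, f_a)
    d_b = cos_dist(f_ref, f_b)
    # The human-chosen image is treated as the positive of the triplet.
    d_pos, d_neg = (d_a, d_b) if human_choice == 0 else (d_b, d_a)
    return max(0.0, margin + d_pos - d_neg)
```

The loss is zero whenever the embedding already places the human-chosen image closer to the reference by at least the margin, so finetuning only moves representations on triplets where the model disagrees with human perception.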
Honor of Kings Arena: an Environment for Generalization in Competitive Reinforcement Learning
This paper introduces Honor of Kings Arena, a reinforcement learning (RL) environment based on Honor of Kings, currently one of the world's most popular games. Compared to the environments studied in most previous work, ours presents new generalization challenges for competitive reinforcement learning. It is a multiagent problem with one agent competing against its opponent, and it demands generalization, since agents must control diverse targets and compete against diverse opponents. We describe the observation, action, and reward specifications for the Honor of Kings domain and provide an open-source Python-based interface for communicating with the game engine. We provide twenty target heroes with a variety of tasks in Honor of Kings Arena and present initial baseline results for RL-based methods with feasible computing resources. Finally, we showcase the generalization challenges imposed by Honor of Kings Arena and possible remedies to those challenges.
START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation
Domain Generalization (DG) aims to enable models to generalize to unseen target domains by learning from multiple source domains. Existing DG methods primarily rely on convolutional neural networks (CNNs), which inherently learn texture biases due to their limited receptive fields, making them prone to overfitting source domains. While some works have introduced transformer-based methods (ViTs) for DG to leverage the global receptive field, these methods incur high computational costs due to the quadratic complexity of self-attention. Recently, advanced state space models (SSMs), represented by Mamba, have shown promising results in supervised learning tasks by achieving linear complexity in sequence length during training and fast RNN-like computation during inference. Inspired by this, we investigate the generalization ability of the Mamba model under domain shifts and find that the input-dependent matrices within SSMs can accumulate and amplify domain-specific features, thus hindering model generalization. To address this issue, we propose a novel SSM-based architecture with saliency-based token-aware transformation (namely START), which achieves state-of-the-art (SOTA) performance and offers a competitive alternative to CNNs and ViTs. START selectively perturbs and suppresses domain-specific features in salient tokens within the input-dependent matrices of SSMs, thus effectively reducing the discrepancy between different domains. Extensive experiments on five benchmarks demonstrate that START outperforms existing SOTA DG methods with efficient linear complexity.
Unbalanced Optimal Transport through Non-negative Penalized Linear Regression
This paper addresses the problem of Unbalanced Optimal Transport (UOT), in which the marginal conditions are relaxed (using weighted penalties in lieu of equality constraints) and no additional regularization is enforced on the OT plan. In this context, we show that the corresponding optimization problem can be reformulated as a non-negative penalized linear regression problem. This reformulation allows us to propose novel algorithms inspired by inverse problems and non-negative matrix factorization. In particular, we consider majorization-minimization, which in our setting leads to efficient multiplicative updates for a variety of penalties. Furthermore, we derive for the first time an efficient algorithm to compute the regularization path of UOT with quadratic penalties. The proposed algorithm yields a continuum of piecewise-linear OT plans converging to the solution of balanced OT (corresponding to infinite penalty weights). We perform several numerical experiments on simulated and real data illustrating the new algorithms, and provide a detailed discussion about more sophisticated optimization tools that can further be used to solve OT problems thanks to our reformulation.
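The reformulation can be made concrete for the quadratic penalty: with t = vec(T), minimizing ⟨C, T⟩ + λ(‖T1 − a‖² + ‖Tᵀ1 − b‖²) over T ≥ 0 is a non-negative penalized least-squares problem, which NMF-style multiplicative updates solve while preserving non-negativity. The sketch below is a minimal illustration under these assumptions (dense marginal operator, fixed iteration count, illustrative function name), not the paper's implementation:

```python
import numpy as np

def uot_multiplicative(C, a, b, lam=10.0, n_iter=2000, eps=1e-12):
    """Approximately solve
        min_{T >= 0} <C, T> + lam * (||T 1 - a||^2 + ||T^T 1 - b||^2)
    via multiplicative updates on t = vec(T) (row-major)."""
    n, m = C.shape
    # H stacks the two marginal operators so that H @ t = [T @ 1; T.T @ 1].
    H = np.zeros((n + m, n * m))
    for i in range(n):
        H[i, i * m:(i + 1) * m] = 1.0   # row sums of T
    for j in range(m):
        H[n + j, j::m] = 1.0            # column sums of T
    y = np.concatenate([a, b])
    c = C.ravel()
    t = np.full(n * m, 1.0 / (n * m))   # positive initialization
    for _ in range(n_iter):
        # Gradient of the objective is c + 2*lam*H^T(H t - y); splitting it
        # into positive and negative parts gives a multiplicative update
        # that keeps t >= 0 as long as c >= 0.
        num = 2.0 * lam * (H.T @ y)
        den = 2.0 * lam * (H.T @ (H @ t)) + c + eps
        t = t * num / den
    return t.reshape(n, m)
```

For large λ the iterates approach the balanced OT plan, matching the regularization-path behavior described above; dedicated solvers exploit the structure of H rather than forming it densely.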
Bronny James explains what fuels him throughout tumultuous rookie season: 'People think I'm a f---ing robot'
Los Angeles Lakers shooting guard Bronny James has been the center of debate from the moment he was drafted in June. The 20-year-old said he tries to filter it all out, but he sees it all. "My first thought about everything is I always try to just let it go through one ear and out the other, put my head down and come to work and be positive every day. I see everything that people are saying, and people think, like, I'm a f---ing robot, like I don't have any feelings or emotions," James said via The Athletic.
Appendix A Relation to PCA
PCA: We show that under strict conditions, PCA applied to an intermediate layer, with the principal components used as concept vectors, maximizes the completeness score. We note that the assumptions for this proposition are extremely stringent and may not hold in general. When the isometry and other assumptions do not hold, PCA no longer maximizes the completeness score, as the lowest reconstruction loss in the intermediate layer does not imply the highest prediction accuracy at the output. In fact, DNNs are known to be highly sensitive to small perturbations in the input [Narodytska and Kasiviswanathan, 2017]: they can yield very different outputs even when the difference in the input is small (and often perceptually hard for humans to detect). Thus, even when the reconstruction loss between two inputs is low at an intermediate layer, subsequent deep nonlinear processing may cause them to diverge significantly.
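The divergence argument above can be seen in a toy numerical example (illustrative only: random linear-plus-ReLU layers stand in for the network's later stages, and the depth and weight scale are arbitrary assumptions). Two intermediate representations that are almost identical can produce outputs that differ by a far larger amount after deep nonlinear processing:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "later stages" of a network: 10 random linear maps with ReLU.
# A weight scale above 1 makes each layer expansive on average.
Ws = [rng.normal(scale=1.5, size=(64, 64)) for _ in range(10)]

def forward(h):
    """Propagate an intermediate representation through the toy layers."""
    for W in Ws:
        h = np.maximum(W @ h, 0.0)
    return h

h1 = rng.normal(size=64)
h2 = h1 + 1e-6 * rng.normal(size=64)  # nearly identical representations

in_diff = np.linalg.norm(h1 - h2)                     # tiny
out_diff = np.linalg.norm(forward(h1) - forward(h2))  # much larger
```

Here a low "reconstruction" discrepancy at the intermediate layer (`in_diff`) gives no guarantee about the discrepancy after the remaining layers (`out_diff`), which is the gap the isometry assumption is meant to close.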