Goto

Collaborating Authors

 Industry


Learning to Insert for Constructive Neural Vehicle Routing Solver

Neural Information Processing Systems

Neural Combinatorial Optimisation (NCO) is a promising learning-based approach for solving Vehicle Routing Problems (VRPs) without extensive manual design. While existing constructive NCO methods typically follow an appending-based paradigm that sequentially adds unvisited nodes to partial solutions, this rigid approach often leads to suboptimal results. To overcome this limitation, we explore the idea of the insertion-based paradigm and propose Learning to Construct with Insertion-based Paradigm (L2C-Insert), a novel learning-based method for constructive NCO. Unlike traditional approaches, L2C-Insert builds solutions by strategically inserting unvisited nodes at any valid position in the current partial solution, which can significantly enhance the flexibility and solution quality. The proposed framework introduces three key components: a novel model architecture for precise insertion position prediction, an efficient training scheme for model optimization, and an advanced inference technique that fully exploits the insertion paradigm's flexibility. Extensive experiments on both synthetic and real-world instances of the Travelling Salesman Problem (TSP) and Capacitated Vehicle Routing Problem (CVRP) demonstrate that L2C-Insert consistently achieves superior performance across various problem sizes.


Resounding Acoustic Fields with Reciprocity

Neural Information Processing Systems

Achieving immersive auditory experiences in virtual environments requires flexible sound modeling that supports dynamic source positions. In this paper, we introduce a task called resounding, which aims to estimate room impulse responses at arbitrary emitter location from a sparse set of measured emitter positions, analogous to the relighting problem in vision. We leverage the reciprocity property and introduce Versa, a physics-inspired approach to facilitating acoustic field learning. Our method creates physically valid samples with dense virtual emitter positions by exchanging emitter and listener poses. We also identify challenges in deploying reciprocity due to emitter/listener gain patterns and propose a self-supervised learning approach to address them. Results show that Versa substantially improve the performance of acoustic field learning on both simulated and real-world datasets across different metrics. Perceptual user studies show that Versa can greatly improve the immersive spatial sound experience. Code, dataset and demo videos are available on the project website.


NegoCollab: ACommon Representation Negotiation Approach for Heterogeneous Collaborative Perception

Neural Information Processing Systems

Collaborative perception improves task performance by expanding the perception range through information sharing among agents. Immutable heterogeneity poses a significant challenge in collaborative perception, as participating agents may employ different and fixed perception models. This leads to domain gaps in the intermediate features shared among agents, consequently degrading collaborative performance. Aligning the features of all agents to a common representation can eliminate domain gaps with low training cost. However, in existing methods, the common representation is designated as the representation of a specific agent, making it difficult for agents with significant domain discrepancies from this specific agent to achieve proper alignment.


Asymptotically Stable Quaternion-valued Hopfield-structured Neural Networks with Periodic Projection-based Supervised Learning Rules

Neural Information Processing Systems

Motivated by the geometric advantages of quaternions in representing rotations and postures, we propose a quaternion-valued supervised learning Hopfield-structured neural network (QSHNN) with a fully connected structure inspired by the classic Hopfield neural network (HNN). Starting from a continuous-time dynamical model of HNNs, we extend the formulation to the quaternionic domain and establish the existence and uniqueness of fixed points with asymptotic stability. For the learning rules, we introduce a periodic projection strategy that modifies standard gradient descent by periodically projecting each 4 4block of the weight matrix onto the closest quaternionic structure in the least-squares sense. This approach preserves both convergence and quaternionic consistency throughout training. Benefiting from this rigorous mathematical foundation, the experimental model implementation achieves high accuracy, fast convergence, and strong reliability across randomly generated target sets. Moreover, the evolution trajectories of the QSHNN exhibit well-bounded curvature, i.e., sufficient smoothness, which is crucial for applications such as control systems or path planning modules in robotic arms, where joint postures are parameterized by quaternion neurons. Beyond these application scenarios, the proposed model offers a practical implementation framework and a general mathematical methodology for designing neural networks under hypercomplex or non-commutative algebraic structures.


7dd74dcef03c8f88a58d18a9d49d7a10-Paper-Conference.pdf

Neural Information Processing Systems

Vision transformers are ever larger, more accurate, and more expensive to compute. The expense is even more extreme at high resolution as the number of tokens grows quadratically with the image size. We turn to adaptive computation to cope with this cost by learning to predict where to compute. Our LookWhere method divides the computation between a low-resolution selector and a high-resolution extractor without ever processing the full high-resolution input. We jointly pretrain the selector and extractor without task supervision by distillation from a selfsupervised teacher, in effect, learning where and what to compute simultaneously. Unlike prior token reduction methods, which pay to save by pruning alreadycomputed tokens, and prior token selection methods, which require complex and expensive per-task optimization, LookWhere economically and accurately selects and extracts transferrable representations of images. We show that LookWhere excels at sparse recognition on high-resolution inputs (Traffic Signs), maintaining accuracy while reducing FLOPs by up to 34 and time by 6 . It also excels at standard recognition tasks that are global (ImageNet classification) or local (ADE20K segmentation), improving accuracy while reducing time by 1.36 .


KAIROS: Scalable Model-Agnostic Data Valuation

Neural Information Processing Systems

Data valuation techniques quantify each training example's contribution to model performance, providing a principled basis for data cleaning, acquisition, and selection. Existing valuation methods remain inadequate: model-based techniques depend on a single fitted model and inherit its biases, while algorithm-based approaches like Data Shapley scale poorly due to their need to train multiple models. Recent work has proposed model-agnostic alternatives based on Wasserstein distance between the training set and a clean reference set, but exact computation is expensive and approximations often misrank examples. We introduce KAIROS, a model-agnostic framework that values examples by their contribution to the Maximum Mean Discrepancy (MMD) between the training set and a clean reference distribution. Unlike Wasserstein methods, MMD admits a closed-form solution that requires no approximations and is scalable to large datasets. Additionally, KAIROS enables efficient online valuation: adding a new batch of m examples requires only O(mN)computation to update all scores, compared to O(N2)in prior work where N is the training set size. Empirical evaluations on noise, mislabeling, and poisoning benchmarks show that KAIROS consistently outperforms state-of-the-art baselines in both accuracy and runtime. On ImageNet, KAIROS achieves up to 15 speedup over the fastest baseline while maintaining superior data valuation quality. Our results demonstrate that model-agnostic methods can match or exceed model-based approaches in performance while scaling to large datasets.


MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction

Neural Information Processing Systems

Unlike prior approaches constrained to fixed instrument classes, MGE-LDM learns a joint distribution over full mixtures, submixtures, and individual stems within a single compact latent diffusion model. At inference, MGE-LDM enables (1) complete mixture generation, (2) partial generation (i.e., source imputation), and (3) textconditioned extraction of arbitrary sources. By formulating both separation and imputation as conditional inpainting tasks in the latent space, our approach supports flexible, class-agnostic manipulation of arbitrary instrument sources. Notably, MGE-LDM can be trained jointly across heterogeneous multi-track datasets (e.g., Slakh2100, MUSDB18, MoisesDB) without relying on predefined instrument categories. Audio samples are available at our project page .


T-norm Selection for Object Detection in Autonomous Driving with Logical Constraints

Neural Information Processing Systems

Integrating logical constraints into object detection models for autonomous driving (AD) is a promising way to enhance their compliance to rules and thus increase the safety of the system. In this, t-norms have been utilized to calculate the constrained loss, i.e., the violations of logical constraints as losses. While prior works have statically selected few t-norms, we conduct an extensive experimental study to identify the most effective choices, as suboptimal t-norms can lead to undesired model behavior. For this, we present MOD-ECL, a neurosymbolic framework that implements a wide range of t-norms and can use them in an adaptive manner, with an algorithm that selects well-performing t-norms during training and a scheduler that regulates the impact of the constrained loss. We evaluate its effectiveness on the ROAD-R and ROAD-Waymo-R datasets for object detection in AD with attached common-sense constraints. Our results show that careful selection of parameters is crucial for good behavior of the constrained loss and that our framework allows us to obtain not only lower constraint violation but in some cases also an increase in detection performance. Furthermore, our methods allow fine control over the tradeoff between accuracy and violation.1


Put CASH on Bandits: AMax K-Armed Problem for Automated Machine Learning

Neural Information Processing Systems

The Combined Algorithm Selection and Hyperparameter optimization (CASH) is a challenging resource allocation problem in the field of AutoML. We propose MaxUCB, a max k-armed bandit method to trade off exploring different model classes and conducting hyperparameter optimization. MaxUCB is specifically designed for the light-tailed and bounded reward distributions arising in this setting and, thus, provides an efficient alternative compared to classic max k-armed bandit methods assuming heavy-tailed reward distributions. We theoretically and empirically evaluate our method on four standard AutoML benchmarks demonstrating superior performance over prior approaches.


How the Peter Thiel-Linked Dialog Club Secretly Ranks Its Members

WIRED

Leaked files show the invite-only network grades members by their money and fame, shaping who's in, who's out, and who pays. Dialog, the private network cofounded by Peter Thiel, grades its event attendees on a hidden scale, ranking them by wealth and fame, tracking their relationships, and using algorithms to help decide who they should meet, who they should sit with, and who no longer belongs, WIRED has learned. The records are part of a trove of internal data received by WIRED from a confidential source, containing the personal information of nearly 200 prominent people scheduled to attend the group's annual retreat this summer. The data includes home addresses, private phone numbers and email accounts, dates of birth, photos, and emergency contacts, as well as food allergies and the political leanings volunteered by some members. The records are distinct from a list of people affiliated with Dialog that was left exposed on the organization's website and has been circulating online since earlier this week--a looser directory that appears to include nonmembers, such as Maryland governor Wes Moore, a former event speaker, and other outside guests who passed through Dialog's orbit, in some cases years ago.