Goto

Collaborating Authors

 completion


APML: Adaptive Probabilistic Matching Loss for Robust 3DPoint Cloud Reconstruction

Neural Information Processing Systems

Training deep learning models for point cloud prediction tasks such as shape completion and generation depends critically on loss functions that measure discrepancies between predicted and ground-truth point sets. Commonly used functions such as Chamfer Distance (CD), HyperCD, InfoCD and Density-aware CD rely on nearest-neighbor assignments, which often induce many-to-one correspondences, leading to point congestion in dense regions and poor coverage in sparse regions. These losses also involve non-differentiable operations due to index selection, which may affect gradient-based optimization. Earth Mover Distance (EMD) enforces one-to-one correspondences and captures structural similarity more effectively, but its cubic computational complexity limits its practical use. We propose the Adaptive Probabilistic Matching Loss (APML), a fully differentiable approximation of one-to-one matching that leverages Sinkhorn iterations on a temperature-scaled similarity matrix derived from pairwise distances. We analytically compute the temperature to guarantee a minimum assignment probability, eliminating manual tuning. APML achieves near-quadratic runtime, comparable to Chamfer-based losses, and avoids non-differentiable operations.


Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions

Neural Information Processing Systems

Aligning large language models with pointwise absolute rewards has so far required online, on-policy algorithms such as PPO and GRPO. In contrast, simpler methods that can leverage offline or off-policy data, such as DPO and REBEL, are limited to learning from preference pairs or relative signals. To bridge this gap, we introduce Quantile Reward Policy Optimization (QRPO), which learns from pointwise absolute rewards while preserving the simplicity and offline applicability of DPO-like methods. QRPO uses quantile rewards to enable regression to the closed-form solution of the KL-regularized RL objective. This reward yields an analytically tractable partition function, removing the need for relative signals to cancel this term. Moreover, QRPO scales with increased compute to estimate quantile rewards, opening a new dimension for pre-computation scaling.



CXReasonBench: ABenchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays

Neural Information Processing Systems

Recent progress in Large Vision-Language Models (LVLMs) has enabled promising applications in medical tasks, such as report generation and visual question answering. However, existing benchmarks focus mainly on the final diagnostic answer, offering limited insight into whether models engage in clinically meaningful reasoning. To address this, we present CheXStruct and CXReasonBench, a structured pipeline and benchmark built on the publicly available MIMIC-CXR-JPG dataset. CheXStruct automatically derives a sequence of intermediate reasoning steps directly from chest X-rays, such as segmenting anatomical regions, deriving anatomical landmarks and diagnostic measurements, computing diagnostic indices, and applying clinical thresholds. CXReasonBench leverages this pipeline to evaluate whether models can perform clinically valid reasoning steps and to what extent they can learn from structured guidance, enabling fine-grained and transparent assessment of diagnostic reasoning. The benchmark comprises 18,988 QA pairs across 12 diagnostic tasks and 1,200 cases, each paired with up to 4 visual inputs, and supports multi-path, multi-stage evaluation including visual grounding via anatomical region selection and diagnostic measurements. Even the strongest of 12 evaluated LVLMs struggle with structured reasoning and generalization, often failing to link abstract knowledge with anatomically grounded visual interpretation.


Fast exact recovery of noisy matrix from few entries: the infinity norm approach

Neural Information Processing Systems

The matrix recovery (completion) problem, a central problem in data science, involves recovering a matrix Afrom a relatively small random set of entries. While such a task is generally impossible, it has been shown that one can recover A exactly in polynomial time, with high probability, under three basic and necessary assumptions: (1) the rank of A is very small compared to its dimensions (low rank), (2) A has delocalized singular vectors (incoherence), and (3) the sample size is sufficiently large. Various algorithms address this task, including convex optimization by Candes, Recht, and Tao (2009), alternating projection by Hardt and Wooters (2014), and low-rank approximation with gradient descent by Keshavan, Montanari, and Oh (2009, 2010). In applications, Candes and Plan (2009) noted that it is more realistic to assume noisy observations. In such cases, the above approaches provide approximate recovery with small root mean square error, which is difficult to convert into exact recovery.


RGNMR: AGauss-Newton method for robust matrix completion with theoretical guarantees

Neural Information Processing Systems

Recovering a low rank matrix from a subset of its entries, some of which may be corrupted, is known as the robust matrix completion (RMC) problem. Existing RMC methods have several limitations: they require a relatively large number of observed entries; they may fail under overparametrization, when their assumed rank is higher than the correct one; and many of them fail to recover even mildly ill-conditioned matrices. In this paper we propose a novel RMC method, denoted RGNMR, which overcomes these limitations. RGNMRis a simple factorization-based iterative algorithm, which combines a Gauss-Newton linearization with removal of entries suspected to be outliers. On the theoretical front, we prove that under suitable assumptions, RGNMR is guaranteed exact recovery of the underlying low rank matrix. Our theoretical results improve upon the best currently known for factorization-based methods. On the empirical front, we show via several simulations the advantages of RGNMR over existing RMC methods, and in particular its ability to handle a small number of observed entries, overparameterization of the rank and ill-conditioned matrices. In addition, we propose a novel scheme for estimating the number of corrupted entries. This scheme may be used by other RMC methods that require as input the number of corrupted entries.


DuetGraph: Coarse-to-Fine Knowledge Graph Reasoning with Dual-Pathway Global-Local Fusion

Neural Information Processing Systems

Knowledge graphs (KGs) are vital for enabling knowledge reasoning across various domains. Recent KG reasoning methods that integrate both global and local information have achieved promising results. However, existing methods often suffer from score over-smoothing, which blurs the distinction between correct and incorrect answers and hinders reasoning effectiveness. To address this, we propose DuetGraph, a coarse-to-fine KG reasoning mechanism with dual-pathway global-local fusion. DuetGraph tackles over-smoothing by segregating--rather than stacking--the processing of local (via message passing) and global (via attention) information into two distinct pathways, preventing mutual interference and preserving representational discrimination. In addition, DuetGraph introduces a coarse-to-fine optimization, which partitions entities into high-and low-score subsets. This strategy narrows the candidate space and sharpens the score gap between the two subsets, which alleviates over-smoothing and enhances inference quality. Extensive experiments on various datasets demonstrate that DuetGraph achieves state-of-the-art (SOTA) performance, with up to an 8.7% improvement in reasoning quality and a 1.8 acceleration in training efficiency.


PointMAC: Meta-Learned Adaptation for Robust Test-Time Point Cloud Completion

Neural Information Processing Systems

Point cloud completion is essential for robust 3D perception in safety-critical applications such as robotics and augmented reality. However, existing models perform static inference and rely heavily on inductive biases learned during training, limiting their ability to adapt to novel structural patterns and sensor-induced distortions at test time. To address this limitation, we propose PointMAC, a meta-learned framework for robust test-time adaptation in point cloud completion. It enables sample-specific refinement without requiring additional supervision. Our method optimizes the completion model under two self-supervised auxiliary objectives that simulate structural and sensor-level incompleteness.


Online Functional Tensor Decomposition via Continual Learning for Streaming Data Completion

Neural Information Processing Systems

Online tensor decompositions are powerful and proven techniques that address the challenges in processing high-velocity streaming tensor data, such as traffic flow and weather system. The main aim of this work is to propose a novel online functional tensor decomposition (OFTD) framework, which represents a spatialtemporal continuous function using the CP tensor decomposition parameterized by coordinate-based implicit neural representations (INRs). The INRs allow for natural characterization of continually expanded streaming data by simply adding new coordinates into the network. Particularly, our method transforms the classical online tensor decomposition algorithm into a more dynamic continual learning paradigm of updating the INR weights to fit the new data without forgetting the previous tensor knowledge. To this end, we introduce a long-tail memory replay method that adapts to the local continuity property of INR. Extensive experiments for streaming tensor completion using traffic, weather, user-item, and video data verify the effectiveness of the OFTD approach for streaming data analysis. This endeavor serves as a pivotal inspiration for future research to connect classical online tensor tools with continual learning paradigms to better explore knowledge underlying streaming tensor data.


Ref. ImagesOursGTPaint-by-Example Target Images

Neural Information Processing Systems

Reference-driven image completion, which restores missing regions in a target view using additional images, is particularly challenging when the target view differs significantly from the references. Existing generative methods rely solely on diffusion priors and, without geometric cues such as camera pose or depth, often produce misaligned or implausible content. We propose GeoComplete, a novel framework that incorporates explicit 3D structural guidance to enforce geometric consistency in the completed regions, setting it apart from prior image-only approaches. GeoComplete introduces two key ideas: conditioning the diffusion process on projected point clouds to infuse geometric information, and applying target-aware masking to guide the model toward relevant reference cues. The framework features a dual-branch diffusion architecture.