Learning segmentation from point trajectories

Neural Information Processing Systems

We consider the problem of segmenting objects in videos based on their motion and no other forms of supervision. Prior work has often approached this problem by using the principle of common fate, namely the fact that the motion of points that belong to the same object is strongly correlated. However, most authors have only considered instantaneous motion from optical flow. In this work, we present a way to train a segmentation network using long-term point trajectories as a supervisory signal to complement optical flow. The key difficulty is that long-term motion, unlike instantaneous motion, is difficult to model: any parametric approximation is unlikely to capture complex motion patterns over long periods of time. We instead draw inspiration from subspace clustering approaches, proposing a loss function that seeks to group the trajectories into low-rank matrices where the motion of object points can be approximately explained as a linear combination of other point tracks. Our method outperforms the prior art on motion-based segmentation, which shows the utility of long-term motion and the effectiveness of our formulation.
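The low-rank grouping idea can be illustrated with a small NumPy sketch. It is not the paper's loss: the function name `low_rank_residual`, the track layout (one column per point, x and y coordinates stacked over frames), the fixed `rank`, and the use of trailing singular values as the penalty are all assumptions made for illustration.

```python
import numpy as np

def low_rank_residual(tracks, masks, rank):
    """Sum, over groups, of the singular values beyond `rank`.

    tracks: (2 * num_frames, num_points) array, one column per point track.
    masks:  (num_groups, num_points) soft assignment of points to groups.
    rank:   assumed maximum rank of a single object's track matrix.
    """
    total = 0.0
    for weights in masks:
        # Down-weight the columns (points) that do not belong to this group.
        weighted = tracks * weights[None, :]
        singular_values = np.linalg.svd(weighted, compute_uv=False)
        # Energy outside the leading `rank` directions is the penalty.
        total += float(singular_values[rank:].sum())
    return total

# Synthetic data: two "objects", each moving in its own rank-2 subspace.
rng = np.random.default_rng(0)
num_frames, rank = 4, 2
tracks_a = rng.normal(size=(2 * num_frames, rank)) @ rng.normal(size=(rank, 5))
tracks_b = rng.normal(size=(2 * num_frames, rank)) @ rng.normal(size=(rank, 5))
tracks = np.hstack([tracks_a, tracks_b])

# A grouping that matches the objects versus one that mixes them.
correct_masks = np.array([[1.0] * 5 + [0.0] * 5, [0.0] * 5 + [1.0] * 5])
mixed_masks = np.array([[1.0, 0.0] * 5, [0.0, 1.0] * 5])

correct_loss = low_rank_residual(tracks, correct_masks, rank)
mixed_loss = low_rank_residual(tracks, mixed_masks, rank)
print(correct_loss, mixed_loss)
```

In this toy setting the penalty is close to zero when the masks separate the two objects and clearly positive when each group mixes tracks from both, which is the behaviour a grouping loss of this kind relies on.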


Optimal Sparse Linear Encoders and Sparse PCA

Neural Information Processing Systems

Principal components analysis (PCA) is the optimal linear encoder of data. Sparse linear encoders (e.g., sparse PCA) produce more interpretable features that can promote better generalization.
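As a small illustration of the optimality claim, the sketch below builds the PCA encoder from an SVD and compares its squared reconstruction error with that of a random orthonormal encoder of the same size. The helper names, the synthetic data, and the choice of `k = 2` are assumptions for illustration; sparsity is not modelled here.

```python
import numpy as np

def pca_encoder(data, k):
    """Return a (num_features, k) matrix whose columns are the top-k principal directions."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def reconstruction_error(data, encoder):
    """Squared error after encoding and decoding with an orthonormal encoder."""
    centered = data - data.mean(axis=0)
    codes = centered @ encoder          # encode: project onto the k directions
    reconstructed = codes @ encoder.T   # decode: map back to feature space
    return float(np.sum((centered - reconstructed) ** 2))

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))
k = 2

pca_error = reconstruction_error(data, pca_encoder(data, k))

# A random orthonormal encoder with the same number of components.
random_basis, _ = np.linalg.qr(rng.normal(size=(6, k)))
random_error = reconstruction_error(data, random_basis)

print(pca_error, random_error)
```

By the Eckart-Young theorem, no rank-k orthogonal projection reconstructs the centered data with lower squared error than the PCA one, so `pca_error` is never larger than `random_error`.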


Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs

Neural Information Processing Systems

The exploration bonus is an effective approach to manage the exploration-exploitation trade-off in Markov Decision Processes (MDPs). While it has been analyzed in infinite-horizon discounted and finite-horizon problems, we focus on designing and analysing the exploration bonus in the more challenging infinite-horizon undiscounted setting.
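For readers unfamiliar with the idea, a count-based bonus can be sketched as follows. The `scale / sqrt(count)` form, the `scale` constant, and the function names are assumptions chosen for illustration; the bonus analysed in the paper may differ.

```python
import math

def exploration_bonus(visit_count, scale=1.0):
    """Bonus that shrinks as a state-action pair is visited more often."""
    # Unvisited pairs are treated as if visited once to avoid dividing by zero.
    return scale / math.sqrt(max(1, visit_count))

def optimistic_reward(reward_sum, visit_count, scale=1.0):
    """Empirical mean reward plus the exploration bonus."""
    mean_reward = reward_sum / max(1, visit_count)
    return mean_reward + exploration_bonus(visit_count, scale)

# A rarely visited pair receives a larger bonus than a frequently visited one.
print(exploration_bonus(4), exploration_bonus(400))
print(optimistic_reward(reward_sum=2.0, visit_count=4))
```

Planning with the optimistic reward instead of the empirical mean steers the agent toward state-action pairs it has tried only a few times.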


de594ef5c314372edec29b93cab9d72e-AuthorFeedback.pdf

Neural Information Processing Systems

We would like to thank the reviewers for their insightful comments. Tewari), but the algorithmic approach is very different. Reviewer 2. The best known minimax regret lower bound for discrete communicating MDPs is DSAT and the Politex paper is not simple due to the suboptimal dependence in T. The absence of explicit dependence in Γ is due to As argued in (Fruit et al. Reviewer 4. We show empirically that SCCAL+ is a very stable algorithm: in the experiments we did not optimize This algorithm is inspired by the recent results in finite-horizon, but further work is needed to study its regret. In the final version, we will provide more interpretation of the experiments (in particular in the continuous case).


3D Focusing-and-Matching Network for Multi-Instance Point Cloud Registration

Neural Information Processing Systems

Multi-instance point cloud registration aims to estimate the pose of all instances of a model point cloud in the whole scene. Existing methods all adopt the strategy of first obtaining the global correspondence and then clustering to obtain the pose of each instance. However, due to the cluttered and occluded objects in the scene, it is difficult to obtain an accurate correspondence between the model point cloud and all instances in the scene. To this end, we propose a simple yet powerful 3D focusing-and-matching network for multi-instance point cloud registration by learning the multiple pair-wise point cloud registration. Specifically, we first present a 3D multi-object focusing module to locate the center of each object and generate object proposals.
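To make the pair-wise step concrete, here is a minimal sketch of rigid alignment between one model point cloud and one focused object proposal. It swaps in the classical closed-form Kabsch solution and assumes known point correspondences; the network described above learns this step instead, so the function `kabsch` and the synthetic data are illustrative only.

```python
import numpy as np

def kabsch(source, target):
    """Least-squares rotation and translation mapping `source` onto `target`.

    source, target: (num_points, 3) arrays with row-wise correspondences.
    Returns (rotation, translation) with target ~= source @ rotation.T + translation.
    """
    source_center = source.mean(axis=0)
    target_center = target.mean(axis=0)
    covariance = (source - source_center).T @ (target - target_center)
    u, _, vt = np.linalg.svd(covariance)
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    sign = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    translation = target_center - rotation @ source_center
    return rotation, translation

# Synthetic instance: the model rotated 30 degrees about z and shifted.
rng = np.random.default_rng(0)
model = rng.normal(size=(20, 3))
angle = np.deg2rad(30.0)
true_rotation = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
true_translation = np.array([1.0, -2.0, 0.5])
instance = model @ true_rotation.T + true_translation

estimated_rotation, estimated_translation = kabsch(model, instance)
print(np.round(estimated_rotation, 3), np.round(estimated_translation, 3))
```

Running one such alignment per object proposal yields one pose per instance, which is the sense in which the multi-instance problem is decomposed into several pair-wise registrations.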


Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models

Neural Information Processing Systems

The predominant de facto paradigm of testing ML models relies either on using only held-out data to compute aggregate evaluation metrics or on assessing performance on different subgroups. However, such data-only testing methods operate under the restrictive assumption that the available empirical data is the sole input for testing ML models, disregarding valuable contextual information that could guide model testing. In this paper, we challenge the go-to approach of data-only testing and introduce context-aware testing (CAT), which uses context as an inductive bias to guide the search for meaningful model failures. We instantiate the first CAT system, SMART Testing, which employs large language models to hypothesize relevant and likely failures, which are then evaluated on data using a self-falsification mechanism. Through empirical evaluations in diverse settings, we show that SMART automatically identifies more relevant and impactful failures than alternatives, demonstrating the potential of CAT as a testing paradigm.


Recurrent Registration Neural Networks for Deformable Image Registration

Neural Information Processing Systems

Parametric spatial transformation models have been successfully applied to image registration tasks. In such models, the transformation of interest is parameterized by a fixed set of basis functions, for example B-splines. Each basis function is placed at a fixed position on a regular grid across the image domain because the transformation of interest is not known in advance. As a consequence, not all basis functions will necessarily contribute to the final transformation, which results in a non-compact representation of the transformation.
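The grid-based parameterization can be sketched in one dimension. The cubic B-spline kernel, the 1D setting, the grid spacing, and the function names below are assumptions for illustration; the abstract does not fix these details.

```python
def cubic_bspline(t):
    """Uniform cubic B-spline kernel, non-zero only for |t| < 2."""
    t = abs(t)
    if t < 1.0:
        return 2.0 / 3.0 - t * t + 0.5 * t ** 3
    if t < 2.0:
        return (2.0 - t) ** 3 / 6.0
    return 0.0

def displacement(x, coefficients, spacing):
    """Displacement at x from basis functions centred at i * spacing."""
    return sum(
        coefficient * cubic_bspline(x / spacing - index)
        for index, coefficient in enumerate(coefficients)
    )

spacing = 10.0

# Eight grid points, but only the basis function at x = 30 has a non-zero weight.
local_coefficients = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
near = displacement(30.0, local_coefficients, spacing)
far = displacement(70.0, local_coefficients, spacing)
print(near, far)
```

A local deformation is carried by one or two coefficients here, while the remaining grid positions still have to be stored with zero weight. That is the non-compactness the abstract points out.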


Stochastic Optimal Control Matching

Neural Information Processing Systems

Stochastic optimal control, which has the goal of driving the behavior of noisy systems, is broadly applicable in science, engineering and artificial intelligence. Our work introduces Stochastic Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for stochastic optimal control that stems from the same philosophy as the conditional score matching loss for diffusion models. That is, the control is learned via a least squares problem by trying to fit a matching vector field. The training loss, which is closely connected to the cross-entropy loss, is optimized with respect to both the control function and a family of reparameterization matrices which appear in the matching vector field. The optimization with respect to the reparameterization matrices aims at minimizing the variance of the matching vector field. Experimentally, our algorithm achieves lower error than all the existing IDO techniques for stochastic optimal control for three out of four control problems, in some cases by an order of magnitude. The key idea underlying SOCM is the path-wise reparameterization trick, a novel technique that may be of independent interest.


Figure 6: The designed prompt of automatic evaluation for Task 3

Neural Information Processing Systems

Prompt fragments (two panels, text truncated): "Give you a sentence or question that contains some irrationality or humor. Give you four options, you need to choose the one that best ..." and "Give you a sentence or question that contains some irrationality or humor. You need to choose a type from the "candidate types" that best fits ..."

Figure 4: Our designed prompts without the Chain-of-Thought idea. Task 3(a) is for the texts that are not expressed in the form of inquiries. Task 3(b) is for inquiries.


When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models Yinghui Li

Neural Information Processing Systems

Recently, Large Language Models (LLMs) have made remarkable progress in language understanding and generation. Following this, various benchmarks for measuring all kinds of LLM capabilities have sprung up. In this paper, we challenge the reasoning and understanding abilities of LLMs by proposing a FaLlacy Understanding Benchmark (FLUB) containing cunning texts that are easy for humans to understand but difficult for models to grasp. Specifically, the cunning texts that FLUB focuses on mainly consist of tricky, humorous, and misleading texts collected from the real internet environment. We design three tasks of increasing difficulty in the FLUB benchmark to evaluate the fallacy understanding ability of LLMs. Based on FLUB, we investigate the performance of multiple representative and advanced LLMs; the results show that FLUB is challenging and worthy of further study. Our extensive experiments and detailed analyses yield interesting discoveries and valuable insights. We hope that our benchmark can encourage the community to improve LLMs' ability to understand fallacies. Our data and code are available at https://github.com/THUKElab/FLUB.