Goto

Collaborating Authors

 annotation


Dual Swap Disentangling

Neural Information Processing Systems

Learning interpretable disentangled representations is a crucial yet challenging task. In this paper, we propose a weakly semi-supervised method, termed as Dual Swap Disentangling (DSD), for disentangling using both labeled and unlabeled data. Unlike conventional weakly supervised methods that rely on full annotations on the group of samples, we require only limited annotations on paired samples that indicate their shared attribute like the color. Our model takes the form of a dual autoencoder structure. To achieve disentangling using the labeled pairs, we follow a encoding-swap-decoding'' process twice on designated encoding parts and enforce the final outputs to approximate the input pairs. By isolating parts of the encoding and swapping them back and forth, we impose the dimension-wise modularity and portability of the encodings of the unlabeled samples, which implicitly encourages disentangling under the guidance of labeled pairs. This dual swap mechanism, tailored for semi-supervised setting, turns out to be very effective.




Category

Neural Information Processing Systems

Estimating the 6D object pose is one of the core problems in computer vision and robotics. It predicts the full configurations of rotation, translation and size of a given object, which has wide applications including Virtual Reality (VR) [2], scene understanding [30], and [42, 57, 31, 49]. There are twodirections in 6D object pose estimation.




RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars Dongwei Pan

Neural Information Processing Systems

Synthesizing high-fidelity head avatars is a central problem for computer vision and graphics. While head avatar synthesis algorithms have advanced rapidly, the best ones still face great obstacles in real-world scenarios.


ImOV3D: LearningOpen-VocabularyPointClouds 3DObjectDetectionfromOnly2DImages

Neural Information Processing Systems

Open-vocabulary 3D object detection (OV-3Det) aims to generalize beyond the limited number ofbasecategories labeled during thetraining phase. Thebiggest bottleneck is the scarcity of annotated 3D data, whereas 2D image datasets are abundantandrichlyannotated.