keypoint
Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning
This paper presents KeypointNet, an end-to-end geometric reasoning framework that learns an optimal set of category-specific keypoints, along with their detectors, to predict 3D keypoints from a single 2D input image. We demonstrate this framework on the 3D pose estimation task by proposing a differentiable pose objective that seeks the optimal set of keypoints for recovering the relative pose between two views of an object. Our network automatically discovers a consistent set of keypoints across viewpoints of a single object as well as across all object instances of a given object class. Importantly, we find that our end-to-end approach, which uses no ground-truth keypoint annotations, outperforms a fully supervised baseline using the same neural network architecture for the pose estimation task. The discovered 3D keypoints across the car, chair, and plane categories of ShapeNet are visualized at https://keypoints.github.io/
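To make "recovering the relative pose between two views" from corresponding 3D keypoints concrete, here is a minimal sketch that uses the standard Kabsch (orthogonal Procrustes) solution. It is an illustration under assumptions, not necessarily the paper's exact pose objective: the function name `relative_rotation` and the synthetic keypoints are hypothetical, and the keypoints are assumed to be already matched by index.

```python
import numpy as np

def relative_rotation(kp_a, kp_b):
    """Least-squares rotation R with R @ a_i ~= b_i after centering (Kabsch)."""
    a = kp_a - kp_a.mean(axis=0)          # remove translation from view A
    b = kp_b - kp_b.mean(axis=0)          # remove translation from view B
    h = a.T @ b                           # 3x3 cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))
    correction = np.diag([1.0, 1.0, d])   # guard against a reflection
    return vt.T @ correction @ u.T

# Synthetic check: rotate hypothetical keypoints by 30 degrees about the z-axis.
rng = np.random.default_rng(0)
keypoints_view_a = rng.normal(size=(10, 3))
theta = np.deg2rad(30.0)
true_rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
keypoints_view_b = keypoints_view_a @ true_rotation.T + np.array([0.1, -0.2, 0.3])

estimated_rotation = relative_rotation(keypoints_view_a, keypoints_view_b)
rotation_error = np.abs(estimated_rotation - true_rotation).max()
```

Because every step (centering, SVD, matrix products) is differentiable almost everywhere, a closed-form solver of this kind can in principle sit inside a training loss, which is the sense in which a pose objective can be "differentiable".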
Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions - Supplementary Materials - Rawal Khirodkar
For video demos of Harmony4D, please visit the Harmony4D website. Please do not share the dataset with anyone, as it is not yet publicly available. Harmony4D is a 75-minute video dataset collected using over 20 equidistant, synchronized GoPro cameras. It consists of 1.66M images and 3.32M human instances, divided into 1.28M images for [...] We manually clipped the videos into 208 sequences across 6 different activities, ensuring each sequence is at least 5 seconds (100 frames) long for temporal continuity. The 2D bboxes are derived from projected SMPL human vertices.
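As an illustration of deriving a 2D box from projected mesh vertices, the sketch below projects 3D points with an ideal pinhole camera and takes the extremes of the resulting pixel coordinates. The function name `bbox_from_vertices`, the intrinsics, and the three sample vertices are hypothetical. Lens distortion and clipping to the image bounds are ignored here; the dataset's actual pipeline may handle both.

```python
import numpy as np

def bbox_from_vertices(vertices_cam, intrinsics):
    """Return (x_min, y_min, x_max, y_max) in pixels.

    vertices_cam: (N, 3) points in camera coordinates, all with z > 0.
    intrinsics:   (3, 3) pinhole camera matrix K.
    """
    projected = vertices_cam @ intrinsics.T          # homogeneous pixel coords
    pixels = projected[:, :2] / projected[:, 2:3]    # perspective divide
    x_min, y_min = pixels.min(axis=0)
    x_max, y_max = pixels.max(axis=0)
    return float(x_min), float(y_min), float(x_max), float(y_max)

# Hypothetical intrinsics: focal length 1000 px, principal point (960, 540).
camera_matrix = np.array([
    [1000.0,    0.0, 960.0],
    [   0.0, 1000.0, 540.0],
    [   0.0,    0.0,   1.0],
])
# Stand-in for SMPL vertices: three points a few meters in front of the camera.
sample_vertices = np.array([
    [-0.5, -1.0, 4.0],
    [ 0.5,  1.0, 4.0],
    [ 0.0,  0.0, 5.0],
])
bbox = bbox_from_vertices(sample_vertices, camera_matrix)
# bbox == (835.0, 290.0, 1085.0, 790.0)
```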
Consensus Learning with Deep Sets for Essential Matrix Estimation
Robust estimation of the essential matrix, which encodes the relative position and orientation of two cameras, is a fundamental step in structure-from-motion pipelines. Recent deep learning-based methods have achieved accurate estimation by using complex network architectures that involve graphs, attention layers, and hard pruning steps.
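For readers unfamiliar with how the essential matrix "encodes" relative pose, the following sketch builds E = [t]x R from a known rotation and translation and checks the epipolar constraint x2^T E x1 = 0 on noise-free synthetic correspondences. It illustrates the textbook definition only, not the learned estimator the abstract describes. The helper names and the synthetic scene are assumptions.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([
        [0.0,  -t[2],  t[1]],
        [t[2],   0.0, -t[0]],
        [-t[1], t[0],   0.0],
    ])

def essential_from_pose(rotation, translation):
    """E = [t]x R for the convention X2 = R @ X1 + t (camera 1 to camera 2)."""
    return skew(translation) @ rotation

# Hypothetical relative pose: 10 degree rotation about the y-axis plus a shift.
angle = np.deg2rad(10.0)
rotation = np.array([
    [ np.cos(angle), 0.0, np.sin(angle)],
    [ 0.0,           1.0, 0.0          ],
    [-np.sin(angle), 0.0, np.cos(angle)],
])
translation = np.array([1.0, 0.0, 0.2])
essential = essential_from_pose(rotation, translation)

# Synthetic 3D points in front of camera 1, expressed in both camera frames.
rng = np.random.default_rng(1)
points_cam1 = rng.uniform([-1.0, -1.0, 2.0], [1.0, 1.0, 6.0], size=(20, 3))
points_cam2 = points_cam1 @ rotation.T + translation

# Normalized image coordinates (x, y, 1) in each view.
x1 = points_cam1 / points_cam1[:, 2:3]
x2 = points_cam2 / points_cam2[:, 2:3]

# Epipolar residual x2^T E x1 for each correspondence; zero without noise.
residuals = np.einsum("ni,ij,nj->n", x2, essential, x1)
```

With real, noisy matches that include outliers, these residuals are no longer zero, which is why a robust estimation step is needed in practice.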