animal pose
Deep Graph Pose: a semi-supervised deep graphical model for improved animal pose tracking
Noninvasive behavioral tracking of animals is crucial for many scientific investigations. Recent transfer learning approaches for behavioral tracking have considerably advanced the state of the art. Typically these methods treat each video frame and each object to be tracked independently. In this work, we improve on these methods (particularly in the regime of few training labels) by leveraging the rich spatiotemporal structures pervasive in behavioral video: specifically, the spatial statistics imposed by physical constraints (e.g., paw to elbow distance), and the temporal statistics imposed by smoothness from frame to frame. We propose a probabilistic graphical model built on top of deep neural networks, Deep Graph Pose (DGP), to leverage these useful spatial and temporal constraints, and develop an efficient structured variational approach to perform inference in this model. The resulting semi-supervised model exploits both labeled and unlabeled frames to achieve significantly more accurate and robust tracking while requiring users to label fewer training frames. In turn, these tracking improvements enhance performance on downstream applications, including robust unsupervised segmentation of behavioral "syllables," and estimation of interpretable "disentangled" low-dimensional representations of the full behavioral video.
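To make the two kinds of constraint concrete, here is a minimal sketch of how temporal smoothness and a paw-to-elbow distance constraint could be scored for a set of 2D keypoint tracks. It is an illustration only, not the DGP model or its structured variational inference: the function names, the quadratic form of the penalties, the weights, and the `rest_length` reference distance are all assumptions made for this example.

```python
import math

def temporal_penalty(track):
    """Sum of squared frame-to-frame displacements for one keypoint.

    track: list of (x, y) positions, one per frame (hypothetical format).
    """
    return sum(
        (x1 - x0) ** 2 + (y1 - y0) ** 2
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    )

def spatial_penalty(track_a, track_b, rest_length):
    """Sum of squared deviations of the a-to-b distance from a reference length.

    rest_length is an assumed, user-supplied typical distance between the two
    body parts (e.g. paw to elbow), in pixels.
    """
    return sum(
        (math.dist(a, b) - rest_length) ** 2
        for a, b in zip(track_a, track_b)
    )

def total_penalty(tracks, edges, w_temporal=1.0, w_spatial=1.0):
    """Weighted sum of temporal and spatial penalties.

    tracks: dict mapping body-part name -> list of (x, y) per frame.
    edges:  list of (name_a, name_b, rest_length) tuples.
    The weights are illustrative hyperparameters, not values from the paper.
    """
    temporal = sum(temporal_penalty(t) for t in tracks.values())
    spatial = sum(
        spatial_penalty(tracks[a], tracks[b], d) for a, b, d in edges
    )
    return w_temporal * temporal + w_spatial * spatial

# A smooth trajectory versus one with a single-frame tracking jump.
smooth = {
    "paw": [(0, 0), (1, 0), (2, 0)],
    "elbow": [(0, 3), (1, 3), (2, 3)],
}
jumpy = {
    "paw": [(0, 0), (5, 0), (2, 0)],
    "elbow": [(0, 3), (1, 3), (2, 3)],
}
edges = [("paw", "elbow", 3.0)]

smooth_score = total_penalty(smooth, edges)
jumpy_score = total_penalty(jumpy, edges)
print(smooth_score, jumpy_score)
```

The jumpy trajectory scores higher on both terms: the paw moves far between frames, and in the middle frame it sits farther from the elbow than the reference distance allows. A tracker could use penalties of this kind on unlabeled frames to discourage implausible predictions.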
Review for NeurIPS paper: Deep Graph Pose: a semi-supervised deep graphical model for improved animal pose tracking
Weaknesses: - How much do the spatial and temporal potentials matter? The paper conducts experiments on DLC semi (supervised Gaussian regularization) and DGP; however, the influence of the spatial and temporal potentials is not evaluated independently. This seems like an informative ablation study to do, especially since the paper claims that its difference from prior work is that prior work does not consider temporal and spatial priors. A recent work, OptiFlex by Liu et al., also uses temporal information and should be cited. The method is only compared against a fully supervised method (DLC) and a baseline semi-supervised method that is an ablated version of the proposed approach (no temporal and structural priors).
Review for NeurIPS paper: Deep Graph Pose: a semi-supervised deep graphical model for improved animal pose tracking
This submission proposes a method for animal 2D pose estimation and tracking given limited amounts of ground truth annotations. It initially received four reviews with diverging scores (5, 6, 7, 4), which remained unchanged after the rebuttal. The reviewers appreciated the importance of the application, the solid empirical performance compared to DeepLabCut (including tests on downstream tasks), and the insightful analysis of the learned representations. At the same time, the main concerns of the reviewers were limited methodological novelty beyond applying known methods to the new domain of animal tracking, as well as limitations in the empirical studies. This case was further discussed between the AC and the SAC, who arrived at the conclusion that the merits of this submission in advancing animal tracking outweigh its limitations. The final recommendation is to accept as a poster.
Generative Zoo
Niewiadomski, Tomasz, Yiannakidis, Anastasios, Cuevas-Velasquez, Hanz, Sanyal, Soubhik, Black, Michael J., Zuffi, Silvia, Kulits, Peter
The model-based estimation of 3D animal pose and shape from images enables computational modeling of animal behavior. Training models for this purpose requires large amounts of labeled image data with precise pose and shape annotations. However, capturing such data requires the use of multi-view or marker-based motion-capture systems, which are impractical to adapt to wild animals in situ and impossible to scale across a comprehensive set of animal species. Some have attempted to address the challenge of procuring training data by pseudo-labeling individual real-world images through manual 2D annotation, followed by 3D-parameter optimization to those labels. While this approach may produce silhouette-aligned samples, the obtained pose and shape parameters are often implausible due to the ill-posed nature of the monocular fitting problem. Sidestepping real-world ambiguity, others have designed complex synthetic-data-generation pipelines leveraging video-game engines and collections of artist-designed 3D assets. Such engines yield perfect ground-truth annotations but are often lacking in visual realism and require considerable manual effort to adapt to new species or environments. Motivated by these shortcomings, we propose an alternative approach to synthetic-data generation: rendering with a conditional image-generation model. We introduce a pipeline that samples a diverse set of poses and shapes for a variety of mammalian quadrupeds and generates realistic images with corresponding ground-truth pose and shape parameters. To demonstrate the scalability of our approach, we introduce GenZoo, a synthetic dataset containing one million images of distinct subjects. We train a 3D pose and shape regressor on GenZoo, which achieves state-of-the-art performance on a real-world animal pose and shape estimation benchmark, despite being trained solely on synthetic data. https://genzoo.is.tue.mpg.de
Machine learning animal poses to understand behavior
Studying animal behavior can reveal how animals make decisions based on what they sense in their environment, but measuring behavior can be difficult and time-consuming. Computer programs that measure and analyze animal movement have made these studies faster and easier to complete. These tools have also made more advanced behavioral experiments possible, which have yielded new insights about how the brain organizes behavior. Recently, scientists have started using new machine learning tools called deep neural networks to measure animal behavior. These tools learn to measure animal posture – the positions of an animal's body parts in space – directly from real data, such as images or videos, without being explicitly programmed with instructions to perform the task.