Overview

Neural Information Processing Systems 

This supplementary material presents further details and additional results that were omitted from the main paper because of the page limit. It includes:

- Description of augmentation settings for robustness benchmarking in Section A
- More experiment setup and details in Section B
- Comparison with PyMAF-X in Section C
- Analysis of the subpar performance on the AGORA test set in Section D
- Ablation of different modules on the Body subnetwork in Section E
- Quantitative and qualitative comparisons for pixel alignment in Section F
- Examples of failure cases in Section G
- Analysis of embedding similarity in Section H
- Discussion on pose (rotation) versus keypoint representation in Section I
- Extra comparisons against SOTA body networks in Section J
- Training and inference time in Section K
- Accuracy of derived part bounding boxes in Section L
- Qualitative comparisons of RoboSMPLX's Hand, Face and Body subnetworks under augmentations in Section M
- Quantitative and qualitative comparisons of RoboSMPLX's wholebody model in Section N

In selecting augmentations, we chose a set of ten commonly encountered augmentations that can be benchmarked in a controlled setting. We also ensured that the values selected for each manipulation fall within a realistic range. We used the following augmentations:

1. Vertical translation: We shifted the image by a factor relative to the image size. For instance, a +0.1 shift corresponds to a 10% upward movement, while a -0.1 shift corresponds to a 10% downward movement.