Direct Multi-view Multi-person 3D Pose Estimation (Supplementary Material)

Tao Wang

Aug-15-2025, 01:24:03 GMT–Neural Information Processing Systems

Figure S1: (a) Illustration of the proposed hierarchical query embedding and the input-dependent query adaptation schemes. It consists of a self-attention, a projective attention and a feed-forward network (FFN) with residual connections.

Fig. S1 (a) illustrates our proposed hierarchical query [...]

The decoder of the MvP transformer consists of multiple decoder layers that regress 3D joint locations progressively. Fig. S1 (b) shows the detailed architecture of a decoder layer [...]

Results are shown in Table S1.

Table S1: Results of replacing camera ray directions with 2D coordinates in RayConv. Columns: Positional Input, AP.

We further investigate the effectiveness of the proposed projective attention by comparing it with dense dot-product attention, i.e., conducting [...] Results are given in Table S2.
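For readers unfamiliar with the baseline named above, the following is a minimal sketch of standard scaled dense dot-product attention, in which every query attends to every key. It is not the MvP implementation: the function names, the list-of-lists layout, and the toy inputs are assumptions made for illustration. The projective attention itself is not sketched, because this excerpt does not describe how it is computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense_dot_product_attention(queries, keys, values):
    """Scaled dot-product attention where each query attends to all keys.

    queries: list of vectors of length d
    keys:    list of vectors of length d
    values:  list of vectors (one per key)
    Returns one output vector per query.
    """
    d = len(keys[0])
    scale = 1.0 / math.sqrt(d)
    outputs = []
    for q in queries:
        # Similarity between this query and every key (the "dense" part).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Hypothetical toy input: one query, two keys, two values.
toy_output = dense_dot_product_attention(
    queries=[[0.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[2.0, 0.0], [0.0, 4.0]],
)
```

Because each query is scored against every key, the cost grows with the product of the number of queries and the number of keys.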



