Review for NeurIPS paper: Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation

Neural Information Processing Systems 

Weaknesses: Though the authors offer many insights into why ISO performs well, I still have questions about the Shared Feature Extractor, the SSL Head, and the FSL Head. Since the SSL component comes from existing work and the main contribution is the combination of SSL with FSL, answering these questions clearly is important. What kind of features or information is shared in the Shared Feature Extractor? How much does it diverge when trained on new target data, such that it causes the FSL head to fail? What information is retained in the FSL head?