Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation
Neural Information Processing Systems
We introduce Skeleton-Cache, the first training-free test-time adaptation framework for skeleton-based zero-shot action recognition (SZAR), aimed at improving model generalization to unseen actions during inference. Skeleton-Cache reformulates inference as a lightweight retrieval process over a non-parametric cache that stores structured skeleton representations, combining both global and fine-grained local descriptors. To guide the fusion of descriptor-wise predictions, we leverage the semantic reasoning capabilities of large language models (LLMs) to assign class-specific importance weights. By integrating these structured descriptors with LLM-guided semantic priors, Skeleton-Cache dynamically adapts to unseen actions without any additional training or access to training data. Extensive experiments on NTU RGB+D 60/120 and PKU-MMD II demonstrate that Skeleton-Cache consistently boosts the performance of various SZAR backbones under both zero-shot and generalized zero-shot settings.
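To make the retrieval-and-fusion idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes one cache per descriptor (for example a global descriptor and one local body-part descriptor), cosine similarity for retrieval, an exponential affinity with a hypothetical `sharpness` parameter, and a table of class-specific weights standing in for the LLM-assigned priors. The names `DescriptorCache`, `fuse`, and all values are illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class DescriptorCache:
    """Non-parametric cache for one descriptor type (assumed design).

    Stores (feature, pseudo_label) pairs seen at test time; nothing is trained.
    """

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self.entries = []

    def add(self, feature, pseudo_label):
        self.entries.append((list(feature), pseudo_label))

    def predict(self, query, sharpness=5.0):
        """Per-class scores: summed affinity between the query and cached entries."""
        scores = [0.0] * self.num_classes
        for feature, label in self.entries:
            # Affinity is 1 for identical directions and decays as similarity drops.
            scores[label] += math.exp(sharpness * (cosine(query, feature) - 1.0))
        return scores


def fuse(per_descriptor_scores, class_weights):
    """Weighted sum of descriptor-wise scores.

    class_weights[name][c] is the importance of descriptor `name` for class c;
    here it is a hand-written stand-in for LLM-assigned weights.
    """
    num_classes = len(next(iter(per_descriptor_scores.values())))
    fused = [0.0] * num_classes
    for name, scores in per_descriptor_scores.items():
        for c in range(num_classes):
            fused[c] += class_weights[name][c] * scores[c]
    return fused


# Toy usage with two classes and two descriptors (all values are made up).
caches = {"global": DescriptorCache(2), "hands": DescriptorCache(2)}
for cache in caches.values():
    cache.add([1.0, 0.0], 0)
    cache.add([0.0, 1.0], 1)

queries = {"global": [1.0, 0.1], "hands": [0.1, 1.0]}
scores = {name: caches[name].predict(queries[name]) for name in caches}

# Hypothetical prior: the hands descriptor matters more for class 1.
weights = {"global": [0.5, 0.2], "hands": [0.5, 0.8]}
fused = fuse(scores, weights)
prediction = max(range(len(fused)), key=fused.__getitem__)
```

In this toy case the two descriptors disagree, and the class-specific weights decide the outcome, which is the role the abstract attributes to the LLM-guided priors. How the actual method builds, updates, and queries its cache is not specified in the abstract.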
Jun-19-2026, 23:22:52 GMT
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.67)
- Industry:
- Information Technology (0.67)
- Technology: