A Appendix Overview
Neural Information Processing Systems
We add position embeddings and three kinds of token type embeddings (word token, context patch token, and region patch token) to them. We then apply three transformer blocks to jointly encode the input sequence, and use the output [CLS] token to predict the Shapley interaction estimate and the corresponding uncertainty separately.
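The encoding step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the hidden size, maximum length, single-head attention, pre-norm residual layout, random initialization, the softplus used to keep the uncertainty positive, and all names (`predict`, `params`, `w_est`, `w_unc`) are assumptions made for illustration. Only the three token types, the position embeddings, the three transformer blocks, and the two separate predictions read from the [CLS] output come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 16          # hidden size (assumed, not from the source)
MAX_LEN = 32    # maximum sequence length including [CLS] (assumed)
NUM_LAYERS = 3  # three transformer blocks, as stated in the text
WORD, CONTEXT_PATCH, REGION_PATCH = 0, 1, 2  # the three token types


def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def init_block():
    # Random weights stand in for trained parameters.
    s = 1.0 / np.sqrt(D)
    return {
        "wq": rng.normal(0, s, (D, D)),
        "wk": rng.normal(0, s, (D, D)),
        "wv": rng.normal(0, s, (D, D)),
        "wo": rng.normal(0, s, (D, D)),
        "w1": rng.normal(0, s, (D, 4 * D)),
        "w2": rng.normal(0, 1.0 / np.sqrt(4 * D), (4 * D, D)),
    }


def transformer_block(x, p):
    # Single-head self-attention with a pre-norm residual (assumed layout).
    h = layer_norm(x)
    q, k, v = h @ p["wq"], h @ p["wk"], h @ p["wv"]
    attn = softmax(q @ k.T / np.sqrt(D))
    x = x + (attn @ v) @ p["wo"]
    # Position-wise feed-forward network with ReLU.
    h = layer_norm(x)
    return x + np.maximum(h @ p["w1"], 0.0) @ p["w2"]


params = {
    "pos": rng.normal(0, 0.02, (MAX_LEN, D)),   # position embeddings
    "type": rng.normal(0, 0.02, (3, D)),        # one row per token type
    "cls": rng.normal(0, 0.02, (D,)),           # [CLS] embedding
    "blocks": [init_block() for _ in range(NUM_LAYERS)],
    "w_est": rng.normal(0, 0.02, (D,)),         # head for the estimate
    "w_unc": rng.normal(0, 0.02, (D,)),         # head for the uncertainty
}


def predict(features, type_ids, p=params):
    """features: (n, D) word/patch features; type_ids: n ints in {0, 1, 2}."""
    features = np.asarray(features, dtype=float)
    type_ids = np.asarray(type_ids)
    x = features + p["type"][type_ids]  # add token type embeddings
    x = np.vstack([p["cls"], x])        # prepend the [CLS] token
    x = x + p["pos"][: len(x)]          # add position embeddings
    for block in p["blocks"]:
        x = transformer_block(x, block)
    cls_out = layer_norm(x[0])          # output at the [CLS] position
    estimate = float(cls_out @ p["w_est"])
    # Softplus keeps the predicted uncertainty positive (an assumption).
    uncertainty = float(np.log1p(np.exp(cls_out @ p["w_unc"])))
    return estimate, uncertainty
```

A call such as `predict(np.ones((5, D)), [WORD, WORD, CONTEXT_PATCH, REGION_PATCH, REGION_PATCH])` returns an `(estimate, uncertainty)` pair of floats. Because the weights are random rather than trained, the values themselves carry no meaning; the sketch only shows how the inputs are combined and where the two outputs are read.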
May-29-2025, 04:48:43 GMT