Two-Stream Network for Sign Language Recognition and Translation

Neural Information Processing Systems

We adopt identical data augmentations for RGB videos and heatmap sequences to maintain spatial and temporal consistency. SingleStream-SLT, which uses only a single video encoder without modelling keypoints, serves as our baseline. TwoStream-SLT-V/K/J denotes the variant in which only one translation network is attached to the video head, keypoint head, or joint head, respectively. The averaged probabilities are used to decode text sequences.
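The probability averaging mentioned in the excerpt can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each head has already produced a per-step distribution over a shared vocabulary, and it uses greedy decoding for simplicity, whereas a real translation network decodes autoregressively (typically with beam search). The function names `average_probabilities` and `greedy_decode` and the data layout are hypothetical.

```python
from typing import List, Sequence


def average_probabilities(
    head_probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Average token distributions across heads.

    head_probs[h][t][v] is the probability that head h (hypothetical
    layout: video, keypoint, joint) assigns to vocabulary entry v at
    decoding step t. All heads must share step count and vocabulary.
    """
    num_heads = len(head_probs)
    num_steps = len(head_probs[0])
    vocab_size = len(head_probs[0][0])
    return [
        [
            sum(head_probs[h][t][v] for h in range(num_heads)) / num_heads
            for v in range(vocab_size)
        ]
        for t in range(num_steps)
    ]


def greedy_decode(avg_probs: Sequence[Sequence[float]], vocab: Sequence[str]) -> List[str]:
    """Pick the most probable token at each step (a simplification)."""
    return [
        vocab[max(range(len(step)), key=step.__getitem__)]
        for step in avg_probs
    ]
```

For example, with a three-word vocabulary and three heads that each emit two steps of probabilities, `greedy_decode(average_probabilities([video, keypoint, joint]), vocab)` returns the token sequence favoured by the averaged distributions, even when a single head disagrees at some step.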



A Influence Function on Bias and Extension to

Neural Information Processing Systems

For real-world visual datasets (such as facial datasets or ImageNet), the unavailability of strict counterfactual data is a common challenge.