Textless NLP -- Zero Resource Challenge with Low Resource Compute

Ramadass, Krithiga; Singh, Abrit Pal; J, Srihari; Kalyani, Sheetal

arXiv.org Artificial Intelligence 

The availability of text data for low-resource languages has always been a challenge, and transfer learning from multilingual models has its own limitations. End-to-end spoken systems that do not involve text have received significant attention in recent years. The Zero-Resource Challenge (ZRC) [1] has enabled work on the low-resource language representation problem and has been a significant driver in this area. In the acoustic unit discovery task for ZRC, high-dimensional input speech data is mapped to its latent representation to ...

... Coding (VQ-CPC) [8] as the encoder in our speech processing pipeline. The input audio files are preprocessed and extracted as log-Mel spectrograms. The initial processing involves convolution and normalization layers to extract high-level features. These features are then passed through an autoregressive network, which predicts future representations of the input based on past information. One of the key characteristics of VQ-CPC is its use of vector quantization as a bottleneck to discretize the continuous embeddings extracted by the autoregressive network into a finite set of discrete codes.
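The vector quantization bottleneck described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `vector_quantize`, the toy codebook, and the frame values are all hypothetical, and the sketch assumes nearest-neighbour assignment under squared Euclidean distance. It covers only the lookup step; training details such as codebook updates and the contrastive objective are omitted.

```python
import numpy as np


def vector_quantize(embeddings, codebook):
    """Assign each continuous embedding to its nearest codebook entry.

    embeddings: (T, D) array, one D-dimensional vector per frame.
    codebook:   (K, D) array of K code vectors (fixed here, learned in practice).
    Returns (indices, quantized): a (T,) integer array of code ids and the
    (T, D) array of the selected code vectors.
    """
    # Pairwise squared Euclidean distances between frames and codes: (T, K).
    diffs = embeddings[:, None, :] - codebook[None, :, :]
    distances = np.sum(diffs ** 2, axis=-1)
    # The discrete unit for each frame is the index of the closest code.
    indices = np.argmin(distances, axis=1)
    quantized = codebook[indices]
    return indices, quantized


# Toy example (assumed values): four 2-D codes and four noisy frames.
codebook = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
frames = np.array([[0.1, -0.1], [0.9, 0.2], [0.8, 0.95], [0.05, 0.9]])

indices, quantized = vector_quantize(frames, codebook)
print(indices)    # [0 2 3 1]
print(quantized)  # each row is the codebook vector chosen for that frame
```

The integer sequence in `indices` plays the role of the discrete acoustic units: each frame is replaced by one of a finite set of codes, which is what makes the bottleneck discrete.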