Sharma, Alok
Learnings from Technological Interventions in a Low Resource Language: Enhancing Information Access in Gondi
Mehta, Devansh, Diddee, Harshita, Saxena, Ananya, Shukla, Anurag, Santy, Sebastin, Mothilal, Ramaravind Kommiya, Srivastava, Brij Mohan Lal, Sharma, Alok, Prasad, Vishnu, U, Venkanna, Bali, Kalika
The primary obstacle to developing technologies for low-resource languages is the lack of representative, usable data. In this paper, we report the deployment of technology-driven data collection methods for creating a corpus of more than 60,000 translations from Hindi to Gondi, a low-resource vulnerable language spoken by around 2.3 million tribal people in south and central India. During this process, we help expand information access in Gondi along two dimensions: (a) creating linguistic resources that can be used by the community, such as a dictionary, children's stories, Gondi translations from multiple sources, and an Interactive Voice Response (IVR) based mass awareness platform; and (b) enabling its use in the digital domain by developing a Hindi-Gondi machine translation model, which is compressed by nearly 4 times to enable its deployment on low-resource edge devices and in areas with little to no internet connectivity. We also present preliminary evaluations of using the developed machine translation model to assist volunteers who are involved in collecting more data for the target language. Through these interventions, we not only created a refined and evaluated corpus of 26,240 Hindi-Gondi translations that was used for building the translation model but also engaged nearly 850 community members who can help take Gondi onto the internet.
How to Backpropagate through Hungarian in Your DETR?
Chen, Lingji, Sharma, Alok, Shirore, Chinmay, Zhang, Chengjie, Buddharaju, Balarama Raju
The DEtection TRansformer (DETR) approach, which uses a transformer encoder-decoder architecture and a set-based global loss, has become a building block in many transformer-based applications. However, as originally presented, the assignment cost and the global loss are not aligned, i.e., reducing the former is likely but not guaranteed to reduce the latter. Moreover, the issue of gradients is ignored when a combinatorial solver such as the Hungarian algorithm is used. In this paper, we show that the global loss can be expressed as the sum of an assignment-independent term and an assignment-dependent term, the latter of which can be used to define the assignment cost matrix. Recent results on generalized gradients of the optimal assignment cost with respect to the parameters of an assignment problem are then used to define generalized gradients of the loss with respect to network parameters, so that backpropagation is carried out properly. Our experiments using the same loss weights show interesting convergence properties and a potential for further performance improvements.
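The idea of differentiating an optimal assignment cost can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a small square cost matrix, replaces the Hungarian solver with a brute-force search over permutations, and relies only on the general fact that, where the optimal assignment is unique, the derivative of the optimal cost with respect to each cost entry is 1 for matched pairs and 0 otherwise. The function names `optimal_assignment` and `assignment_gradient` and the example matrix are hypothetical.

```python
from itertools import permutations


def optimal_assignment(cost):
    """Brute-force minimum-cost assignment for a small square cost matrix.

    Stands in for a Hungarian solver; only practical for tiny matrices.
    Returns (perm, total) where row i is matched to column perm[i].
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost


def assignment_gradient(cost):
    """Gradient of the optimal cost with respect to the cost entries.

    Assumes the optimal assignment is unique; the result is then the
    0/1 indicator matrix of the matched (row, column) pairs.
    """
    perm, _ = optimal_assignment(cost)
    n = len(cost)
    return [[1.0 if perm[i] == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical 3x3 cost matrix (rows: predictions, columns: targets).
cost = [
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
]

perm, value = optimal_assignment(cost)
grad = assignment_gradient(cost)


def finite_difference(cost, i, j, eps=1e-3):
    """Numerical derivative of the optimal cost with respect to cost[i][j]."""
    bumped = [row[:] for row in cost]
    bumped[i][j] += eps
    return (optimal_assignment(bumped)[1] - value) / eps
```

In a network, the cost entries would themselves depend on the parameters, so this indicator matrix would be chained with the gradients of the individual cost entries. When several assignments tie for the optimum, the gradient is no longer unique, which is where generalized gradients come in.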
Memory Capacity of Neural Turing Machines with Matrix Representation
Renanse, Animesh, Chandra, Rohitash, Sharma, Alok
It is well known that recurrent neural networks (RNNs) face limitations in learning long-term dependencies, which have been addressed by memory structures in long short-term memory (LSTM) networks. Matrix neural networks feature a matrix representation, which inherently preserves the spatial structure of data and has the potential to provide better memory structures than canonical neural networks that use a vector representation. Neural Turing machines (NTMs) are novel RNNs that implement the notion of programmable computers with neural network controllers and can learn algorithms for tasks such as copying, sorting, and associative recall. In this paper, we study the augmentation of memory capacity with a matrix representation of RNNs and NTMs (MatNTMs). We investigate whether the matrix representation has a better memory capacity than the vector representation used in conventional neural networks. We use a probabilistic model of memory capacity based on Fisher information and investigate how the memory capacity of matrix representation networks is limited under various constraints and, in general, without any constraints. In the case of memory capacity without any constraints, we found the upper bound on memory capacity to be $N^2$ for an $N\times N$ state matrix. The results from our experiments using synthetic algorithmic tasks show that MatNTMs have a better learning capacity than their counterparts.