Joshi, Aditya
Natural Language Processing for Dialects of a Language: A Survey
Joshi, Aditya, Dabre, Raj, Kanojia, Diptesh, Li, Zhuang, Zhan, Haolan, Haffari, Gholamreza, Dippold, Doris
State-of-the-art natural language processing (NLP) models are trained on massive training corpora, and report superlative performance on evaluation datasets. This survey delves into an important attribute of these datasets: the dialect of a language. Motivated by the performance degradation of NLP models for dialectal datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets and approaches. We describe a wide range of NLP tasks in terms of two categories: natural language understanding (NLU) (for tasks such as dialect classification, sentiment analysis, parsing, and NLU benchmarks) and natural language generation (NLG) (for summarisation, machine translation, and dialogue systems). The survey is also broad in its coverage of languages, which include English, Arabic, and German, among others. We observe that past work in NLP concerning dialects goes deeper than mere dialect classification: it spans early approaches based on sentence transduction through to recent approaches that integrate hypernetworks into LoRA. We expect that this survey will be useful to NLP researchers interested in building equitable language technologies by rethinking LLM benchmarks and model architectures.
Overview of the 2023 ICON Shared Task on Gendered Abuse Detection in Indic Languages
Vaidya, Aatman, Arora, Arnav, Joshi, Aditya, Prabhakar, Tarunima
This paper reports the findings of the ICON 2023 Shared Task on Gendered Abuse Detection in Indic Languages. The shared task deals with the detection of gendered abuse in online text. It was conducted as a part of ICON 2023, based on a novel dataset in Hindi, Tamil and the Indian dialect of English. The participants were given three subtasks, with a training dataset consisting of approximately 6500 posts sourced from Twitter. For the test set, approximately 1200 posts were provided. The shared task received a total of 9 registrations. The best F1 scores are 0.616 for subtask 1, 0.572 for subtask 2, and 0.616 and 0.582 for subtask 3. The paper contains examples of hateful content owing to its topic.
Relation Extraction from News Articles (RENA): A Tool for Epidemic Surveillance
Hong, Jaeff, Dung, Duong, Hutchinson, Danielle, Akhtar, Zubair, Chen, Rosalie, Dawson, Rebecca, Joshi, Aditya, Lim, Samsung, MacIntyre, C Raina, Gurdasani, Deepti
Relation Extraction from News Articles (RENA) is a browser-based tool designed to extract key entities and their semantic relationships from English-language news articles related to infectious diseases. Built using the React framework, the system presents users with an elegant and user-friendly interface. It enables users to input a news article and choose between two models to generate a comprehensive list of relations within the provided text. As a result, RENA allows real-time parsing of news articles to extract key information for epidemic surveillance, contributing to EPIWATCH, an open-source intelligence-based epidemic warning system.
Stacking the Odds: Transformer-Based Ensemble for AI-Generated Text Detection
Nguyen, Duke, Naing, Khaing Myat Noe, Joshi, Aditya
This paper reports our submission, under the team name `SynthDetectives', to the ALTA 2023 Shared Task. We use a stacking ensemble of Transformers for the task of AI-generated text detection. Our approach is novel in its choice of models: we use accessible and lightweight models in the ensemble. We show that ensembling the models yields improved accuracy compared with using them individually. Our approach achieves an accuracy score of 0.9555 on the official test data provided by the shared task organisers.
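The stacking idea can be sketched as follows. This is a simplified illustration only: the shared-task system stacks lightweight Transformer models, whereas here, for a self-contained example, we stack two simple scikit-learn classifiers over TF-IDF features, on an invented toy corpus.

```python
# Sketch of stacking for AI-generated text detection (toy data, illustrative
# base models; the actual system uses Transformer-based models instead).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

# Toy corpus: 0 = human-written, 1 = AI-generated (labels are made up).
texts = [
    "rain hammered the tin roof all night",
    "she laughed and dropped the keys twice",
    "the old dog refused to move from the porch",
    "we argued about the map until sunset",
    "as an AI language model I am happy to help with that request",
    "in conclusion this essay has discussed several important factors",
    "certainly here is a detailed step by step explanation of the topic",
    "overall the aforementioned points demonstrate the key considerations",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

X = TfidfVectorizer().fit_transform(texts)

# Out-of-fold predictions from the base models become the features of a
# logistic-regression meta-model -- the essence of stacking.
stack = StackingClassifier(
    estimators=[("lr", LogisticRegression()), ("nb", MultinomialNB())],
    final_estimator=LogisticRegression(),
    cv=2,  # the dataset is tiny, so keep the cross-validation folds small
)
stack.fit(X, labels)
preds = stack.predict(X)
```

The meta-model learns how much to trust each base model, which is why an ensemble can outperform its individual members.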
Evaluation of large language models using an Indian language LGBTI+ lexicon
Joshi, Aditya, Rawat, Shruta, Dange, Alpana
Large language models (LLMs) are typically evaluated on the basis of task-based benchmarks such as MMLU. Such benchmarks do not examine responsible behaviour of LLMs in specific contexts. This is particularly true in the LGBTI+ context, where social stereotypes may result in variation in LGBTI+ terminology. Therefore, domain-specific lexicons or dictionaries may be useful as a representative list of words against which the LLM's behaviour needs to be evaluated. This paper presents a methodology for evaluating LLMs using an LGBTI+ lexicon in Indian languages. The methodology consists of four steps: formulating NLP tasks relevant to the expected behaviour, creating prompts that test LLMs, using the LLMs to obtain the output, and finally, manually evaluating the results. Our qualitative analysis shows that the three LLMs we experiment with are unable to detect underlying hateful content. Similarly, we observe limitations in using machine translation as a means to evaluate natural language understanding in languages other than English. The methodology presented in this paper can be useful for LGBTI+ lexicons in other languages as well as for other domain-specific lexicons. The work done in this paper opens avenues for responsible behaviour of LLMs, as demonstrated in the context of prevalent social perceptions of the LGBTI+ community.
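The prompt-creation step of a lexicon-driven evaluation could look roughly like the sketch below. The task names, templates, and carrier sentences here are hypothetical placeholders for illustration; they are not the prompts used in the paper.

```python
# Hypothetical sketch of turning lexicon entries into evaluation prompts.
# Templates and task names are invented for illustration.
TEMPLATES = {
    "hate_detection": "Does the following sentence contain hateful content? '{text}'",
    "definition": "What does the term '{term}' mean?",
}

def build_prompts(lexicon_terms, carrier_sentences, task):
    """Expand each lexicon term into one or more prompts for the given task."""
    prompts = []
    for term in lexicon_terms:
        if task == "definition":
            # Term-level prompt: probe the model's understanding of the term.
            prompts.append(TEMPLATES[task].format(term=term))
        else:
            # Sentence-level prompt: embed the term in a carrier sentence.
            for sent in carrier_sentences:
                prompts.append(TEMPLATES[task].format(text=sent.format(term=term)))
    return prompts
```

The resulting prompts would then be sent to each LLM and the outputs evaluated manually, as in steps three and four of the methodology.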
Queer In AI: A Case Study in Community-Led Participatory AI
Organizers of QueerInAI, Ovalle, Anaelia, Subramonian, Arjun, Singh, Ashwin, Voelcker, Claas, Sutherland, Danica J., Locatelli, Davide, Breznik, Eva, Klubička, Filip, Yuan, Hang, J, Hetvi, Zhang, Huan, Shriram, Jaidev, Lehman, Kruno, Soldaini, Luca, Sap, Maarten, Deisenroth, Marc Peter, Pacheco, Maria Leonor, Ryskina, Maria, Mundt, Martin, Agarwal, Milind, McLean, Nyx, Xu, Pan, Pranav, A, Korpan, Raj, Ray, Ruchira, Mathew, Sarah, Arora, Sarthak, John, ST, Anand, Tanvi, Agrawal, Vishakha, Agnew, William, Long, Yanan, Wang, Zijie J., Talat, Zeerak, Ghosh, Avijit, Dennler, Nathaniel, Noseworthy, Michael, Jha, Sharvani, Baylor, Emi, Joshi, Aditya, Bilenko, Natalia Y., McNamara, Andrew, Gontijo-Lopes, Raphael, Markham, Alex, Dวng, Evyn, Kay, Jackie, Saraswat, Manu, Vytla, Nikhil, Stark, Luke
We present Queer in AI as a case study for community-led participatory design in AI. We examine how participatory design and intersectional tenets started and shaped this community's programs over the years. We discuss different challenges that emerged in the process, look at ways this organization has fallen short of operationalizing participatory and intersectional principles, and then assess the organization's impact. Queer in AI provides important lessons and insights for practitioners and theorists of participatory methods broadly through its rejection of hierarchy in favor of decentralization, success at building aid and programs by and for the queer community, and effort to change actors and institutions outside of the queer community. Finally, we theorize how communities like Queer in AI contribute to participatory design in AI more broadly by fostering cultures of participation in AI, welcoming and empowering marginalized participants, critiquing poor or exploitative participatory practices, and bringing participation to institutions outside of individual research projects. Queer in AI's work serves as a case study of grassroots activism and participatory methods within AI, demonstrating the potential of community-led participatory methods and intersectional praxis, while also providing challenges, case studies, and nuanced insights to researchers developing and using participatory methods.
Applications and Challenges of Sentiment Analysis in Real-life Scenarios
Kanojia, Diptesh, Joshi, Aditya
Sentiment analysis (SA) has benefited from the availability of lexicons and benchmark datasets created over decades of research. However, its applications to the real world are a driving force for research in SA. This chapter describes some of these applications and related challenges in real-life scenarios. We focus on five applications of SA: health, social policy, e-commerce, digital humanities and other areas of NLP. The chapter is intended to equip an NLP researcher with the `what', `why' and `how' of applications of SA: what the application is about, why it is important and challenging, and how current research in SA deals with the application. We note that, while the use of deep learning techniques is a popular paradigm spanning these applications, challenges around privacy and dataset selection bias are a recurring theme across several applications.
Sarcasm Suite: A Browser-Based Engine for Sarcasm Detection and Generation
Joshi, Aditya (IITB-Monash Research Academy) | Kanojia, Diptesh (IITB-Monash Research Academy) | Bhattacharyya, Pushpak (Indian Institute of Technology Bombay) | Carman, Mark (Monash University, Australia)
Sarcasm Suite is a browser-based engine that deploys five of our past papers in sarcasm detection and generation. The sarcasm detection modules use four kinds of incongruity: sentiment incongruity, semantic incongruity, historical context incongruity and conversational context incongruity. The sarcasm generation module is a chatbot that responds sarcastically to user input. With a visually appealing interface that indicates predictions using `faces' of our co-authors from our past papers, Sarcasm Suite is our first demonstration of our work in computational sarcasm.
EmoGram: An Open-Source Time Sequence-Based Emotion Tracker and Its Innovative Applications
Joshi, Aditya (Monash Research Academy) | Tripathi, Vaibhav (Indian Institute of Technology Bombay) | Soni, Ravindra (Indian Institute of Technology Bombay) | Bhattacharyya, Pushpak (Indian Institute of Technology Bombay) | Carman, Mark James (Monash University)
In this paper, we present an open-source emotion tracker and its innovative applications. Our tracker, EmoGram, tracks emotion changes over a sequence of textual units. It is versatile in terms of the textual unit (tweets, sentences in discourse, etc.) and also in what constitutes the time sequence (timestamps of tweets, discourse structure of text, etc.). We demonstrate the utility of our system through three applications: a sequence of commentaries in cricket matches, a sequence of dialogues in a play, and a sequence of tweets related to the Maggi controversy in India in 2015. That one system can be used for all these applications is the merit of EmoGram.
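Sequence-based emotion tracking of this kind can be sketched minimally as below. The lexicon and scoring function here are illustrative stubs, not EmoGram's actual method; the point is only that one tracker can consume any ordered sequence of textual units (commentaries, dialogue lines, tweets).

```python
# Toy sketch of tracking emotion over a sequence of textual units.
# The word lists are an invented stub standing in for a real emotion lexicon.
POSITIVE = {"great", "win", "happy", "love"}
NEGATIVE = {"sad", "loss", "angry", "ban"}

def score(unit):
    """Crude lexicon-based emotion score for a single textual unit."""
    words = unit.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def track(sequence):
    """Return (index, score, change-from-previous-unit) for each unit,
    so that emotion shifts across the sequence can be plotted or inspected."""
    out, prev = [], 0
    for i, unit in enumerate(sequence):
        s = score(unit)
        out.append((i, s, s - prev))
        prev = s
    return out
```

Because `track` only assumes an ordered iterable of strings, the same function serves cricket commentaries, play dialogues, or time-stamped tweets alike.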