Collaborating Authors

 Bhatia, Archna


Overview of MWE history, challenges, and horizons: standing at the 20th anniversary of the MWE workshop series via MWE-UD2024

arXiv.org Artificial Intelligence

This year, the joint MWE-UD workshop, co-located with the LREC-COLING 2024 conference, marked the 20th anniversary of the MWE workshop series, which began in 2003 when the first MWE workshop was held with ACL in Sapporo, Japan. Standing at this milestone, we look back on the workshop series and summarise the research topics and methodologies that researchers have pursued over the years. We also discuss the current challenges and the broader impacts and synergies of MWE research within the CL and NLP fields. Finally, we give future research perspectives. We hope this position paper can help researchers, students, and industrial practitioners interested in MWE gain a brief, accessible understanding of its history, current state, and possible future.


UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies

arXiv.org Artificial Intelligence

The Universal Dependencies (UD) project has created an invaluable collection of treebanks with contributions in over 140 languages. However, the UD annotations do not tell the full story. Grammatical constructions that convey meaning through a particular combination of several morphosyntactic elements -- for example, interrogative sentences with special markers and/or word orders -- are not labeled holistically. We argue for (i) augmenting UD annotations with a 'UCxn' annotation layer for such meaning-bearing grammatical constructions, and (ii) approaching this in a typologically informed way so that morphosyntactic strategies can be compared across languages. As a case study, we consider five construction families in ten languages, identifying instances of each construction in UD treebanks through the use of morphosyntactic patterns. In addition to findings regarding these particular constructions, our study yields important insights on methodology for describing and identifying constructions in language-general and language-particular ways, and lays the foundation for future constructional enrichment of UD treebanks.


Event Nugget Detection and Argument Extraction with DISCERN

AAAI Conferences

This paper addresses the problem of detecting information about events from unstructured text. An event-detection system, DISCERN, is presented; its three variants, DISCERN-R (rule-based), DISCERN-ML (machine-learned), and DISCERN-C (combined), were evaluated in the NIST TAC KBP 2015 Event Nugget Detection and Event Argument Extraction and Linking tasks. Three contributions of this work are: (a) an approach to collapsing support verb and event nominals that improved recall of argument linking, (b) a new linguist-in-the-loop paradigm that enables quick changes to linguistic rules and examination of their effect on precision and recall at runtime, and (c) an analysis of the synergy between the semantic and syntactic features. Experiments with event-detection approaches indicate that linguistically informed rules can improve precision and machine-learned systems can improve recall. Future refinements to the combination of linguistic and machine learning approaches may involve making better use of the complementarity of these approaches.