Wang, Shun
MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language
Wang, Shun, Zhang, Ge, Wu, Han, Loakman, Tyler, Huang, Wenhao, Lin, Chenghua
Machine Translation (MT) has developed rapidly since the release of Large Language Models, and current MT evaluation is performed by comparison with reference human translations or by predicting quality scores from human-labelled data. However, these mainstream evaluation methods focus mainly on fluency and factual reliability, whilst paying little attention to figurative quality. In this paper, we investigate the figurative quality of MT and propose a set of human evaluation metrics focused on the translation of figurative language. We additionally present a multilingual parallel metaphor corpus generated by post-editing. Our evaluation protocol is designed to estimate four aspects of MT: Metaphorical Equivalence, Emotion, Authenticity, and Quality. In doing so, we observe that translations of figurative expressions display different traits from literal ones.
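The four-aspect protocol above can be sketched as a simple annotation record (a minimal illustration; the field names, 1-5 scale, and aggregation are our assumptions, not the paper's specification):

```python
from dataclasses import dataclass

@dataclass
class MetaphorTranslationRating:
    """One human judgement of a translated metaphorical expression.

    Each aspect is scored on an ordinal scale (assumed here to be 1-5).
    """
    source: str          # metaphorical expression in the source language
    translation: str     # its machine translation
    equivalence: int     # Metaphorical Equivalence: is the metaphor preserved?
    emotion: int         # Emotion: is the emotional colouring carried over?
    authenticity: int    # Authenticity: does it read naturally to native speakers?
    quality: int         # Quality: overall translation quality

    def mean_score(self) -> float:
        """Unweighted average of the four aspect scores."""
        return (self.equivalence + self.emotion
                + self.authenticity + self.quality) / 4.0
```

Keeping the four aspects as separate fields (rather than a single score) is what lets figurative and literal translations be compared per dimension.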
Improving Biomedical Abstractive Summarisation with Knowledge Aggregation from Citation Papers
Tang, Chen, Wang, Shun, Goldsack, Tomas, Lin, Chenghua
Abstracts derived from biomedical literature possess distinct domain-specific characteristics, including specialised writing styles and biomedical terminologies, which necessitate a deep understanding of the related literature. As a result, existing language models struggle to generate technical summaries that are on par with those produced by biomedical experts, given the absence of domain-specific background knowledge. This paper aims to enhance the performance of language models in biomedical abstractive summarisation by aggregating knowledge from external papers cited within the source article. We propose a novel attention-based citation aggregation model that integrates domain-specific knowledge from citation papers, allowing neural networks to generate summaries by leveraging both the paper content and relevant knowledge from citation papers. Furthermore, we construct and release a large-scale biomedical summarisation dataset that serves as a foundation for our research. Extensive experiments demonstrate that our model outperforms state-of-the-art approaches and achieves substantial improvements in abstractive biomedical text summarisation.
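The attention-based aggregation described above can be sketched as a softmax-weighted pooling over citation-paper embeddings (a minimal NumPy sketch under our own assumptions; the actual model learns these representations and attention parameters end to end):

```python
import numpy as np

def aggregate_citations(paper_emb: np.ndarray,
                        citation_embs: np.ndarray) -> np.ndarray:
    """Attention-style aggregation of citation-paper embeddings.

    paper_emb:     (d,)   embedding of the source article
    citation_embs: (n, d) embeddings of its cited papers

    Returns a (d,) context vector: a softmax-weighted sum of the
    citation embeddings, weighted by their dot-product relevance
    to the source article.
    """
    scores = citation_embs @ paper_emb      # (n,) relevance of each citation
    scores = scores - scores.max()          # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ citation_embs          # (d,) weighted sum
```

A summariser can then condition on both the paper content and this aggregated citation vector, so that domain knowledge absent from the source article still reaches the decoder.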
Metaphor Detection via Explicit Basic Meanings Modelling
Li, Yucheng, Wang, Shun, Lin, Chenghua, Guerin, Frank
One noticeable trend in metaphor detection is the embrace of linguistic theories such as the Metaphor Identification Procedure (MIP) for model architecture design. While MIP clearly states that the metaphoricity of a lexical unit is determined by the contrast between its contextual meaning and its basic meaning, existing work does not strictly follow this principle, typically using the aggregated meaning to approximate the basic meaning of target words. In this paper, we propose a novel metaphor detection method which models the basic meaning of a word based on literal annotations from the training set, and then compares this with the contextual meaning in a target sentence to identify metaphors. Empirical results show that our method significantly outperforms the state-of-the-art method by 1.0% in F1 score. Moreover, our performance even reaches the theoretical upper bound on the VUA18 benchmark for targets with basic annotations, which demonstrates the importance of modelling basic meanings for metaphor detection.
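The MIP-style contrast described above can be sketched as follows (a hypothetical illustration under our own assumptions: the basic meaning is estimated as the mean embedding of literal-annotated usages, and the threshold value is ours, not the paper's):

```python
import numpy as np

def is_metaphorical(contextual: np.ndarray,
                    literal_usages: np.ndarray,
                    threshold: float = 0.5) -> bool:
    """Flag a target word as metaphorical when its contextual embedding
    diverges from its basic meaning, estimated here as the mean embedding
    of the word's literal-annotated occurrences in the training set.

    contextual:     (d,)   embedding of the word in the target sentence
    literal_usages: (n, d) embeddings of literal-annotated occurrences
    threshold:      cosine-similarity cut-off (illustrative value)
    """
    basic = literal_usages.mean(axis=0)
    cos = contextual @ basic / (np.linalg.norm(contextual) * np.linalg.norm(basic))
    return bool(cos < threshold)  # low similarity => contextual meaning contrasts with basic meaning
```

The key point is that the basic meaning is built only from literal annotations, rather than from an aggregate over all usages that would mix metaphorical senses into the reference.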
Metaphor Detection with Effective Context Denoising
Wang, Shun, Li, Yucheng, Lin, Chenghua, Barrault, Loïc, Guerin, Frank
Metaphor is a pervasive linguistic device which attracts attention from both psycholinguistics and computational linguistics due to the key role it plays in the cognitive and communicative functions of language (Wilks, 1978; Lakoff and Johnson, 1980; Lakoff, 1993). Linguistically, metaphor is defined as a figurative expression that uses one or several words to represent another concept given the context, rather than taking the literal meaning. Some recent efforts (Le et al., 2020; Song et al., 2021a) attempt to improve context modelling by explicitly leveraging the syntactic structure (e.g., dependency parse tree) of a sentence in order to capture important context words, where the parse trees are typically encoded with graph convolutional neural networks. MelBERT (Choi et al., 2021) employs a simple chunking method which separates sub-sentences by commas.
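The comma-based chunking heuristic mentioned above can be sketched in a few lines (a minimal illustration; the function name is ours, not MelBERT's):

```python
def comma_chunks(sentence: str) -> list[str]:
    """Split a sentence into sub-sentence chunks at commas, the simple
    context-separation heuristic used as an alternative to parse-tree
    encoding."""
    return [chunk.strip() for chunk in sentence.split(",") if chunk.strip()]
```

Compared with dependency-parse approaches, such a heuristic needs no parser, at the cost of ignoring syntactic structure within each chunk.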
FrameBERT: Conceptual Metaphor Detection with Frame Embedding Learning
Li, Yucheng, Wang, Shun, Lin, Chenghua, Guerin, Frank, Barrault, Loïc
In this paper, we propose FrameBERT, a RoBERTa-based model that can explicitly learn and incorporate FrameNet embeddings for concept-level metaphor detection. FrameBERT not only achieves performance better than or comparable to the state of the art, but is also more explainable and interpretable than existing models, owing to its ability to account for external knowledge from FrameNet.
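Incorporating frame knowledge at the token level can be sketched as a simple fusion step (a hypothetical NumPy illustration of concatenation plus projection; the projection matrix stands in for parameters that FrameBERT would learn during training):

```python
import numpy as np

def fuse_frame_embedding(token_emb: np.ndarray,
                         frame_emb: np.ndarray,
                         w: np.ndarray) -> np.ndarray:
    """Fuse a token's contextual embedding with its FrameNet frame
    embedding by concatenation followed by a linear projection.

    token_emb: (d1,)        contextual embedding of the token
    frame_emb: (d2,)        embedding of the token's evoked frame
    w:         (d, d1+d2)   projection matrix (illustrative stand-in
                            for learned parameters)
    """
    fused = np.concatenate([token_emb, frame_emb])  # (d1 + d2,)
    return w @ fused                                # (d,) fused representation
```

Because the frame embedding enters the representation explicitly, a prediction can be traced back to the conceptual frame that influenced it, which is the source of the model's interpretability.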