 Machine Translation


Introducing Rhetorical Parallelism Detection: A New Task with Datasets, Metrics, and Baselines

arXiv.org Artificial Intelligence

Rhetoric, both spoken and written, involves not only content but also style. One common stylistic tool is parallelism: the juxtaposition of phrases which have the same sequence of linguistic (e.g., phonological, syntactic, semantic) features. Despite the ubiquity of parallelism, the field of natural language processing has seldom investigated it, missing a chance to better understand the nature of the structure, meaning, and intent that humans convey. To address this, we introduce the task of rhetorical parallelism detection. We construct a formal definition of it; we provide one new Latin dataset and one adapted Chinese dataset for it; we establish a family of metrics to evaluate performance on it; and, lastly, we create baseline systems and novel sequence labeling schemes to capture it. On our strictest metric, we attain F1 scores of 0.40 and 0.43 on our Latin and Chinese datasets, respectively.


Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling

arXiv.org Artificial Intelligence

We present GEST -- a new dataset for measuring gender-stereotypical reasoning in masked LMs and English-to-X machine translation systems. GEST contains samples that are compatible with 9 Slavic languages and English for 16 gender stereotypes about men and women (e.g., Women are beautiful, Men are leaders). The definition of said stereotypes was informed by gender experts. We used GEST to evaluate 11 masked LMs and 4 machine translation systems. We discovered significant and consistent amounts of stereotypical reasoning in almost all the evaluated models and languages.


Controlling Pre-trained Language Models for Grade-Specific Text Simplification

arXiv.org Artificial Intelligence

Text simplification (TS) systems rewrite text to make it more readable while preserving its content. However, what makes a text easy to read depends on the intended readers. Recent work has shown that pre-trained language models can simplify text using a wealth of techniques to control output simplicity, ranging from specifying only the desired reading grade level, to directly specifying low-level edit operations. Yet it remains unclear how to set these control parameters in practice. Existing approaches set them at the corpus level, disregarding the complexity of individual inputs and considering only one level of output complexity. In this work, we conduct an empirical study to understand how different control mechanisms impact the adequacy and simplicity of text simplification systems. Based on these insights, we introduce a simple method that predicts the edit operations required for simplifying a text for a specific grade level on an instance-per-instance basis. This approach improves the quality of the simplified outputs over corpus-level search-based heuristics.


Artificial intelligence can now decipher 'world's oldest languages' that were carved into 5,000-year-old stones as fast as Google Translate

Daily Mail - Science & tech

The mysterious dialect of our ancient ancestors could finally be deciphered in full thanks to artificial intelligence. A million cuneiform tablets still exist in the world, experts estimate, but these writings left behind by ancient Mesopotamians require tedious work by archaeologists to translate and catalog their contents. It has been estimated that 90 percent of cuneiform texts remain untranslated. But now, a team of German researchers has figured out a new way to train computers to recognize cuneiform and even make the contents of millennia-old tablets searchable like a website, making it possible to digitize and assemble larger libraries of these ancient texts. This could unlock previously unknown details about ancient life, as the tablets contained details about feats as significant as temple construction, all the way down to squabbles as petty as customer service complaints.


Introduction to Transformers: an NLP Perspective

arXiv.org Artificial Intelligence

Transformers have dominated empirical machine learning models of natural language processing. In this paper, we introduce basic concepts of Transformers and present key techniques that form the recent advances of these models. This includes a description of the standard Transformer architecture, a series of model refinements, and common applications. Given that Transformers and related deep learning techniques might be evolving in ways we have never seen, we cannot dive into all the model details or cover all the technical areas. Instead, we focus on just those concepts that are helpful for gaining a good understanding of Transformers and their variants. We also summarize the key ideas that impact this field, thereby yielding some insights into the strengths and limitations of these models.


Towards A Foundation Model For Trajectory Intelligence

arXiv.org Artificial Intelligence

We present the results of training a large trajectory model using real-world user check-in data. Our approach follows a pre-train and fine-tune paradigm, where a base model is pre-trained via masked trajectory modeling and then adapted through fine-tuning for various downstream tasks. To address challenges posed by noisy data and large spatial vocabularies, we propose a novel spatial tokenization block. Our empirical analysis utilizes a comprehensive dataset of over 2 billion check-ins generated by more than 6 million users. Through fine-tuning on 3 downstream tasks we demonstrate that our base model has effectively learned valuable underlying patterns in raw data, enabling its application in meaningful trajectory intelligence tasks. Despite some limitations, we believe this work represents an important step forward in the realization of a foundation model for trajectory intelligence.


INarIG: Iterative Non-autoregressive Instruct Generation Model For Word-Level Auto Completion

arXiv.org Artificial Intelligence

Computer-aided translation (CAT) aims to enhance human translation efficiency and is still important in scenarios where machine translation cannot meet quality requirements. One fundamental task within this field is Word-Level Auto Completion (WLAC). WLAC predicts a target word given a source sentence, translation context, and a human-typed character sequence. Previous works either employ word classification models to exploit contextual information from both sides of the target word or directly disregard the dependencies from the right-side context. Furthermore, the key information, i.e., human-typed sequences, is only used as prefix constraints in the decoding module. In this paper, we propose the INarIG (Iterative Non-autoregressive Instruct Generation) model, which constructs the human-typed sequence into an Instruction Unit and employs iterative decoding with subwords to fully utilize input information given in the task. Our model is more competent in dealing with low-frequency words (the core scenario of this task), and achieves state-of-the-art results on the WMT22 and benchmark datasets, with a maximum increase of over 10% in prediction accuracy.


Mukhyansh: A Headline Generation Dataset for Indic Languages

arXiv.org Artificial Intelligence

The task of headline generation within the realm of Natural Language Processing (NLP) holds immense significance, as it strives to distill the true essence of textual content into concise and attention-grabbing summaries. While noteworthy progress has been made in headline generation for widely spoken languages like English, there persist numerous challenges when it comes to generating headlines in low-resource languages, such as the rich and diverse Indian languages. A prominent obstacle that specifically hinders headline generation in Indian languages is the scarcity of high-quality annotated data. To address this crucial gap, we proudly present Mukhyansh, an extensive multilingual dataset, tailored for Indian language headline generation. Comprising an impressive collection of over 3.39 million article-headline pairs, Mukhyansh spans across eight prominent Indian languages, namely Telugu, Tamil, Kannada, Malayalam, Hindi, Bengali, Marathi, and Gujarati. We present a comprehensive evaluation of several state-of-the-art baseline models. Additionally, through an empirical analysis of existing works, we demonstrate that Mukhyansh outperforms all other models, achieving an impressive average ROUGE-L score of 31.43 across all 8 languages.


MuLER: Detailed and Scalable Reference-based Evaluation

arXiv.org Artificial Intelligence

We propose a novel methodology (namely, MuLER) that transforms any reference-based evaluation metric for text generation, such as machine translation (MT), into a fine-grained analysis tool. Given a system and a metric, MuLER quantifies how much the chosen metric penalizes specific error types (e.g., errors in translating names of locations). MuLER thus enables a detailed error analysis which can lead to targeted improvement efforts for specific phenomena. We perform experiments in both synthetic and naturalistic settings to support MuLER's validity and showcase its usability in MT evaluation and in other tasks, such as summarization. Analyzing all submissions to WMT in 2014-2020, we find consistent trends. For example, nouns and verbs are among the most frequent POS tags. However, they are among the hardest to translate. Performance on most POS tags improves with overall system performance, but a few are not thus correlated (their identity changes from language to language). Preliminary experiments with summarization reveal similar trends.


Exploring Human-Like Translation Strategy with Large Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated impressive capabilities in general scenarios, exhibiting a level of aptitude that approaches, and in some aspects even surpasses, human-level intelligence. Among their numerous skills, the translation abilities of LLMs have received considerable attention. Compared to typical machine translation, which focuses solely on source-to-target mapping, LLM-based translation can potentially mimic the human translation process, which might take preparatory steps to ensure high-quality translation. This work explores this possibility by proposing the MAPS framework, which stands for Multi-Aspect Prompting and Selection. Specifically, we enable LLMs first to analyze the given source sentence and induce three aspects of translation-related knowledge (keywords, topics, and relevant demonstrations) to guide the final translation process. Moreover, we employ a selection mechanism based on quality estimation to filter out noisy and unhelpful knowledge. Both automatic (3 LLMs x 11 directions x 2 automatic metrics) and human evaluation (preference study and MQM) demonstrate the effectiveness of MAPS. Further analysis shows that by mimicking the human translation process, MAPS reduces various translation errors such as hallucination, ambiguity, mistranslation, awkward style, untranslated text, and omission. Source code is available at https://github.com/zwhe99/MAPS-mt.