AITopics

2312.001

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
(22 more...)

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

Pikuliak, Matúš, Hrckova, Andrea, Oresko, Stefan, Šimko, Marián

Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling

arXiv.org Artificial IntelligenceNov-30-2023

We present GEST -- a new dataset for measuring gender-stereotypical reasoning in masked LMs and English-to-X machine translation systems. GEST contains samples that are compatible with 9 Slavic languages and English for 16 gender stereotypes about men and women (e.g., Women are beautiful, Men are leaders). The definition of said stereotypes was informed by gender experts. We used GEST to evaluate 11 masked LMs and 4 machine translation systems. We discovered significant and consistent amounts of stereotypical reasoning in almost all the evaluated models and languages.

computational linguistic, gender, stereotype, (14 more...)

2311.18711

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Agrawal, Sweta, Carpuat, Marine

Controlling Pre-trained Language Models for Grade-Specific Text Simplification

arXiv.org Artificial IntelligenceNov-30-2023

Text simplification (TS) systems rewrite text to make it more readable while preserving its content. However, what makes a text easy to read depends on the intended readers. Recent work has shown that pre-trained language models can simplify text using a wealth of techniques to control output simplicity, ranging from specifying only the desired reading grade level, to directly specifying low-level edit operations. Yet it remains unclear how to set these control parameters in practice. Existing approaches set them at the corpus level, disregarding the complexity of individual inputs and considering only one level of output complexity. In this work, we conduct an empirical study to understand how different control mechanisms impact the adequacy and simplicity of text simplification systems. Based on these insights, we introduce a simple method that predicts the edit operations required for simplifying a text for a specific grade level on an instance-per-instance basis. This approach improves the quality of the simplified outputs over corpus-level search-based heuristics.

computational linguistic, control token, simplification, (14 more...)

2305.14993

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)
North America > Mexico (0.04)
North America > United States > Maryland (0.04)
(11 more...)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.66)

Daily Mail - Science & techNov-29-2023, 19:06:16 GMT

Artificial intelligence can now decipher 'world's oldest languages' that were carved into 5,000-year-old stones as fast as Google translate

The mysterious dialect of our ancient ancestors could finally be deciphered in full thanks to artificial intelligence. A million cuneiform tablets still exist in the world, experts estimate, but these writings left behind by ancient Mesopotamians require tedious work by archaeologists to translate and catalog their contents. It has been estimated that 90 percent of cuneiform texts remain untranslated. But now, a team of German researchers has figured out a new way to train computers to recognize cuneiform and even make the contents of millennia-old tablets searchable like a website, making it possible to digitize and assemble larger libraries of these ancient texts. This could unlock previously unknown details about ancient life, as the tablets contained details about feats as significant as temple construction, all the way down to squabbles as petty as customer service complaints.

000-year-old stone as fast, cuneiform tablet, tablet, (13 more...)

Daily Mail - Science & tech

Country:

Asia > Middle East > Iraq (0.06)
Europe > Germany (0.05)
Asia > Middle East > Syria (0.05)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.40)

Introduction to Transformers: an NLP Perspective

Xiao, Tong, Zhu, Jingbo

Transformers have dominated empirical machine learning models of natural language processing. In this paper, we introduce basic concepts of Transformers and present key techniques that form the recent advances of these models. This includes a description of the standard Transformer architecture, a series of model refinements, and common applications. Given that Transformers and related deep learning techniques might be evolving in ways we have never seen, we cannot dive into all the model details or cover all the technical areas. Instead, we focus on just those concepts that are helpful for gaining a good understanding of Transformers and their variants. We also summarize the key ideas that impact this field, thereby yielding some insights into the strengths and limitations of these models.

deep neural network, information processing system, self-attention model, (14 more...)

2311.17633

Country:

Asia > China > Liaoning Province > Shenyang (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Towards A Foundation Model For Trajectory Intelligence

Najjar, Alameen

Abstract--We present the results of training a large trajectory model using real-world user check-in data. Our approach follows a pre-train and fine-tune paradigm, where a base model is pre-trained via masked trajectory modeling and then adapted through fine-tuning for various downstream tasks. To address challenges posed by noisy data and large spatial vocabularies, we propose a novel spatial tokenization block. Our empirical analysis utilizes a comprehensive dataset of over 2 billion checkins generated by more than 6 million users. Through fine-tuning on 3 downstream tasks we demonstrate that our base model has effectively learned valuable underlying patterns in raw data, enabling its application in meaningful trajectory intelligence tasks. Despite some limitations, we believe this work represents an important step forward in the realization of a foundation model for trajectory intelligence.

fine-tuning, trajectory, trajectory data, (15 more...)

2312.00076

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.89)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

INarIG: Iterative Non-autoregressive Instruct Generation Model For Word-Level Auto Completion

Shang, Hengchao, Li, Zongyao, Wei, Daimeng, Guo, Jiaxin, Wang, Minghan, Chen, Xiaoyu, Lei, Lizhi, Yang, Hao

Computer-aided translation (CAT) aims to enhance human translation efficiency and is still important in scenarios where machine translation cannot meet quality requirements. One fundamental task within this field is Word-Level Auto Completion (WLAC). WLAC predicts a target word given a source sentence, translation context, and a human typed character sequence. Previous works either employ word classification models to exploit contextual information from both sides of the target word or directly disregarded the dependencies from the right-side context. Furthermore, the key information, i.e. human typed sequences, is only used as prefix constraints in the decoding module. In this paper, we propose the INarIG (Iterative Non-autoregressive Instruct Generation) model, which constructs the human typed sequence into Instruction Unit and employs iterative decoding with subwords to fully utilize input information given in the task. Our model is more competent in dealing with low-frequency words (core scenario of this task), and achieves state-of-the-art results on the WMT22 and benchmark datasets, with a maximum increase of over 10% prediction accuracy.

proceedings, target word, translation, (15 more...)

2311.182

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(6 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Madasu, Lokesh, Kanumolu, Gopichand, Surange, Nirmal, Shrivastava, Manish

Mukhyansh: A Headline Generation Dataset for Indic Languages

The task of headline generation within the realm of Natural Language Processing (NLP) holds immense significance, as it strives to distill the true essence of textual content into concise and attention-grabbing summaries. While noteworthy progress has been made in headline generation for widely spoken languages like English, there persist numerous challenges when it comes to generating headlines in low-resource languages, such as the rich and diverse Indian languages. A prominent obstacle that specifically hinders headline generation in Indian languages is the scarcity of high-quality annotated data. To address this crucial gap, we proudly present Mukhyansh, an extensive multilingual dataset, tailored for Indian language headline generation. Comprising an impressive collection of over 3.39 million article-headline pairs, Mukhyansh spans across eight prominent Indian languages, namely Telugu, Tamil, Kannada, Malayalam, Hindi, Bengali, Marathi, and Gujarati. We present a comprehensive evaluation of several state-of-the-art baseline models. Additionally, through an empirical analysis of existing works, we demonstrate that Mukhyansh outperforms all other models, achieving an impressive average ROUGE-L score of 31.43 across all 8 languages.

dataset, mukhyansh, transliteration, (15 more...)

2311.17743

Country:

Asia > Japan (0.04)
Asia > India > Uttar Pradesh (0.04)
North America > Canada > Quebec > Montreal (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

MuLER: Detailed and Scalable Reference-based Evaluation

Karidi, Taelin, Choshen, Leshem, Patel, Gal, Abend, Omri

We propose a novel methodology (namely, MuLER) that transforms any reference-based evaluation metric for text generation, such as machine translation (MT) into a fine-grained analysis tool. Given a system and a metric, MuLER quantifies how much the chosen metric penalizes specific error types (e.g., errors in translating names of locations). MuLER thus enables a detailed error analysis which can lead to targeted improvement efforts for specific phenomena. We perform experiments in both synthetic and naturalistic settings to support MuLER's validity and showcase its usability in MT evaluation, and other tasks, such as summarization. Analyzing all submissions to WMT in 2014-2020, we find consistent trends. For example, nouns and verbs are among the most frequent POS tags. However, they are among the hardest to translate. Performance on most POS tags improves with overall system performance, but a few are not thus correlated (their identity changes from language to language). Preliminary experiments with summarization reveal similar trends.

computational linguistic, muler, translation, (14 more...)

2305.14991

Country:

Europe > Finland > Lapland (0.05)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(12 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Exploring Human-Like Translation Strategy with Large Language Models

He, Zhiwei, Liang, Tian, Jiao, Wenxiang, Zhang, Zhuosheng, Yang, Yujiu, Wang, Rui, Tu, Zhaopeng, Shi, Shuming, Wang, Xing

Large language models (LLMs) have demonstrated impressive capabilities in general scenarios, exhibiting a level of aptitude that approaches, in some aspects even surpasses, human-level intelligence. Among their numerous skills, the translation abilities of LLMs have received considerable attention. Compared to typical machine translation that focuses solely on source-to-target mapping, LLM-based translation can potentially mimic the human translation process which might take preparatory steps to ensure high-quality translation. This work explores this possibility by proposing the MAPS framework, which stands for Multi-Aspect Prompting and Selection. Specifically, we enable LLMs first to analyze the given source sentence and induce three aspects of translation-related knowledge: keywords, topics, and relevant demonstrations to guide the final translation process. Moreover, we employ a selection mechanism based on quality estimation to filter out noisy and unhelpful knowledge. Both automatic (3 LLMs x 11 directions x 2 automatic metrics) and human evaluation (preference study and MQM) demonstrate the effectiveness of MAPS. Further analysis shows that by mimicking the human translation process, MAPS reduces various translation errors such as hallucination, ambiguity, mistranslation, awkward style, untranslated text, and omission. Source code is available at https://github.com/zwhe99/MAPS-mt.

arxiv preprint, knowledge, translation, (14 more...)

2305.04118

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > China > Shanghai > Shanghai (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)