AITopics

The advances in automatic sign language translation (SLT) to spoken languages have been mostly benchmarked with datasets of limited size and restricted domains. Our work advances the state of the art by providing the first baseline results on How2Sign, a large and broad dataset. We train a Transformer over I3D video features, using the reduced BLEU as a reference metric for validation, instead of the widely used BLEU score. We report a result of 8.03 on the BLEU score, and publish the first open-source implementation of its kind to promote further advances.

machine learning, natural language, translation, (18 more...)

2304.06371

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > Spain (0.04)
Europe > Belgium (0.04)
Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.04)

Genre:

Research Report > New Finding (0.46)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Education > Educational Technology (0.86)
Education > Curriculum > Subject-Specific Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Thulke, David, Daheim, Nico, Dugast, Christian, Ney, Hermann

Task-oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9 and DSTC10

This paper summarizes our contributions to the document-grounded dialog tasks at the 9th and 10th Dialog System Technology Challenges (DSTC9 and DSTC10). In both iterations the task consists of three subtasks: first detect whether the current turn is knowledge seeking, second select a relevant knowledge document, and third generate a response grounded on the selected document. For DSTC9 we proposed different approaches to make the selection task more efficient. The best method, Hierarchical Selection, actually improves the results compared to the original baseline and gives a speedup of 24x. In the DSTC10 iteration of the task, the challenge was to adapt systems trained on written dialogs to perform well on noisy automatic speech recognition transcripts. Therefore, we proposed data augmentation techniques to increase the robustness of the models as well as methods to adapt the style of generated responses to fit well into the proceeding dialog. Additionally, we proposed a noisy channel model that allows for increasing the factuality of the generated responses. In addition to summarizing our previous contributions, in this work, we also report on a few small improvements and reconsider the automatic evaluation metrics for the generation task which have shown a low correlation to human judgments.

computational linguistic, machine learning, natural language, (19 more...)

2304.07101

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
South America > Colombia (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
(2 more...)

Gundersen, Odd Erik, Coakley, Kevin, Kirkpatrick, Christine, Gil, Yolanda

Sources of Irreproducibility in Machine Learning: A Review

Background: Many published machine learning studies are irreproducible. Issues with methodology and not properly accounting for variation introduced by the algorithm themselves or their implementations are attributed as the main contributors to the irreproducibility.Problem: There exist no theoretical framework that relates experiment design choices to potential effects on the conclusions. Without such a framework, it is much harder for practitioners and researchers to evaluate experiment results and describe the limitations of experiments. The lack of such a framework also makes it harder for independent researchers to systematically attribute the causes of failed reproducibility experiments. Objective: The objective of this paper is to develop a framework that enable applied data science practitioners and researchers to understand which experiment design choices can lead to false findings and how and by this help in analyzing the conclusions of reproducibility experiments. Method: We have compiled an extensive list of factors reported in the literature that can lead to machine learning studies being irreproducible. These factors are organized and categorized in a reproducibility framework motivated by the stages of the scientific method. The factors are analyzed for how they can affect the conclusions drawn from experiments. A model comparison study is used as an example. Conclusion: We provide a framework that describes machine learning methodology from experimental design decisions to the conclusions inferred from them.

experiment, machine learning, natural language, (19 more...)

2204.0761

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(4 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)

Salemi, Alireza, Abaskohi, Amirhossein, Tavakoli, Sara, Yaghoobzadeh, Yadollah, Shakery, Azadeh

PEACH: Pre-Training Sequence-to-Sequence Multilingual Models for Translation with Semi-Supervised Pseudo-Parallel Document Generation

Multilingual pre-training significantly improves many multilingual NLP tasks, including machine translation. Most existing methods are based on some variants of masked language modeling and text-denoising objectives on monolingual data. Multilingual pre-training on monolingual data ignores the availability of parallel data in many language pairs. Also, some other works integrate the available human-generated parallel translation data in their pre-training. This kind of parallel data is definitely helpful, but it is limited even in high-resource language pairs. This paper introduces a novel semi-supervised method, SPDG, that generates high-quality pseudo-parallel data for multilingual pre-training. First, a denoising model is pre-trained on monolingual data to reorder, add, remove, and substitute words, enhancing the pre-training documents' quality. Then, we generate different pseudo-translations for each pre-training document using dictionaries for word-by-word translation and applying the pre-trained denoising model. The resulting pseudo-parallel data is then used to pre-train our multilingual sequence-to-sequence model, PEACH. Our experiments show that PEACH outperforms existing approaches used in training mT5 and mBART on various translation tasks, including supervised, zero- and few-shot scenarios. Moreover, PEACH's ability to transfer knowledge between similar languages makes it particularly useful for low-resource languages. Our results demonstrate that with high-quality dictionaries for generating accurate pseudo-parallel, PEACH can be valuable for low-resource languages.

objective, peach, translation, (15 more...)

2304.01282

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
(11 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Liguori, Pietro, Improta, Cristina, Natella, Roberto, Cukic, Bojan, Cotroneo, Domenico

Who Evaluates the Evaluators? On Automatic Metrics for Assessing AI-based Offensive Code Generators

arXiv.org Artificial IntelligenceApr-13-2023

AI-based code generators are an emerging solution for automatically writing programs starting from descriptions in natural language, by using deep neural networks (Neural Machine Translation, NMT). In particular, code generators have been used for ethical hacking and offensive security testing by generating proof-of-concept attacks. Unfortunately, the evaluation of code generators still faces several issues. The current practice uses output similarity metrics, i.e., automatic metrics that compute the textual similarity of generated code with ground-truth references. However, it is not clear what metric to use, and which metric is most suitable for specific contexts. This work analyzes a large set of output similarity metrics on offensive code generators. We apply the metrics on two state-of-the-art NMT models using two datasets containing offensive assembly and Python code with their descriptions in the English language. We compare the estimates from the automatic metrics with human evaluation and provide practical insights into their strengths and limitations.

machine learning, natural language, programming language, (19 more...)

doi: 10.1016/j.eswa.2023.120073

2212.06008

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Singapore > Central Region > Singapore (0.04)
(16 more...)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceApr-13-2023

PePe: Personalized Post-editing Model utilizing User-generated Post-edits

Lee, Jihyeon, Kim, Taehee, Tae, Yunwon, Park, Cheonbok, Choo, Jaegul

Incorporating personal preference is crucial in advanced machine translation tasks. Despite the recent advancement of machine translation, it remains a demanding task to properly reflect personal style. In this paper, we introduce a personalized automatic post-editing framework to address this challenge, which effectively generates sentences considering distinct Figure 1: Example of a personal post-editing triplet personal behaviors. To build this framework, (i.e., source (src), machine translation (mt), and postedit we first collect post-editing data that connotes (pe)) given the source text in English and the translated the user preference from a live machine translation text in Korean. A post-edited sentence does not system. Specifically, real-world users enter only contain error correction of an initial machine translation source sentences for translation and edit result but also reflects individual preference. For the machine-translated outputs according to instance, a human post-editor modifies the word "primarily" the user's preferred style. We then propose to "primary," but also change " 공헌 " to its synonym a model that combines a discriminator module " 기여 " while keeping the rest as it is (e.g., "research").

artificial intelligence, natural language, translation, (18 more...)

2209.10139

Country:

North America > United States (0.14)
Asia > Singapore (0.04)
Asia > Malaysia (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Learning Homographic Disambiguation Representation for Neural Machine Translation

Wang, Weixuan, Peng, Wei, Liu, Qun

Homographs, words with the same spelling but different meanings, remain challenging in Neural Machine Translation (NMT). While recent works leverage various word embedding approaches to differentiate word sense in NMT, they do not focus on the pivotal components in resolving ambiguities of homographs in NMT: the hidden states of an encoder. In this paper, we propose a novel approach to tackle homographic issues of NMT in the latent space. We first train an encoder (aka "HDR-encoder") to learn universal sentence representations in a natural language inference (NLI) task. We further fine-tune the encoder using homograph-based synset sentences from WordNet, enabling it to learn word-level homographic disambiguation representations (HDR). The pre-trained HDR-encoder is subsequently integrated with a transformer-based NMT in various schemes to improve translation accuracy. Experiments on four translation directions demonstrate the effectiveness of the proposed method in enhancing the performance of NMT systems in the BLEU scores (up to +2.3 compared to a solid baseline). The effects can be verified by other metrics (F1, precision, and recall) of translation accuracy in an additional disambiguation task. Visualization methods like heatmaps, T-SNE and translation examples are also utilized to demonstrate the effects of the proposed method.

artificial intelligence, machine translation, natural language, (15 more...)

2304.0586

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > Czechia > Prague (0.04)
Asia > North Korea (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Robertson, Samantha, Wang, Zijie J., Moritz, Dominik, Kery, Mary Beth, Hohman, Fred

Angler: Helping Machine Translation Practitioners Prioritize Model Improvements

Machine learning (ML) models can fail in unexpected ways in the real world, but not all model failures are equal. With finite time and resources, ML practitioners are forced to prioritize their model debugging and improvement efforts. Through interviews with 13 ML practitioners at Apple, we found that practitioners construct small targeted test sets to estimate an error's nature, scope, and impact on users. We built on this insight in a case study with machine translation models, and developed Angler, an interactive visual analytics tool to help practitioners prioritize model improvements. In a user study with 7 machine translation experts, we used Angler to understand prioritization practices when the input space is infinite, and obtaining reliable signals of model quality is expensive. Our study revealed that participants could form more interesting and user-focused hypotheses for prioritization by analyzing quantitative summary statistics and qualitatively assessing data by reading sentences.

data mining, machine learning, natural language, (18 more...)

doi: 10.1145/3544548.3580790

2304.05967

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Germany > Hamburg (0.05)
Europe > Czechia > Prague (0.04)
(11 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.67)
Research Report > Promising Solution (0.46)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages

Winata, Genta Indra, Aji, Alham Fikri, Cahyawijaya, Samuel, Mahendra, Rahmad, Koto, Fajri, Romadhony, Ade, Kurniawan, Kemal, Moeljadi, David, Prasojo, Radityo Eko, Fung, Pascale, Baldwin, Timothy, Lau, Jey Han, Sennrich, Rico, Ruder, Sebastian

Natural language processing (NLP) has a significant impact on society via technologies such as machine translation and search engines. Despite its success, NLP technology is only widely available for high-resource languages such as English and Chinese, while it remains inaccessible to many languages due to the unavailability of data resources and benchmarks. In this work, we focus on developing resources for languages in Indonesia. Despite being the second most linguistically diverse country, most languages in Indonesia are categorized as endangered and some are even extinct. We develop the first-ever parallel resource for 10 low-resource languages in Indonesia. Our resource includes datasets, a multi-task benchmark, and lexicons, as well as a parallel Indonesian-English dataset. We provide extensive analyses and describe the challenges when creating such resources. We hope that our work can spark NLP research on Indonesian and other underrepresented languages.

computational linguistic, machine learning, natural language, (19 more...)

2205.1596

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Germany > Saxony > Leipzig (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(30 more...)

Genre: Research Report (0.82)

Industry: Education > Educational Setting (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Sharma, Mandar, Gogineni, Ajay, Ramakrishnan, Naren

Innovations in Neural Data-to-text Generation: A Survey

The neural boom that has sparked natural language processing (NLP) research through the last decade has similarly led to significant innovations in data-to-text generation (DTG). This survey offers a consolidated view into the neural DTG paradigm with a structured examination of the approaches, benchmark datasets, and evaluation protocols. This survey draws boundaries separating DTG from the rest of the natural language generation (NLG) landscape, encompassing an up-to-date synthesis of the literature, and highlighting the stages of technological adoption from within and outside the greater NLG umbrella. With this holistic view, we highlight promising avenues for DTG research that not only focus on the design of linguistically capable systems but also systems that exhibit fairness and accountability.

large language model, machine learning, natural language, (18 more...)

2207.12571

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Kansas (0.04)
(7 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment > Sports (1.00)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(7 more...)