AITopics

2303.00969

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Artificial IntelligenceMar-11-2023

Named Entity Detection and Injection for Direct Speech Translation

Gaido, Marco, Tang, Yun, Kulikov, Ilia, Huang, Rongqing, Gong, Hongyu, Inaguma, Hirofumi

In a sentence, certain words are critical for its semantic. Among them, named entities (NEs) are notoriously challenging for neural models. Despite their importance, their accurate handling has been neglected in speech-to-text (S2T) translation research, and recent work has shown that S2T models perform poorly for locations and notably person names, whose spelling is challenging unless known in advance. In this work, we explore how to leverage dictionaries of NEs known to likely appear in a given context to improve S2T model outputs. Our experiments show that we can reliably detect NEs likely present in an utterance starting from S2T encoder outputs. Indeed, we demonstrate that the current detection quality is sufficient to improve NE accuracy in the translation with a 31% reduction in person name errors.

artificial intelligence, natural language, utterance, (18 more...)

2210.11981

Country:

Asia > Malaysia (0.28)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
Europe > Portugal > Lisbon > Lisbon (0.04)
(11 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Friedrich, Annemarie, Xue, Nianwen, Palmer, Alexis

A Kind Introduction to Lexical and Grammatical Aspect, with a Survey of Computational Approaches

arXiv.org Artificial IntelligenceMar-10-2023

Aspectual meaning refers to how the internal temporal structure of situations is presented. This includes whether a situation is described as a state or as an event, whether the situation is finished or ongoing, and whether it is viewed as a whole or with a focus on a particular phase. This survey gives an overview of computational approaches to modeling lexical and grammatical aspect along with intuitive explanations of the necessary linguistic concepts and terminology. In particular, we describe the concepts of stativity, telicity, habituality, perfective and imperfective, as well as influential inventories of eventuality and situation types. We argue that because aspect is a crucial component of semantics, especially when it comes to reporting the temporal structure of situations in a precise way, future NLP approaches need to be able to handle and evaluate it systematically in order to achieve human-level language understanding.

computational linguistic, large language model, machine learning, (23 more...)

2208.09012

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(37 more...)

Genre: Overview (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(7 more...)

Ando, Kenichiro, Komachi, Mamoru, Okumura, Takashi, Horiguchi, Hiromasa, Matsumoto, Yuji

Is In-hospital Meta-information Useful for Abstractive Discharge Summary Generation?

arXiv.org Artificial IntelligenceMar-10-2023

During the patient's hospitalization, the physician must record daily observations of the patient and summarize them into a brief document called "discharge summary" when the patient is discharged. Automated generation of discharge summary can greatly relieve the physicians' burden, and has been addressed recently in the research community. Most previous studies of discharge summary generation using the sequence-to-sequence architecture focus on only inpatient notes for input. However, electric health records (EHR) also have rich structured metadata (e.g., hospital, physician, disease, length of stay, etc.) that might be useful. This paper investigates the effectiveness of medical meta-information for summarization tasks. We obtain four types of meta-information from the EHR systems and encode each meta-information into a sequence-to-sequence model. Using Japanese EHRs, meta-information encoded models increased ROUGE-1 by up to 4.45 points and BERTScore by 3.77 points over the vanilla Longformer. Also, we found that the encoded meta-information improves the precisions of its related terms in the outputs. Our results showed the benefit of the use of medical meta-information.

artificial intelligence, machine learning, natural language, (18 more...)

doi: 10.1109/TAAI57707.2022.00034

2303.06002

Country: Asia > Japan (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Health Care Technology > Medical Record (0.90)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

#artificialintelligenceMar-9-2023, 22:38:44 GMT

XLM -- Enhancing BERT for Cross-lingual Language Model

Attention models, and BERT in particular, have achieved promising results in Natural Language Processing, in both classification and translation tasks. A new paper by Facebook AI, named XLM, presents an improved version of BERT to achieve state-of-the-art results in both types of tasks. XLM uses a known pre-processing technique (BPE) and a dual-language training mechanism with BERT in order to learn relations between words in different languages. The model outperforms other models in a cross-lingual classification task (sentence entailment in 15 languages) and significantly improves machine translation when a pre-trained model is used for initialization of the translation model. XLM is based on several key concepts: Transformers, invented in 2017, introduced an attention mechanism that processes the entire text input simultaneously to learn contextual relations between words (or sub-words).

bert, transformer, translation model, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.51)

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

Cheng, Xize, Li, Linjun, Jin, Tao, Huang, Rongjie, Lin, Wang, Wang, Zehan, Liu, Huangdai, Wang, Ye, Yin, Aoxiong, Zhao, Zhou

Multi-media communications facilitate global interaction among people. However, despite researchers exploring cross-lingual translation techniques such as machine translation and audio speech translation to overcome language barriers, there is still a shortage of cross-lingual studies on visual speech. This lack of research is mainly due to the absence of datasets containing visual speech and translated text pairs. In this paper, we present \textbf{AVMuST-TED}, the first dataset for \textbf{A}udio-\textbf{V}isual \textbf{Mu}ltilingual \textbf{S}peech \textbf{T}ranslation, derived from \textbf{TED} talks. Nonetheless, visual speech is not as distinguishable as audio speech, making it difficult to develop a mapping from source speech phonemes to the target language text. To address this issue, we propose MixSpeech, a cross-modality self-learning framework that utilizes audio speech to regularize the training of visual speech tasks. To further minimize the cross-modality gap and its impact on knowledge transfer, we suggest adopting mixed speech, which is created by interpolating audio and visual streams, along with a curriculum learning strategy to adjust the mixing ratio as needed. MixSpeech enhances speech translation in noisy environments, improving BLEU scores for four languages on AVMuST-TED by +1.4 to +4.2. Moreover, it achieves state-of-the-art performance in lip reading on CMLR (11.1\%), LRS2 (25.5\%), and LRS3 (28.0\%).

artificial intelligence, machine translation, natural language, (17 more...)

2303.05309

Country:

Europe > Belgium (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (0.48)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Syntax and Domain Aware Model for Unsupervised Program Translation

Liu, Fang, Li, Jia, Zhang, Li

There is growing interest in software migration as the development of software and society. Manually migrating projects between languages is error-prone and expensive. In recent years, researchers have begun to explore automatic program translation using supervised deep learning techniques by learning from large-scale parallel code corpus. However, parallel resources are scarce in the programming language domain, and it is costly to collect bilingual data manually. To address this issue, several unsupervised programming translation systems are proposed. However, these systems still rely on huge monolingual source code to train, which is very expensive. Besides, these models cannot perform well for translating the languages that are not seen during the pre-training procedure. In this paper, we propose SDA-Trans, a syntax and domain-aware model for program translation, which leverages the syntax structure and domain knowledge to enhance the cross-lingual transfer ability. SDA-Trans adopts unsupervised training on a smaller-scale corpus, including Python and Java monolingual programs. The experimental results on function translation tasks between Python, Java, and C++ show that SDA-Trans outperforms many large-scale pre-trained models, especially for unseen language translation.

machine learning, natural language, translation, (20 more...)

2302.03908

Country:

North America > United States (0.14)
Asia > China > Beijing > Beijing (0.04)
Oceania > Australia (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Unsupervised Language agnostic WER Standardization

Guha, Satarupa, Ambavat, Rahul, Gupta, Ankur, Gupta, Manish, Mehta, Rupeshkumar

Word error rate (WER) is a standard metric for the evaluation of Automated Speech Recognition (ASR) systems. However, WER fails to provide a fair evaluation of human perceived quality in presence of spelling variations, abbreviations, or compound words arising out of agglutination. Multiple spelling variations might be acceptable based on locale/geography, alternative abbreviations, borrowed words, and transliteration of code-mixed words from a foreign language to the target language script. Similarly, in case of agglutination, often times the agglutinated, as well as the split forms, are acceptable. Previous work handled this problem by using manually identified normalization pairs and applying them to both the transcription and the hypothesis before computing WER. In this paper, we propose an automatic WER normalization system consisting of two modules: spelling normalization and segmentation normalization. The proposed system is unsupervised and language agnostic, and therefore scalable. Experiments with ASR on 35K utterances across four languages yielded an average WER reduction of 13.28%. Human judgements of these automatically identified normalization pairs show that our WER-normalized evaluation is highly consistent with the perceived quality of ASR output.

machine learning, natural language, variation, (20 more...)

2303.05046

Country:

North America > United States (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.70)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Paraphrasing Techniques for Maritime QA system

Shiri, Fatemeh, Zhuo, Terry Yue, Li, Zhuang, Nguyen, Van, Pan, Shirui, Wang, Weiqing, Haffari, Reza, Li, Yuan-Fang

There has been an increasing interest in incorporating Artificial Intelligence (AI) into Defence and military systems to complement and augment human intelligence and capabilities. However, much work still needs to be done toward achieving an effective human-machine partnership. This work is aimed at enhancing human-machine communications by developing a capability for automatically translating human natural language into a machine-understandable language (e.g., SQL queries). Techniques toward achieving this goal typically involve building a semantic parser trained on a very large amount of high-quality manually-annotated data. However, in many real-world Defence scenarios, it is not feasible to obtain such a large amount of training data. To the best of our knowledge, there are few works trying to explore the possibility of training a semantic parser with limited manually-paraphrased data, in other words, zero-shot. In this paper, we investigate how to exploit paraphrasing methods for the automated generation of large-scale training datasets (in the form of paraphrased utterances and their corresponding logical forms in SQL format) and present our experimental results using real-world data in the maritime domain.

large language model, machine learning, natural language, (16 more...)

2203.10854

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > South Australia > Adelaide (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.96)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)

North, Kai, Zampieri, Marcos, Shardlow, Matthew

Lexical Complexity Prediction: An Overview

arXiv.org Artificial IntelligenceMar-8-2023

Understanding the meaning of words in context is fundamental for reading comprehension. The perceived difficulty, hereafter referred to as complexity, of a target word within a given text varies widely among readers. With an increased demand for distance learning and educational technologies[107], research into automatically predicting which words are likely to cause comprehension problems is becoming a popular area of research [115, 147, 185]. Systems have been created to identify complex words that are difficult to acquire, reproduce, or understand for children [79], second-language learners [89], people suffering from a reading disability, such as dyslexia [131] or aphasia [35, 53], or more generally, individuals with low literacy [59, 175]. In Computational Linguistics and Natural Language Processing (NLP), the task of automatically recognizing complex words is most often achieved by training machine learning (ML) models. These ML models assign a complexity value to each target word within an inputted extract, sentence, or text that allows for the identification of complex words. This information can then be used to improve downstream lexical and text simplification systems that provide simpler alternatives to aid reading comprehension. Take the extract shown in Table 1 for example.

artificial intelligence, machine learning, natural language, (16 more...)

doi: 10.1145/3557885

2303.04851

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Thailand > Bangkok > Bangkok (0.05)
(44 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (0.87)

Industry:

Health & Medicine (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.92)
Education > Educational Setting > Online (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)