AITopics

2312.07419

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
(4 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

arXiv.org Artificial IntelligenceDec-12-2023

Neural Machine Translation of Clinical Text: An Empirical Investigation into Multilingual Pre-Trained Language Models and Transfer-Learning

Han, Lifeng, Gladkoff, Serge, Erofeev, Gleb, Sorokina, Irina, Galiano, Betty, Nenadic, Goran

We conduct investigations on clinical text machine translation by examining multilingual neural network models using deep learning such as Transformer based structures. Furthermore, to address the language resource imbalance issue, we also carry out experiments using a transfer learning methodology based on massive multilingual pre-trained language models (MMPLMs). The experimental results on three subtasks including 1) clinical case (CC), 2) clinical terminology (CT), and 3) ontological concept (OC) show that our models achieved top-level performances in the ClinSpEn-2022 shared task on English-Spanish clinical domain data. Furthermore, our expert-based human evaluations demonstrate that the small-sized pre-trained language model (PLM) won over the other two extra-large language models by a large margin, in the clinical domain fine-tuning, which finding was never reported in the field. Finally, the transfer learning method works well in our experimental setting using the WMT21fb model to accommodate a new language space Spanish that was not seen at the pre-training stage within WMT21fb itself, which deserves more exploitation for clinical knowledge transformation, e.g. to investigate into more languages. These research findings can shed some light on domain-specific machine translation development, especially in clinical and healthcare fields. Further research projects can be carried out based on our work to improve healthcare text analytics and knowledge transformation.

feature, machine translation, translation, (14 more...)

2312.0725

Country:

Europe > Finland > Uusimaa > Helsinki (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(16 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Health Care Technology (0.68)
Health & Medicine > Health Care Providers & Services (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Unify word-level and span-level tasks: NJUNLP's Participation for the WMT2023 Quality Estimation Shared Task

Geng, Xiang, Lai, Zhejian, Zhang, Yu, Tao, Shimin, Yang, Hao, Chen, Jiajun, Huang, Shujian

We introduce the submissions of the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task. Our team submitted predictions for the English-German language pair on all two sub-tasks: (i) sentence- and word-level quality prediction; and (ii) fine-grained error span detection. This year, we further explore pseudo data methods for QE based on NJUQE framework (https://github.com/NJUNLP/njuqe). We generate pseudo MQM data using parallel data from the WMT translation task. We pre-train the XLMR large model on pseudo QE data, then fine-tune it on real QE data. At both stages, we jointly learn sentence-level scores and word-level tags. Empirically, we conduct experiments to find the key hyper-parameters that improve the performance. Technically, we propose a simple method that covert the word-level outputs to fine-grained error span results. Overall, our models achieved the best results in English-German for both word-level and fine-grained error span detection sub-tasks by a considerable margin.

pseudo mqm data, severity, translation, (12 more...)

2309.1323

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)
Asia > China > Jiangsu Province > Nanjing (0.05)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Alzamzami, Fatimah, Saddik, Abdulmotaleb El

Content-Localization based Neural Machine Translation for Informal Dialectal Arabic: Spanish/French to Levantine/Gulf Arabic

Resources in high-resource languages have not been efficiently exploited in low-resource languages to solve language-dependent research problems. Spanish and French are considered high resource languages in which an adequate level of data resources for informal online social behavior modeling, is observed. However, a machine translation system to access those data resources and transfer their context and tone to a low-resource language like dialectal Arabic, does not exist. In response, we propose a framework that localizes contents of high-resource languages to a low-resource language/dialects by utilizing AI power. To the best of our knowledge, we are the first work to provide a parallel translation dataset from/to informal Spanish and French to/from informal Arabic dialects. Using this, we aim to enrich the under-resource-status dialectal Arabic and fast-track the research of diverse online social behaviors within and across smart cities in different geo-regions. The experimental results have illustrated the capability of our proposed solution in exploiting the resources between high and low resource languages and dialects. Not only this, but it has also been proven that ignoring dialects within the same language could lead to misleading analysis of online social behavior.

dataset, dialect, translation, (16 more...)

2312.06926

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
Asia > Middle East > Yemen > Amanat Al Asimah > Sanaa (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Label Smoothing for Enhanced Text Sentiment Classification

Gao, Yijie, Si, Shijing

Label smoothing (LS) proceeds by using soft targets that are a weighted average of the hard Label smoothing is a widely used technique targets and the uniform distribution over labels in various domains, such as image (Müller et al., 2019; Lienen and Hüllermeier, classification and speech recognition, 2021). This technique lessens the disparity between known for effectively combating model the top probability estimate and the remaining overfitting. However, there is few research ones, thereby acting as a barrier to the model on its application to text sentiment from generating extremely confident predictions, classification. To fill in the gap, this consequently decreasing the model's likelihood of study investigates the implementation of becoming excessively tailored to the training data label smoothing for sentiment classification (Lukasik et al., 2020; Gao et al., 2020).

architecture, classification, sentiment classification, (17 more...)

2312.06522

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
Asia > Indonesia (0.04)
Asia > Bangladesh (0.04)

Genre:

Research Report (0.82)
Overview (0.68)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.89)
(2 more...)

Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

Choi, Dami, Xin, Derrick, Dadkhahi, Hamid, Gilmer, Justin, Garg, Ankush, Firat, Orhan, Yeh, Chih-Kuan, Dai, Andrew M., Ghorbani, Behrooz

In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method of pre-training on high-resource tasks, followed by fine-tuning on a mixture of high/low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze under what data regimes this method is applicable and show its improvements empirically in neural machine translation (NMT) and multi-lingual language modeling.

cross-entropy loss, train cross-entropy loss, valid cross-entropy loss, (13 more...)

2312.06134

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceDec-9-2023

Non-Autoregressive Document-Level Machine Translation

Bao, Guangsheng, Teng, Zhiyang, Zhou, Hao, Yan, Jianhao, Zhang, Yue

Non-autoregressive translation (NAT) models achieve comparable performance and superior speed compared to auto-regressive translation (AT) models in the context of sentence-level machine translation (MT). However, their abilities are unexplored in document-level MT, hindering their usage in real scenarios. In this paper, we conduct a comprehensive examination of typical NAT models in the context of document-level MT and further propose a simple but effective design of sentence alignment between source and target. Experiments show that NAT models achieve high acceleration on documents, and sentence alignment significantly enhances their performance. However, current NAT models still have a significant performance gap compared to their AT counterparts. Further investigation reveals that NAT models suffer more from the multi-modality and misalignment issues in the context of document-level MT, and current NAT models struggle with exploiting document context and handling discourse phenomena. We delve into these challenges and provide our code at \url{https://github.com/baoguangsheng/nat-on-doc}.

alignment, nat model, translation, (16 more...)

2305.12878

Country:

Asia > China (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Artificial IntelligenceDec-9-2023

Ascle: A Python Natural Language Processing Toolkit for Medical Text Generation

Yang, Rui, Zeng, Qingcheng, You, Keen, Qiao, Yujie, Huang, Lucas, Hsieh, Chia-Chun, Rosand, Benjamin, Goldwasser, Jeremy, Dave, Amisha D, Keenan, Tiarnan D. L., Chew, Emily Y, Radev, Dragomir, Lu, Zhiyong, Xu, Hua, Chen, Qingyu, Li, Irene

This study introduces Ascle, a pioneering natural language processing (NLP) toolkit designed for medical text generation. Ascle is tailored for biomedical researchers and healthcare professionals with an easy-to-use, all-in-one solution that requires minimal programming expertise. For the first time, Ascle evaluates and provides interfaces for the latest pre-trained language models, encompassing four advanced and challenging generative functions: question-answering, text summarization, text simplification, and machine translation. In addition, Ascle integrates 12 essential NLP functions, along with query and search capabilities for clinical databases. The toolkit, its models, and associated data are publicly available via https://github.com/Yale-LILY/MedGen.

ascle, dataset, summarization, (13 more...)

2311.16588

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Washington > King County > Redmond (0.05)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Consumer Health (0.93)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

arXiv.org Artificial IntelligenceDec-8-2023

Seamless: Multilingual Expressive and Streaming Speech Translation

Communication, Seamless, Barrault, Loïc, Chung, Yu-An, Meglioli, Mariano Coria, Dale, David, Dong, Ning, Duppenthaler, Mark, Duquenne, Paul-Ambroise, Ellis, Brian, Elsahar, Hady, Haaheim, Justin, Hoffman, John, Hwang, Min-Jae, Inaguma, Hirofumi, Klaiber, Christopher, Kulikov, Ilia, Li, Pengwei, Licht, Daniel, Maillard, Jean, Mavlyutov, Ruslan, Rakotoarison, Alice, Sadagopan, Kaushik Ram, Ramakrishnan, Abinesh, Tran, Tuan, Wenzek, Guillaume, Yang, Yilin, Ye, Ethan, Evtimov, Ivan, Fernandez, Pierre, Gao, Cynthia, Hansanti, Prangthip, Kalbassi, Elahe, Kallet, Amanda, Kozhevnikov, Artyom, Gonzalez, Gabriel Mejia, Roman, Robin San, Touret, Christophe, Wong, Corinne, Wood, Carleigh, Yu, Bokai, Andrews, Pierre, Balioglu, Can, Chen, Peng-Jen, Costa-jussà, Marta R., Elbayad, Maha, Gong, Hongyu, Guzmán, Francisco, Heffernan, Kevin, Jain, Somya, Kao, Justine, Lee, Ann, Ma, Xutai, Mourachko, Alex, Peloquin, Benjamin, Pino, Juan, Popuri, Sravya, Ropers, Christophe, Saleem, Safiyyah, Schwenk, Holger, Sun, Anna, Tomasello, Paden, Wang, Changhan, Wang, Jeff, Wang, Skyler, Williamson, Mary

Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion. First, we contribute an improved version of the massively multilingual and multimodal SeamlessM4T model-SeamlessM4T v2. This newer model, incorporating an updated UnitY2 framework, was trained on more low-resource language data. SeamlessM4T v2 provides the foundation on which our next two models are initiated. SeamlessExpressive enables translation that preserves vocal styles and prosody. Compared to previous efforts in expressive speech research, our work addresses certain underexplored aspects of prosody, such as speech rate and pauses, while also preserving the style of one's voice. As for SeamlessStreaming, our model leverages the Efficient Monotonic Multihead Attention mechanism to generate low-latency target translations without waiting for complete source utterances. As the first of its kind, SeamlessStreaming enables simultaneous speech-to-speech/text translation for multiple source and target languages. To ensure that our models can be used safely and responsibly, we implemented the first known red-teaming effort for multimodal machine translation, a system for the detection and mitigation of added toxicity, a systematic evaluation of gender bias, and an inaudible localized watermarking mechanism designed to dampen the impact of deepfakes. Consequently, we bring major components from SeamlessExpressive and SeamlessStreaming together to form Seamless, the first publicly available system that unlocks expressive cross-lingual communication in real-time. The contributions to this work are publicly released and accessible at https://github.com/facebookresearch/seamless_communication

detection and mitigation, readiness and information access, seamlessm4t-large system, (16 more...)

2312.05187

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.13)
Europe > Italy > Tuscany > Florence (0.04)
North America > Canada > Ontario > Toronto (0.04)
(19 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)

arXiv.org Artificial IntelligenceDec-7-2023

Efficient Monotonic Multihead Attention

Ma, Xutai, Sun, Anna, Ouyang, Siqi, Inaguma, Hirofumi, Tomasello, Paden

We introduce the Efficient Monotonic Multihead Attention (EMMA), a state-of-the-art simultaneous translation model with numerically-stable and unbiased monotonic alignment estimation. In addition, we present improved training and inference strategies, including simultaneous fine-tuning from an offline translation model and reduction of monotonic alignment variance. The experimental results demonstrate that the proposed model attains state-of-the-art performance in simultaneous speech-to-text translation on the Spanish and English translation task.

computational linguistic, estimation, translation, (12 more...)

2312.04515

Country:

Europe > Italy > Tuscany > Florence (0.04)
Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(9 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)