AITopics

2308.00683

Country:

North America > Dominican Republic (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(6 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Towards Effective Ancient Chinese Translation: Dataset, Model, and Evaluation

Guo, Geyang, Yang, Jiarong, Lu, Fengyuan, Qin, Jiaxin, Tang, Tianyi, Zhao, Wayne Xin

Interpreting ancient Chinese has been the key to comprehending vast Chinese literature, tradition, and civilization. In this paper, we propose Erya for ancient Chinese translation. From a dataset perspective, we collect, clean, and classify ancient Chinese materials from various sources, forming the most extensive ancient Chinese resource to date. From a model perspective, we devise Erya training method oriented towards ancient Chinese. We design two jointly-working tasks: disyllabic aligned substitution (DAS) and dual masked language model (DMLM). From an evaluation perspective, we build a benchmark to judge ancient Chinese translation quality in different scenarios and evaluate the ancient Chinese translation capacities of various existing models. Our model exhibits remarkable zero-shot performance across five domains, with over +12.0 BLEU against GPT-3.5 models and better human evaluation results than ERNIE Bot. Subsequent fine-tuning further shows the superior transfer capability of Erya model with +6.2 BLEU gain. We release all the above-mentioned resources at https://github.com/RUCAIBox/Erya.

large language model, machine learning, translation, (16 more...)

2308.0024

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > Mongolia (0.04)
Asia > China > Inner Mongolia (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Comini, Giulia, Ribeiro, Manuel Sam, Yang, Fan, Shim, Heereen, Lorenzo-Trueba, Jaime

Multilingual context-based pronunciation learning for Text-to-Speech

Phonetic information and linguistic knowledge are an essential component of a Text-to-speech (TTS) front-end. Given a language, a lexicon can be collected offline and Grapheme-to-Phoneme (G2P) relationships are usually modeled in order to predict the pronunciation for out-of-vocabulary (OOV) words. Additionally, post-lexical phonology, often defined in the form of rule-based systems, is used to correct pronunciation within or between words. In this work we showcase a multilingual unified front-end system that addresses any pronunciation related task, typically handled by separate modules. We evaluate the proposed model on G2P conversion and other language-specific challenges, such as homograph and polyphones disambiguation, post-lexical rules and implicit diacritization. We find that the multilingual model is competitive across languages and tasks, however, some trade-offs exists when compared to equivalent monolingual solutions.

machine learning, natural language, pronunciation, (21 more...)

2307.16709

Country: Asia (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.72)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Abbaszade, Mina, Zomorodi, Mariam, Salari, Vahid, Kurian, Philip

Toward Quantum Machine Translation of Syntactically Distinct Languages

The present study aims to explore the feasibility of language translation using quantum natural language processing algorithms on noisy intermediate-scale quantum (NISQ) devices. Classical methods in natural language processing (NLP) struggle with handling large-scale computations required for complex language tasks, but quantum NLP on NISQ devices holds promise in harnessing quantum parallelism and entanglement to efficiently process and analyze vast amounts of linguistic data, potentially revolutionizing NLP applications. Our research endeavors to pave the way for quantum neural machine translation, which could potentially offer advantages over classical methods in the future. We employ Shannon entropy to demonstrate the significant role of some appropriate angles of rotation gates in the performance of parametrized quantum circuits. In particular, we utilize these angles (parameters) as a means of communication between quantum circuits of different languages. To achieve our objective, we adopt the encoder-decoder model of classical neural networks and implement the translation task using long short-term memory (LSTM). Our experiments involved 160 samples comprising English sentences and their Persian translations. We trained the models with different optimisers implementing stochastic gradient descent (SGD) as primary and subsequently incorporating two additional optimizers in conjunction with SGD. Notably, we achieved optimal results-with mean absolute error of 0.03, mean squared error of 0.002, and 0.016 loss-by training the best model, consisting of two LSTM layers and using the Adam optimiser. Our small dataset, though consisting of simple synonymous sentences with word-to-word mappings, points to the utility of Shannon entropy as a figure of merit in more complex machine translation models for intricate sentence structures.

artificial intelligence, machine learning, natural language, (15 more...)

2307.16576

Country:

North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
Europe > Poland > Lesser Poland Province > Kraków (0.14)
North America > United States > District of Columbia > Washington (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SelfSeg: A Self-supervised Sub-word Segmentation Method for Neural Machine Translation

Song, Haiyue, Dabre, Raj, Chu, Chenhui, Kurohashi, Sadao, Sumita, Eiichiro

Sub-word segmentation is an essential pre-processing step for Neural Machine Translation (NMT). Existing work has shown that neural sub-word segmenters are better than Byte-Pair Encoding (BPE), however, they are inefficient as they require parallel corpora, days to train and hours to decode. This paper introduces SelfSeg, a self-supervised neural sub-word segmentation method that is much faster to train/decode and requires only monolingual dictionaries instead of parallel corpora. SelfSeg takes as input a word in the form of a partially masked character sequence, optimizes the word generation probability and generates the segmentation with the maximum posterior probability, which is calculated using a dynamic programming algorithm. The training time of SelfSeg depends on word frequencies, and we explore several word frequency normalization strategies to accelerate the training phase. Additionally, we propose a regularization mechanism that allows the segmenter to generate various segmentations for one word. To show the effectiveness of our approach, we conduct MT experiments in low-, middle- and high-resource scenarios, where we compare the performance of using different segmentation methods. The experimental results demonstrate that on the low-resource ALT dataset, our method achieves more than 1.2 BLEU score improvement compared with BPE and SentencePiece, and a 1.1 score improvement over Dynamic Programming Encoding (DPE) and Vocabulary Learning via Optimal Transport (VOLT) on average. The regularization method achieves approximately a 4.3 BLEU score improvement over BPE and a 1.2 BLEU score improvement over BPE-dropout, the regularized version of BPE. We also observed significant improvements on IWSLT15 Vi->En, WMT16 Ro->En and WMT15 Fi->En datasets, and competitive results on the WMT14 De->En and WMT14 Fr->En datasets.

artificial intelligence, natural language, segmentation, (14 more...)

doi: 10.1145/3610611

2307.164

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > New South Wales > Sydney (0.14)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
(17 more...)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy

Ponomareva, Natalia, Hazimeh, Hussein, Kurakin, Alex, Xu, Zheng, Denison, Carson, McMahan, H. Brendan, Vassilvitskii, Sergei, Chien, Steve, Thakurta, Abhradeep

ML models are ubiquitous in real world applications and are a constant focus of research. At the same time, the community has started to realize the importance of protecting the privacy of ML training data. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP to real world complex ML models are still few and far between. The adoption of DP is hindered by limited practical guidance of what DP protection entails, what privacy guarantees to aim for, and the difficulty of achieving good privacy-utility-computation trade-offs for ML models. Tricks for tuning and maximizing performance are scattered among papers or stored in the heads of practitioners. Furthermore, the literature seems to present conflicting evidence on how and whether to apply architectural adjustments and which components are "safe" to use with DP. This work is a self-contained guide that gives an in-depth overview of the field of DP ML and presents information about achieving the best possible DP ML model with rigorous privacy guarantees. Our target audience is both researchers and practitioners. Researchers interested in DP for ML will benefit from a clear overview of current advances and areas for improvement. We include theory-focused sections that highlight important topics such as privacy accounting and its assumptions, and convergence. For a practitioner, we provide a background in DP theory and a clear step-by-step guide for choosing an appropriate privacy definition and approach, implementing DP training, potentially updating the model architecture, and tuning hyperparameters. For both researchers and practitioners, consistently and fully reporting privacy guarantees is critical, and so we propose a set of specific best practices for stating guarantees.

large language model, machine learning, natural language, (19 more...)

doi: 10.1613/jair.1.14649

2303.00654

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(2 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(2 more...)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.92)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)
(5 more...)

arXiv.org Artificial IntelligenceJul-30-2023

XDLM: Cross-lingual Diffusion Language Model for Machine Translation

Chen, Linyao, Feng, Aosong, Yang, Boming, Li, Zihui

Recently, diffusion models have excelled in image generation tasks and have also been applied to neural language processing (NLP) for controllable text generation. However, the application of diffusion models in a cross-lingual setting is less unexplored. Additionally, while pretraining with diffusion models has been studied within a single language, the potential of cross-lingual pretraining remains understudied. To address these gaps, we propose XDLM, a novel Cross-lingual diffusion model for machine translation, consisting of pretraining and fine-tuning stages. In the pretraining stage, we propose TLDM, a new training objective for mastering the mapping between different languages; in the fine-tuning stage, we build up the translation system based on the pretrained model. We evaluate the result on several machine translation benchmarks and outperformed both diffusion and Transformer baselines.

diffusion model, machine learning, natural language, (14 more...)

2307.1356

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Ogunremi, Tolulope, Tubosun, Kola, Aremu, Anuoluwapo, Orife, Iroro, Adelani, David Ifeoluwa

\`{I}r\`{o}y\`{i}nSpeech: A multi-purpose Yor\`{u}b\'{a} Speech Corpus

arXiv.org Artificial IntelligenceJul-29-2023

We introduce the \`{I}r\`{o}y\`{i}nSpeech corpus -- a new dataset influenced by a desire to increase the amount of high quality, freely available, contemporary Yor\`{u}b\'{a} speech. We release a multi-purpose dataset that can be used for both TTS and ASR tasks. We curated text sentences from the news and creative writing domains under an open license i.e., CC-BY-4.0 and had multiple speakers record each sentence. We provide 5000 of our utterances to the Common Voice platform to crowdsource transcriptions online. The dataset has 38.5 hours of data in total, recorded by 80 volunteers.

artificial intelligence, natural language, utterance, (17 more...)

2307.16071

Country:

North America > United States (0.14)
Africa > Niger (0.05)
South America > Brazil (0.04)
(12 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (0.34)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.30)

Kolar, Sanjana, Kumar, Rohit

Multilingual Tourist Assistance using ChatGPT: Comparing Capabilities in Hindi, Telugu, and Kannada

arXiv.org Artificial IntelligenceJul-28-2023

This research investigates the effectiveness of ChatGPT, an AI language model by OpenAI, in translating English into Hindi, Telugu, and Kannada languages, aimed at assisting tourists in India's linguistically diverse environment. To measure the translation quality, a test set of 50 questions from diverse fields such as general knowledge, food, and travel was used. These were assessed by five volunteers for accuracy and fluency, and the scores were subsequently converted into a BLEU score. The BLEU score evaluates the closeness of a machine-generated translation to a human translation, with a higher score indicating better translation quality. The Hindi translations outperformed others, showcasing superior accuracy and fluency, whereas Telugu translations lagged behind. Human evaluators rated both the accuracy and fluency of translations, offering a comprehensive perspective on the language model's performance.

chatgpt, fluency, translation, (15 more...)

2307.15376

Country:

Asia > India (0.35)
North America > United States > California (0.05)
North America > United States > Texas (0.04)
North America > United States > New Jersey (0.04)

Genre: Research Report (0.82)

Industry: Consumer Products & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

Tran, Khiem Vinh, Van Nguyen, Kiet, Nguyen, Ngan Luu Thuy

BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers Models for Vietnamese Visual Question Answering

arXiv.org Artificial IntelligenceJul-28-2023

Visual Question Answering (VQA) is an intricate and demanding task that integrates natural language processing (NLP) and computer vision (CV), capturing the interest of researchers. The English language, renowned for its wealth of resources, has witnessed notable advancements in both datasets and models designed for VQA. However, there is a lack of models that target specific countries such as Vietnam. To address this limitation, we introduce a transformer-based Vietnamese model named BARTPhoBEiT. This model includes pre-trained Sequence-to-Sequence and bidirectional encoder representation from Image Transformers in Vietnamese and evaluates Vietnamese VQA datasets. Experimental results demonstrate that our proposed model outperforms the strong baseline and improves the state-of-the-art in six metrics: Accuracy, Precision, Recall, F1-score, WUPS 0.0, and WUPS 0.9.

computational linguistic, machine learning, natural language, (21 more...)

2307.15335

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.05)
North America > United States > Arizona > Maricopa County > Scottsdale (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)