AITopics

2211.15464

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(12 more...)

Genre: Overview (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.90)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Artificial IntelligenceNov-28-2022

Beyond Counting Datasets: A Survey of Multilingual Dataset Construction and Necessary Resources

Yu, Xinyan Velocity, Asai, Akari, Chatterjee, Trina, Hu, Junjie, Choi, Eunsol

While the NLP community is generally aware of resource disparities among languages, we lack research that quantifies the extent and types of such disparity. Prior surveys estimating the availability of resources based on the number of datasets can be misleading as dataset quality varies: many datasets are automatically induced or translated from English data. To provide a more comprehensive picture of language resources, we examine the characteristics of 156 publicly available NLP datasets. We manually annotate how they are created, including input text and label sources and tools used to build them, and what they study, tasks they address and motivations for their creation. After quantifying the qualitative NLP resource gap across languages, we discuss how to improve data collection in low-resource languages. We survey language-proficient NLP researchers and crowd workers per language, finding that their estimated availability correlates with dataset availability. Through crowdsourcing experiments, we identify strategies for collecting high-quality multilingual data on the Mechanical Turk platform. We conclude by making macro and micro-level suggestions to the NLP community and individual researchers for future multilingual data development.

artificial intelligence, machine learning, natural language, (20 more...)

2211.15649

Country:

Africa (0.14)
North America > United States > District of Columbia > Washington (0.05)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(22 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.89)

Sosnowski, Witold, Seweryn, Karolina, Wróblewska, Anna, Gawrysiak, Piotr

Distance Metric Learning Loss Functions in Few-Shot Scenarios of Supervised Language Models Fine-Tuning

arXiv.org Artificial IntelligenceNov-28-2022

This paper presents an analysis regarding an influence of the Distance Metric Learning (DML) loss functions on the supervised fine-tuning of the language models for classification tasks. We experimented with known datasets from SentEval Transfer Tasks. Our experiments show that applying the DML loss function can increase performance on downstream classification tasks of RoBERTa-large models in few-shot scenarios. Models fine-tuned with the use of SoftTriple loss can achieve better results than models with a standard categorical cross-entropy loss function by about 2.89 percentage points from 0.04 to 13.48 percentage points depending on the training dataset. Additionally, we accomplished a comprehensive analysis with explainability techniques to assess the models' reliability and explain their results.

artificial intelligence, machine learning, natural language, (15 more...)

2211.15195

Country: Europe > Poland (0.15)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Baghaei, Kourosh T., Payandeh, Amirreza, Fayyazsanavi, Pooya, Rahimi, Shahram, Chen, Zhiqian, Ramezani, Somayeh Bakhtiari

Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges

Machine Learning algorithms have had a profound impact on the field of computer science over the past few decades. These algorithms performance is greatly influenced by the representations that are derived from the data in the learning process. The representations learned in a successful learning process should be concise, discrete, meaningful, and able to be applied across a variety of tasks. A recent effort has been directed toward developing Deep Learning models, which have proven to be particularly effective at capturing high-dimensional, non-linear, and multi-modal characteristics. In this work, we discuss the principles and developments that have been made in the process of learning representations, and converting them into desirable applications. In addition, for each framework or model, the key issues and open challenges, as well as the advantages, are examined.

large language model, machine learning, natural language, (19 more...)

2211.14732

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)
(24 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(6 more...)

Laddagiri, Bhavesh, Raj, Yash, Dash, Anshuman

EPIK: Eliminating multi-model Pipelines with Knowledge-distillation

Real-world tasks are largely composed of multiple models, each performing a sub-task in a larger chain of tasks, i.e., using the output from a model as input for another model in a multi-model pipeline. A model like MATRa performs the task of Crosslingual Transliteration in two stages, using English as an intermediate transliteration target when transliterating between two indic languages. We propose a novel distillation technique, EPIK, that condenses two-stage pipelines for hierarchical tasks into a single end-to-end model without compromising performance. This method can create end-to-end models for tasks without needing a dedicated end-to-end dataset, solving the data scarcity problem. The EPIK model has been distilled from the MATra model using this technique of knowledge distillation. The MATra model can perform crosslingual transliteration between 5 languages - English, Hindi, Tamil, Kannada and Bengali. The EPIK model executes the task of transliteration without any intermediate English output while retaining the performance and accuracy of the MATra model. The EPIK model can perform transliteration with an average CER score of 0.015 and average phonetic accuracy of 92.1%. In addition, the average time for execution has reduced by 54.3% as compared to the teacher model and has a similarity score of 97.5% with the teacher encoder. In a few cases, the EPIK model (student model) can outperform the MATra model (teacher model) even though it has been distilled from the MATra model.

artificial intelligence, machine learning, natural language, (19 more...)

2211.1492

Country: Asia > India > Chandigarh (0.04)

Genre: Research Report (1.00)

Industry: Education (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

BJTU-WeChat's Systems for the WMT22 Chat Translation Task

Liang, Yunlong, Meng, Fandong, Xu, Jinan, Chen, Yufeng, Zhou, Jie

This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German. Based on the Transformer, we apply several effective variants. In our experiments, we utilize the pre-training-then-fine-tuning paradigm. In the first pre-training stage, we employ data filtering and synthetic data generation (i.e., back-translation, forward-translation, and knowledge distillation). In the second fine-tuning stage, we investigate speaker-aware in-domain data generation, speaker adaptation, prompt-based context modeling, target denoising fine-tuning, and boosted self-COMET-based model ensemble. Our systems achieve 0.810 and 0.946 COMET scores. The COMET scores of English-German and German-English are the highest among all submissions.

artificial intelligence, machine learning, natural language, (18 more...)

2211.15009

Country:

Asia > China > Beijing > Beijing (0.25)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(5 more...)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Services (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Summer: WeChat Neural Machine Translation Systems for the WMT22 Biomedical Translation Task

Li, Ernan, Meng, Fandong, Zhou, Jie

This paper introduces WeChat's participation in WMT 2022 shared biomedical translation task on Chinese to English. Our systems are based on the Transformer, and use several different Transformer structures to improve the quality of translation. In our experiments, we employ data filtering, data generation, several variants of Transformer, fine-tuning and model ensemble. Our Chinese$\to$English system, named Summer, achieves the highest BLEU score among all submissions.

artificial intelligence, natural language, translation, (15 more...)

2211.15022

Country:

Europe > Germany > Berlin (0.05)
Oceania > Australia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(3 more...)

Genre: Research Report (0.70)

Industry: Information Technology > Services (0.62)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Hsu, Wei-Ning, Shi, Bowen

u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality

While audio-visual speech models can yield superior performance and robustness compared to audio-only models, their development and adoption are hindered by the lack of labeled and unlabeled audio-visual data and the cost to deploy one model per modality. In this paper, we present u-HuBERT, a self-supervised pre-training framework that can leverage both multimodal and unimodal speech with a unified masked cluster prediction objective. By utilizing modality dropout during pre-training, we demonstrate that a single fine-tuned model can achieve performance on par or better than the state-of-the-art modality-specific models. Moreover, our model fine-tuned only on audio can perform well with audio-visual and visual speech input, achieving zero-shot modality generalization for multiple speech processing tasks.

artificial intelligence, machine learning, natural language, (16 more...)

2207.07036

Country:

Europe > Portugal > Braga > Braga (0.04)
North America > United States (0.04)
Europe > Germany (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Artificial IntelligenceNov-26-2022

Lexical Complexity Controlled Sentence Generation

Nie, Jinran, Yang, Liner, Chen, Yun, Kong, Cunliang, Zhu, Junhui, Yang, Erhong

Text generation rarely considers the control of lexical complexity, which limits its more comprehensive practical application. We introduce a novel task of lexical complexity controlled sentence generation, which aims at keywords to sentence generation with desired complexity levels. It has enormous potential in domains such as grade reading, language teaching and acquisition. The challenge of this task is to generate fluent sentences only using the words of given complexity levels. We propose a simple but effective approach for this task based on complexity embedding. Compared with potential solutions, our approach fuses the representations of the word complexity levels into the model to get better control of lexical complexity. And we demonstrate the feasibility of the approach for both training models from scratch and fine-tuning the pre-trained models. To facilitate the research, we develop two datasets in English and Chinese respectively, on which extensive experiments are conducted. Results show that our approach better controls lexical complexity and generates higher quality sentences than baseline methods.

artificial intelligence, machine learning, natural language, (18 more...)

2211.1454

Country:

Europe > United Kingdom > England > Leicestershire > Leicester (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

#artificialintelligenceNov-25-2022, 00:48:04 GMT

Inferencing the Transformer Model - MachineLearningMastery.com Inferencing the Transformer Model - MachineLearningMastery.com

We have seen how to train the Transformer model on a dataset of English and German sentence pairs and how to plot the training and validation loss curves to diagnose the model's learning performance and decide at which epoch to run inference on the trained model. We are now ready to run inference on the trained Transformer model to translate an input sentence. In this tutorial, you will discover how to run inference on the trained Transformer model for neural machine translation. It provides self-study tutorials with working code to guide you into building a fully-working transformer model that can translate sentences from one language to another... Inferencing the Transformer model Photo by Karsten Würth, some rights reserved. Recall having seen that the Transformer architecture follows an encoder-decoder structure.

dataset, transformer model, translation, (12 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.35)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)