AITopics

A special purpose learning system assumes knowledge of admissible tasks at design time. Adapting such a system to unforeseen tasks requires architecture manipulation such as adding an output head for each new task or dataset. In this work, we propose a task-agnostic vision-language system that accepts an image and a natural language task description and outputs bounding boxes, confidences, and text. The system supports a wide range of vision tasks such as classification, localization, question answering, captioning, and more. We evaluate the system's ability to learn multiple skills simultaneously, to perform tasks with novel skill-concept combinations, and to learn new skills efficiently and without forgetting.

architecture, classification, gpv-i, (16 more...)

2104.00743

Country: North America > United States > Illinois (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (0.46)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.46)

Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study

Shen, Zhiqiang, Liu, Zechun, Xu, Dejia, Chen, Zitian, Cheng, Kwang-Ting, Savvides, Marios

This work aims to empirically clarify a recently discovered perspective that label smoothing is incompatible with knowledge distillation (Müller et al., 2019). We begin by introducing the motivation behind this claimed incompatibility, i.e., that label smoothing erases relative information between teacher logits. We provide a novel connection on how label smoothing affects distributions of semantically similar and dissimilar classes. Then we propose a metric to quantitatively measure the degree of erased information in a sample's representation. After that, we study the one-sidedness and imperfection of the incompatibility view through extensive analyses, visualizations and comprehensive experiments on Image Classification, Binary Networks, and Neural Machine Translation. Finally, we broadly discuss several circumstances wherein label smoothing will indeed lose its effectiveness. Recently, a large body of studies has focused on exploring the underlying relationships between these two methods. For instance, Müller et al. (2019) discovered that label smoothing could improve calibration implicitly but will hurt the effectiveness of knowledge distillation. Yuan et al. (2019) considered knowledge distillation as a dynamical form of label smoothing as it delivered a regularization effect in training. The recent study by Lukasik et al. (2020) further noticed that label smoothing could help mitigate label noise; they showed that when distilling models from noisy data, the teacher with label smoothing is helpful.
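For readers unfamiliar with the two techniques, the following is a minimal sketch of uniform label smoothing and a temperature-softened distillation loss in their commonly used textbook forms. It is not taken from the paper: the function names, the smoothing factor `alpha`, and the `temperature` default are illustrative assumptions.

```python
import math


def smooth_labels(true_class, num_classes, alpha=0.1):
    """Uniform label smoothing: (1 - alpha) * one_hot + alpha / num_classes.

    `alpha` is an assumed smoothing factor, not a value from the paper.
    """
    off_value = alpha / num_classes
    target = [off_value] * num_classes
    target[true_class] += 1.0 - alpha
    return target


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    max_scaled = max(scaled)
    exps = [math.exp(z - max_scaled) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))
```

In this formulation every non-target class receives the same smoothed mass, whereas the softened teacher distribution assigns different probabilities to different non-target classes. That difference is the "relative information between teacher logits" the abstract refers to.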

knowledge distillation, student, teacher network, (13 more...)

2104.00676

Genre: Research Report > New Finding (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Azunre, Paul, Osei, Salomey, Addo, Salomey, Adu-Gyamfi, Lawrence Asamoah, Moore, Stephen, Adabankah, Bernard, Opoku, Bernard, Asare-Nyarko, Clara, Nyarko, Samuel, Amoaba, Cynthia, Appiah, Esther Dansoa, Akwerh, Felix, Lawson, Richard Nii Lante, Budu, Joel, Debrah, Emmanuel, Boateng, Nana, Ofori, Wisdom, Buabeng-Munkoh, Edwin, Adjei, Franklin, Ampomah, Isaac Kojo Essel, Otoo, Joseph, Borkor, Reindorf, Mensah, Standylove Birago, Mensah, Lucien, Marcel, Mark Amoako, Amponsah, Anokye Acheampong, Hayfron-Acquah, James Ben

English-Twi Parallel Corpus for Machine Translation

We present a parallel machine translation training corpus for English and Akuapem Twi of 25,421 sentence pairs. We used a transformer-based translator to generate initial translations in Akuapem Twi, which were later verified and corrected where necessary by native speakers to eliminate any occurrence of translationese. In addition, 697 higher quality crowd-sourced sentences are provided for use as an evaluation set for downstream Natural Language Processing (NLP) tasks. The typical use case for the larger human-verified dataset is for further training of machine translation models in Akuapem Twi. The higher quality 697 crowd-sourced dataset is recommended as a testing dataset for machine translation of English to Twi and Twi to English models. Furthermore, the Twi part of the crowd-sourced data may also be used for other tasks, such as representation learning, classification, etc. We fine-tune the transformer translation model on the training corpus and report benchmarks on the crowd-sourced test set.

akuapem twi, corpus, translation, (12 more...)

2103.15625

Country:

Europe > Finland > Uusimaa > Helsinki (0.05)
Europe > Norway (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

NLP for Ghanaian Languages

The advancement in machine learning computational power coupled with the recent investment within the domain by technological companies has stimulated considerable interest and brought about a legion of applications in natural language digitisation in developed countries, [...] In the much-applauded interventions by Google and Microsoft through their translation services, quite a number of African languages have been integrated, but Ghanaian languages are excluded (Google, 2020; Microsoft, 2021). A historic move worth mentioning is Baidu Translate's incorporation of the Twi language in their translation service.

ghana, ghanaian language, nlp ghana, (16 more...)

2103.15475

Country:

Africa > Ghana > Greater Accra > Accra (0.05)
Africa > Togo (0.04)
Africa > Sub-Saharan Africa (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

#artificialintelligence Mar-31-2021, 09:00:07 GMT

Online translators are sexist – here's how we gave them a little gender sensitivity training

Online translation tools have helped us learn new languages, communicate across linguistic borders, and view foreign websites in our native tongue. But the artificial intelligence (AI) behind them is far from perfect, often replicating rather than rejecting the biases that exist within a language or a society. Such tools are especially vulnerable to gender stereotyping, because some languages (such as English) don't tend to gender nouns, while others (such as German) do. When translating from English to German, translation tools have to decide which gender to assign English words like "cleaner". Overwhelmingly, the tools conform to the stereotype, opting for the feminine word in German.

engineer, gender, translation tool, (13 more...)


Industry: Education (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Joshi, Aviral, Huang, Chengzhi, Singh, Har Simrat

Zero-Shot Language Transfer vs Iterative Back Translation for Unsupervised Machine Translation

arXiv.org Artificial Intelligence Mar-31-2021

This work compares two solutions for machine translation on low-resource language pairs: zero-shot transfer learning and unsupervised machine translation. We discuss how the data size affects the performance of both unsupervised MT and transfer learning. Additionally, we look at how the domain of the data affects the results of unsupervised MT. The code for all the experiments performed in this project is accessible on GitHub.

experiment, language pair, translation, (14 more...)

2104.00106

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Slovenia (0.04)
Asia > Thailand > Phuket > Phuket (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence Mar-31-2021

Contextual Text Embeddings for Twi

Transformer-based language models have been changing the modern Natural Language Processing (NLP) landscape for high-resource languages such as English, Chinese, Russian, etc. However, this technology does not yet exist for any Ghanaian language. In this paper, we introduce the first of such models for Twi or Akan, the most widely spoken Ghanaian language. The specific contribution of this research work is the development of several pretrained transformer language models for the Akuapem and Asante dialects of Twi, paving the way for advances in application areas such as Named Entity Recognition (NER), Neural Machine Translation (NMT), Sentiment Analysis (SA) and Part-of-Speech (POS) tagging. Specifically, we introduce four different flavours of ABENA -- A BERT model Now in Akan that is fine-tuned on a set of Akan corpora, and BAKO - BERT with Akan Knowledge only, which is trained from scratch. We open-source the model through the Hugging Face model hub and demonstrate its use via a simple sentiment classification example.

architecture, arxiv, language model, (15 more...)

2103.15963

Country:

Europe > Spain (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.90)

arXiv.org Artificial Intelligence Mar-29-2021

Unsupervised Machine Translation On Dravidian Languages

Koneru, Sai, Liu, Danni, Niehues, Jan

Unsupervised neural machine translation (UNMT) is beneficial especially for low resource languages such as those from the Dravidian family. However, UNMT systems tend to fail in realistic scenarios involving actual low resource languages. Recent works propose to utilize auxiliary parallel data and have achieved state-of-the-art results. In this work, we focus on unsupervised translation between English and Kannada, a low resource Dravidian language. We additionally utilize a limited amount of auxiliary data between English and other related Dravidian languages. We show that unifying the writing systems is essential in unsupervised translation between the Dravidian languages. We explore several model architectures that use the auxiliary data in order to maximize knowledge sharing and enable UNMT for distant language pairs. Our experiments demonstrate that it is crucial to include auxiliary languages that are similar to our focal language, Kannada. Furthermore, we propose a metric to measure language similarity and show that it serves as a good indicator for selecting the auxiliary languages.

artificial intelligence, natural language, translation, (15 more...)

2103.15877

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Germany > Berlin (0.04)
(10 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Artificial Intelligence Mar-29-2021

Platform for Situated Intelligence

Bohus, Dan, Andrist, Sean, Feniello, Ashley, Saw, Nick, Jalobeanu, Mihai, Sweeney, Patrick, Thompson, Anne Loomis, Horvitz, Eric

We introduce Platform for Situated Intelligence, an open-source framework created to support the rapid development and study of multimodal, integrative-AI systems. The framework provides infrastructure for sensing, fusing, and making inferences from temporal streams of data across different modalities, a set of tools that enable visualization and debugging, and an ecosystem of components that encapsulate a variety of perception and processing technologies. These assets jointly provide the means for rapidly constructing and refining multimodal, integrative-AI systems, while retaining the efficiency and performance characteristics required for deployment in open-world settings.

application, operator, pipeline, (16 more...)