AITopics

Yin, Kayo, Fernandes, Patrick, Pruthi, Danish, Chaudhary, Aditi, Martins, André F. T., Neubig, Graham

Do Context-Aware Translation Models Pay the Right Attention?

arXiv.org Artificial Intelligence, May-21-2021

Context-aware machine translation models are designed to leverage contextual information, but often fail to do so. As a result, they inaccurately disambiguate pronouns and polysemous words that require context for resolution. In this paper, we ask several questions: What contexts do human translators use to resolve ambiguous words? Are models paying large amounts of attention to the same context? What if we explicitly train them to do so? To answer these questions, we introduce SCAT (Supporting Context for Ambiguous Translations), a new English-French dataset comprising supporting context words for 14K translations that professional translators found useful for pronoun disambiguation. Using SCAT, we perform an in-depth analysis of the context used to disambiguate, examining positional and lexical characteristics of the supporting words. Furthermore, we measure the degree of alignment between the model's attention scores and the supporting context from SCAT, and apply a guided attention strategy to encourage agreement between the two.
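One simple way to picture "alignment between attention scores and supporting context" is the share of attention weight that lands on the tokens annotators marked as supporting. The sketch below is only an illustration under that assumption; the function name, the example sentence, the attention values, and the mask are all hypothetical and are not taken from SCAT or from the paper's actual metrics.

```python
def attention_mass_on_support(attention, support_mask):
    """Return the fraction of total attention weight on supporting tokens.

    attention:    list of non-negative attention weights, one per context token
    support_mask: list of 0/1 flags, 1 where a token was marked as supporting
    (Hypothetical helper for illustration, not the paper's metric.)
    """
    if len(attention) != len(support_mask):
        raise ValueError("attention and support_mask must have the same length")
    total = sum(attention)
    if total == 0:
        return 0.0
    supported = sum(a for a, m in zip(attention, support_mask) if m)
    return supported / total


# Hypothetical context: the pronoun "It" should be resolved using "cat".
tokens = ["The", "cat", "sat", ".", "It", "purred", "."]
attention = [0.05, 0.10, 0.05, 0.05, 0.50, 0.20, 0.05]  # made-up model weights
support_mask = [0, 1, 0, 0, 0, 0, 0]                    # made-up annotation

score = attention_mass_on_support(attention, support_mask)
print(f"attention mass on supporting context: {score:.2f}")
```

A low score in a setup like this would mean the model attends mostly to tokens the annotators did not rely on, which is the kind of mismatch a guided attention objective would try to reduce.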

computational linguistic, proceedings, translation, (14 more...)

2105.06977

Country:

Europe > Portugal > Lisbon > Lisbon (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(17 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

#artificialintelligence, May-19-2021, 19:10:17 GMT

Neural Machine Translation using a Seq2Seq Architecture and Attention (ENG to POR)

Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation [1]. Its strength comes from the fact that it learns the mapping directly from input text to associated output text. It has been proven to be more effective than traditional phrase-based machine translation, which requires much more effort to design the model. On the other hand, NMT models are costly to train, especially on large-scale translation datasets. They are also significantly slower at inference time due to the large number of parameters used.

cell state, rnn, translation, (12 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

#artificialintelligence, May-16-2021, 18:15:57 GMT

Can Artificial Intelligence Create its Own Language?

Back in 2017, the media was in a frenzy over Facebook's decision to scrap one of its artificial intelligence engines, which was said to have created its own language that humans could not understand. AI can sometimes be quirky and suspicious. In today's world, technology is not a luxury but a necessity. AI is one of the most popular technologies and has aided innovation across all industries. There has been much speculation about this disruptive technology, and it has often been looked upon with fear.

artificial intelligence create, negotiation, own language, (11 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.31)

The Japan Times, May-14-2021, 03:40:08 GMT

Fujitsu releases hands-free speech translation service

Fujitsu Ltd. on Thursday released a multilingual speech translation service that does not require users to operate devices by hand. The service is designed for settings in which multilingual communication is needed amid a rise in the domestic population of non-Japanese speakers, such as medical facilities. It automatically translates speech after identifying the voices and locations of users on the basis of sound picked up by directional microphones connected to tablet devices. Fujitsu said that the voice recognition is highly accurate thanks to technology limiting the effects of background noise. In addition to medical settings, the service is expected to be used at tourist sites.

artificial intelligence, machine translation, natural language, (6 more...)

Country: Asia > Myanmar (0.08)

Industry: Law > Civil Rights & Constitutional Law (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Ryabinin, Max, Malinin, Andrey, Gales, Mark

Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets

arXiv.org Artificial Intelligence, May-14-2021

Ensembles of machine learning models yield improved system performance as well as robust and interpretable uncertainty estimates; however, their inference costs may often be prohibitively high. Ensemble Distribution Distillation is an approach that allows a single model to efficiently capture both the predictive performance and uncertainty estimates of an ensemble. For classification, this is achieved by training a Dirichlet distribution over the ensemble members' output distributions via the maximum likelihood criterion. Although theoretically principled, this criterion exhibits poor convergence when applied to large-scale tasks where the number of classes is very high. In our work, we analyze this effect and show that, for the Dirichlet log-likelihood criterion, classes with low probability induce larger gradients than high-probability classes. This forces the model to focus on the distribution of the ensemble tail-class probabilities. We propose a new training objective which minimizes the reverse KL-divergence to a Proxy-Dirichlet target derived from the ensemble. This loss resolves the gradient issues of Ensemble Distribution Distillation, as we demonstrate both theoretically and empirically on the ImageNet and WMT17 En-De datasets containing 1000 and 40,000 classes, respectively.
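The remark about low-probability classes inducing larger gradients can be illustrated on a toy case. The sketch below evaluates the standard Dirichlet log-density and estimates its gradient with respect to each concentration parameter by finite differences. The three-class target `pi` and the uniform `alpha` are made-up values, and the paper's own analysis may be carried out with respect to different quantities (for example model outputs), so this is only a small numerical illustration, not a reproduction of its result.

```python
import math


def dirichlet_log_likelihood(pi, alpha):
    """Log-density of a Dirichlet with concentrations alpha at the point pi."""
    alpha0 = sum(alpha)
    return (
        math.lgamma(alpha0)
        - sum(math.lgamma(a) for a in alpha)
        + sum((a - 1.0) * math.log(p) for a, p in zip(alpha, pi))
    )


def grad_wrt_alpha(pi, alpha, eps=1e-5):
    """Central finite-difference gradient of the log-likelihood w.r.t. alpha."""
    grads = []
    for k in range(len(alpha)):
        up = list(alpha)
        down = list(alpha)
        up[k] += eps
        down[k] -= eps
        diff = dirichlet_log_likelihood(pi, up) - dirichlet_log_likelihood(pi, down)
        grads.append(diff / (2.0 * eps))
    return grads


# Hypothetical ensemble-member prediction: one dominant class, one tail class.
pi = [0.90, 0.09, 0.01]
# Hypothetical model state: uniform concentrations.
alpha = [1.0, 1.0, 1.0]

grads = grad_wrt_alpha(pi, alpha)
for p, g in zip(pi, grads):
    print(f"class probability {p:.2f} -> gradient w.r.t. alpha {g:+.3f}")
```

Analytically, each gradient component is the digamma difference ψ(α₀) − ψ(α_k) plus log π_k, so as π_k approaches zero the log term dominates and the tail class receives the largest-magnitude gradient in this toy setting.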

dirichlet distribution, ensemble, ensemble distribution distillation, (13 more...)

2105.06987

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

#artificialintelligence, May-13-2021, 11:50:08 GMT

Investing in AI for Good (SSIR)

In the past 10 years, hundreds of projects have applied artificial intelligence (AI) to creating social good. The right tool applied to an appropriate problem has the potential to drastically improve millions of lives through better service delivery and better-informed policy design. But what kind of investments do AI solutions need to be successful, and which applications have the most potential for social impact? AI excels at helping humans harness large-scale or complex data to predict, categorize, or optimize at a scale and speed beyond human ability. We believe that more targeted, sustained investments in AI for social impact (sometimes called "AI for good")--rather than multiple, short-term grants across a variety of areas--are important for two reasons.

investment, training data, translation, (17 more...)

Country:

North America > United States (0.14)
Africa > Sub-Saharan Africa (0.04)

Industry:

Social Sector (1.00)
Health & Medicine > Therapeutic Area (1.00)
Government (1.00)
Food & Agriculture > Agriculture (0.97)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.61)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.47)

arXiv.org Artificial Intelligence, May-13-2021

Reliability Testing for Natural Language Processing Systems

Tan, Samson, Joty, Shafiq, Baxter, Kathy, Taeihagh, Araz, Bennett, Gregory A., Kan, Min-Yen

Questions of fairness, robustness, and transparency are paramount to address before deploying NLP systems. Central to these concerns is the question of reliability: Can NLP systems reliably treat different demographics fairly and function correctly in diverse and noisy environments? To address this, we argue for the need for reliability testing and contextualize it among existing work on improving accountability. We show how adversarial attacks can be reframed for this goal, via a framework for developing reliability tests. We argue that reliability testing, with an emphasis on interdisciplinary collaboration, will enable rigorous and targeted testing, and aid in the enactment and enforcement of industry standards.

Figure 1: How DOCTOR can integrate with existing system development workflows. Test (left) and system development (right) take place in parallel, separate teams. Reliability tests can thus be constructed independent of the system development team, either by an internal "red team" or by independent auditors.

artificial intelligence, computational linguistic, natural language, (18 more...)

2105.0259

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington (0.14)
Asia > Singapore (0.05)
(15 more...)

Genre:

Research Report (1.00)
Overview (0.93)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Lee, Gyubok, Yang, Seongjun, Choi, Edward

Improving Lexically Constrained Neural Machine Translation with Source-Conditioned Masked Span Prediction

arXiv.org Artificial Intelligence, May-12-2021

Generating accurate terminology is a crucial component for the practicality and reliability of neural machine translation (NMT) systems. To address this, lexically constrained NMT explores various methods to ensure that pre-specified words and phrases appear in the translations. In many cases, however, those methods are evaluated on general domain corpora, where the terms are mostly uni- and bi-grams (>98%). In this paper, we instead tackle a more challenging setup consisting of domain-specific corpora with much longer n-grams and highly specialized terms. To encourage span-level representations in generation, we additionally impose a source-sentence conditioned masked span prediction loss in the decoder and observe improvements in both terminology translation and BLEU scores. Experimental results on three domain-specific corpora in two language pairs demonstrate that the proposed training scheme can improve the performance of existing lexically constrained methods that can operate both with and without a term dictionary at test time.

computational linguistic, proceedings, translation, (13 more...)

2105.05498

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(14 more...)

Genre: Research Report (0.50)

Industry: Law (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Artificial Intelligence, May-11-2021

Including Signed Languages in Natural Language Processing

Yin, Kayo, Moryossef, Amit, Hochgesang, Julie, Goldberg, Yoav, Alikhani, Malihe

Signed languages are the primary means of communication for many deaf and hard of hearing individuals. Since signed languages exhibit all the fundamental linguistic properties of natural language, we believe that tools and theories of Natural Language Processing (NLP) are crucial to their modeling. However, existing research in Sign Language Processing (SLP) seldom attempts to explore and leverage the linguistic organization of signed languages. This position paper calls on the NLP community to include signed languages as a research area with high social and scientific impact. We first discuss the linguistic properties of signed languages to consider during their modeling. Then, we review the limitations of current SLP models and identify the open challenges to extend NLP to signed languages. Finally, we urge (1) the adoption of an efficient tokenization method; (2) the development of linguistically-informed models; (3) the collection of real-world signed language data; (4) the inclusion of local signed language communities as an active and leading voice in the direction of research.

sign language, signed language, translation, (15 more...)

2105.05222

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(10 more...)

Genre:

Research Report (0.50)
Overview (0.34)

Industry:

Education > Curriculum > Subject-Specific Education (0.46)
Health & Medicine > Therapeutic Area > Otolaryngology (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)