 Machine Translation


AbbVie Accelerates Natural Language Processing

#artificialintelligence

AbbVie is a research-based biopharmaceutical company that serves more than 30 million patients in 175 countries. Given its global scale, AbbVie partnered with Intel to optimize processes for its more than 47,000 employees. This whitepaper highlights two use cases that are important to AbbVie's research. The first is Abbelfish Machine Translation, AbbVie's language translation service based on the Transformer NLP model, which runs on second-generation Intel Xeon Scalable processors and the Intel Optimization for TensorFlow with the Intel oneAPI Deep Neural Network Library (oneDNN). AbbVie achieved a 1.9x improvement in throughput for Abbelfish language translation using Intel Optimization for TensorFlow 1.15 with oneDNN, compared to TensorFlow 1.15 without oneDNN [1].


LTI Value Cast: Reshaping Remote Work & Meetings with Linguistic AI

#artificialintelligence

The abrupt move to an almost exclusively home-based working environment at the start of the COVID-19 crisis resulted in a new way of working: back-to-back web calls from your living room or kitchen, across various web conferencing systems, often involving multilingual interactions. On-site, in-person meetings used to rely on interpreters, manually written meeting notes, and lengthy post-meeting analysis and review. In the new virtual environment, it is up to advanced language technologies powered by Artificial Intelligence to solve these issues. Speech-to-text, neural machine translation, and hybrid natural language understanding will automate complex human tasks and replace the more repetitive processes, creating a "digital work companion" that can assist with the challenges of the next fast-paced remote working environment.


OmniNet: Omnidirectional Representations from Transformers

arXiv.org Artificial Intelligence

This paper proposes Omnidirectional Representations from Transformers (OmniNet). In OmniNet, instead of maintaining a strictly horizontal receptive field, each token is allowed to attend to all tokens in the entire network. This process can also be interpreted as a form of extreme or intensive attention mechanism that has the receptive field of the entire width and depth of the network. To this end, the omnidirectional attention is learned via a meta-learner, which is essentially another self-attention based model. In order to mitigate the computationally expensive costs of full receptive field attention, we leverage efficient self-attention models such as kernel-based (Choromanski et al.), low-rank attention (Wang et al.) and/or Big Bird (Zaheer et al.) as the meta-learner. Extensive experiments are conducted on autoregressive language modeling (LM1B, C4), Machine Translation, Long Range Arena (LRA), and Image Recognition. The experiments show that OmniNet achieves considerable improvements across these tasks, including achieving state-of-the-art performance on LM1B, WMT'14 En-De/En-Fr, and Long Range Arena. Moreover, using omnidirectional representation in Vision Transformers leads to significant improvements on image recognition tasks on both few-shot learning and fine-tuning setups.
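
The idea of a receptive field that spans both the width and the depth of the network can be illustrated with a minimal sketch. This is not the paper's method: it uses plain, unparameterized dot-product attention rather than a learned meta-learner or an efficient attention variant, and the function name `omni_attention` and the choice to take queries from the last layer are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def omni_attention(layer_states):
    """Attend from each last-layer token to every token of every layer.

    layer_states[l][i] is the d-dimensional hidden vector of token i at
    layer l. Hypothetical helper: no learned projections are applied.
    """
    # Flatten width (tokens) and depth (layers) into one memory of L*N vectors.
    memory = [vec for layer in layer_states for vec in layer]
    dim = len(memory[0])
    outputs = []
    for query in layer_states[-1]:
        # Scaled dot-product score against every vector in the network.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in memory
        ]
        weights = softmax(scores)
        # Weighted average of all memory vectors.
        outputs.append(
            [sum(w * vec[j] for w, vec in zip(weights, memory)) for j in range(dim)]
        )
    return outputs
```

With L layers and N tokens, each query scores L*N keys, which is the cost that the abstract says motivates using efficient self-attention models as the meta-learner.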


AI Incident Database Spotlights Worst Machine Translation Fails

#artificialintelligence

In the ongoing popular (albeit shallow) debate pitting human translators against machine translation (MT), one constant is the question of quality -- how to define it, how to measure it, and how to improve it. Now, a new website, the AI Incident Database (AIID), aims to quantify the risks presented, and actual harm caused, by AI. Sean McGregor, ML architect at Syntiant and developer of the AIID, described the "collective memory of [AI systems'] failings" in a November 2020 paper. As McGregor explained, the AIID is a project of the Partnership on AI (PAI), an organization funded by tech companies and governed by a board comprising corporate partners and non-profits. The AIID is modeled on incident databases in other industries, namely aviation and cybersecurity, which promote transparency.


CURE: Code-Aware Neural Machine Translation for Automatic Program Repair

arXiv.org Artificial Intelligence

Automatic program repair (APR) is crucial to improve software reliability. Recently, neural machine translation (NMT) techniques have been used to fix software bugs automatically. While promising, these approaches have two major limitations. Their search space often does not contain the correct fix, and their search strategy ignores software knowledge such as strict code syntax. Due to these limitations, existing NMT-based techniques underperform the best template-based approaches. We propose CURE, a new NMT-based APR technique with three major novelties. First, CURE pre-trains a programming language (PL) model on a large software codebase to learn developer-like source code before the APR task. Second, CURE designs a new code-aware search strategy that finds more correct fixes by focusing on compilable patches and patches that are close in length to the buggy code. Finally, CURE uses a subword tokenization technique to generate a smaller search space that contains more correct fixes. Our evaluation on two widely-used benchmarks shows that CURE correctly fixes 57 Defects4J bugs and 26 QuixBugs bugs, outperforming all existing APR techniques on both benchmarks.
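
The code-aware search idea (prefer patches that compile and that are close in length to the buggy code) can be sketched as a simple re-ranking step. This is an illustration under stated assumptions, not CURE's implementation: it filters after generation rather than during search, it uses Python's built-in `compile()` as a stand-in for a compiler check, it counts whitespace-separated tokens rather than subwords, and `rerank` and `length_weight` are hypothetical names.

```python
def compiles(source):
    """Return True if the candidate patch is syntactically valid Python."""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def rerank(buggy_code, candidates, length_weight=0.1):
    """Re-rank (patch, model_score) pairs; higher model_score is better.

    Candidates that do not compile are dropped. The rest are penalized in
    proportion to how far their token count is from the buggy code's.
    """
    buggy_len = len(buggy_code.split())
    kept = []
    for patch, score in candidates:
        if not compiles(patch):
            continue  # discard patches that fail the syntax check
        penalty = length_weight * abs(len(patch.split()) - buggy_len)
        kept.append((score - penalty, patch))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [patch for _, patch in kept]


candidates = [
    ("total = total + x", -1.0),
    ("total = total +", -0.5),  # best raw score, but not valid syntax
    ("total = sum(values) if values else 0 + x", -0.9),
]
ranked = rerank("total = total - x", candidates)
```

Here the highest-scoring candidate is dropped because it does not compile, and the much longer candidate falls behind the one whose length matches the buggy line.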


Even Small Companies Use AI, Machine Learning

#artificialintelligence

Data, technology, and people are at hand to make artificial intelligence and machine learning available to all commerce companies. To be certain, artificial intelligence and its sub-field, machine learning, have gone through cycles of inflated expectations followed by disappointments. For example, in the 1950s and 1960s, the United States government funded research for the machine translation of languages. The hope was that Russian-language documents could be instantly translated to English. But by 1966, a report from the Automatic Language Processing Advisory Committee, a government team of seven scientists, essentially killed machine translation research in the U.S. for about a decade.


The Transformation of Patient-Clinician Relationships with AI-based Medical Advice

Communications of the ACM

One of the dramatic trends at the intersection of computing and healthcare has been patients' increased access to medical information, ranging from self-tracked physiological data to genetic data, tests, and scans. Increasingly, however, patients and clinicians have access to advanced machine learning-based tools for diagnosis, prediction, and recommendation based on large amounts of data, some of it patient-generated. Consequently, just as organizations have had to deal with a "Bring Your Own Device" (BYOD) reality [5] in which employees use their personal devices (phones and tablets) for some aspects of their work, a similar reality of "Bring Your Own Algorithm" (BYOA) is emerging in healthcare with its own challenges and support demands. BYOA is changing patient-clinician interactions and the technologies, skills, and workflows related to them. Situations in which patients have direct access to algorithmic advice are becoming commonplace [4].


Pre-Training BERT on Arabic Tweets: Practical Considerations

arXiv.org Artificial Intelligence

Pretraining Bidirectional Encoder Representations from Transformers (BERT) for downstream NLP tasks is a non-trivial task. We pretrained 5 BERT models that differ in the size of their training sets, mixture of formal and informal Arabic, and linguistic preprocessing. All are intended to support Arabic dialects and social media. The experiments highlight the centrality of data diversity and the efficacy of linguistically aware segmentation. They also highlight that more data or more training steps do not necessarily yield better models. Our new models achieve new state-of-the-art results on several downstream tasks. The resulting models are released to the community under the name QARiB.


CDA: a Cost Efficient Content-based Multilingual Web Document Aligner

arXiv.org Artificial Intelligence

We introduce a Content-based Document Alignment approach (CDA), an efficient method to align multilingual web documents based on content, used to create parallel training data for machine translation (MT) systems operating at the industrial level. CDA works in two steps: (i) projecting documents of a web domain to a shared multilingual space; then (ii) aligning them based on the similarity of their representations in that space. We leverage lexical translation models to build vector representations using TF-IDF. CDA achieves performance comparable with state-of-the-art systems in the WMT-16 Bilingual Document Alignment Shared Task benchmark while operating in multilingual space. In addition, we created two web-scale datasets to examine the robustness of CDA in an industrial setting involving up to 28 languages and millions of documents. The experiments show that CDA is robust, cost-effective, and significantly superior in (i) processing large and noisy web data and (ii) scaling to new and low-resourced languages.
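
The two steps described above (project, then align by similarity) can be sketched as follows. This is a minimal illustration, not the CDA system: the toy dictionary `FR_TO_EN` stands in for a lexical translation model, English serves as the shared space, each source document is greedily matched to its most similar target, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for a lexical translation model.
FR_TO_EN = {"le": "the", "chat": "cat", "noir": "black",
            "chien": "dog", "mange": "eats", "poisson": "fish"}


def project(tokens, lexicon):
    """Step (i): map tokens into the shared vocabulary (unknown words pass through)."""
    return [lexicon.get(tok, tok) for tok in tokens]


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        size = len(doc) or 1  # avoid division by zero on empty documents
        vectors.append({
            # Smoothed IDF keeps every weight strictly positive.
            term: (c / size) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, c in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(src_docs, tgt_docs, lexicon):
    """Step (ii): for each source document, return the index of the closest target."""
    projected = [project(doc, lexicon) for doc in src_docs]
    vectors = tfidf_vectors(projected + tgt_docs)
    src_vecs, tgt_vecs = vectors[:len(src_docs)], vectors[len(src_docs):]
    return [
        max(range(len(tgt_vecs)), key=lambda j: cosine(src, tgt_vecs[j]))
        for src in src_vecs
    ]


french = [["le", "chat", "noir"], ["le", "chien", "mange", "poisson"]]
english = [["the", "dog", "eats", "fish"], ["the", "black", "cat"]]
pairs = align(french, english, FR_TO_EN)
```

In this toy run each French document is paired with the English document that has the same content words after projection. A production aligner would also need one-to-one matching and a similarity threshold, which this sketch omits.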


Sparsely Factored Neural Machine Translation

arXiv.org Artificial Intelligence

The standard approach to incorporating linguistic information into neural machine translation systems consists of maintaining separate vocabularies for each of the annotated features to be incorporated (e.g. POS tags, dependency relation labels), embedding them, and then aggregating them with each subword in the word they belong to. This approach, however, cannot easily accommodate annotation schemes that are not dense for every word. We propose a method suited for such a case, showing large improvements on out-of-domain data and comparable quality on in-domain data. Experiments are performed on morphologically rich languages such as Basque and German in low-resource scenarios.