 Machine Translation


Researchers Work to Make Artificial Intelligence Genuinely Fair

#artificialintelligence

Out of 11 proposals that were accepted this year by the NSF Program on Fairness in Artificial Intelligence in Collaboration with Amazon, two are led by UMD faculty. The program's goals are to increase accountability and transparency in AI algorithms and make them more accessible so that the benefits of AI are available to everyone. This includes machine learning algorithms--a subset of AI in which computerized systems are "trained" on large datasets to allow them to make proper decisions. Machine learning is used by some colleges around the country to rank applications for admittance to graduate school or allocate resources for faculty mentoring, teaching assistantships or coveted graduate fellowships. "As these AI-based systems are increasingly used in higher education, we want to make sure they render representations that are accurate and fair, which will require developing models that are free of both human and machine biases," said Furong Huang, an assistant professor of computer science who is leading one of the UMD teams.


How to Bring Machine Learning Models Into Production

#artificialintelligence

Eugene Rudenko, an AI Solution Consultant, co-authored this machine learning development article with Vitaliy, a Data Scientist at NIX United. The piece presents a three-pronged method for putting ML models into production, as well as a commercial perspective.


Machine Learning Communities: Q1 '22 highlights and achievements

#artificialintelligence

Let's explore the highlights and accomplishments of the vast Google Machine Learning communities over the first quarter of the year! We are enthusiastic about, and grateful for, all the activities that communities across the globe carry out. ML Olympiad is a set of associated Kaggle Community Competitions hosted by Machine Learning Google Developers Experts (ML GDEs) or TensorFlow User Groups (TFUGs) and sponsored by Google. The first round ran from January to March and invited participants to solve critical problems of our time. Competition highlights include Autism Prediction Challenge, Arabic_Poems, Hausa Sentiment Analysis, Quality Education, Good Health and Well Being.


Lilt raises $55M to bolster its AI translation platform – TechCrunch

#artificialintelligence

Lilt, a provider of AI-powered business translation software, today announced that it raised $55 million in a Series C round led by Four Rivers, joined by new investors Sorenson Capital, CLEAR Ventures and Wipro Ventures. The company says that it plans to use the capital to expand its R&D efforts as well as its customer footprint and engineering teams. "Lilt [aims to] build a solution that [will] combine the best of human ingenuity with machine efficiency," CEO Spence Green told TechCrunch via email. "This new funding will … [reduce our] unit economics [to make] translation more affordable for all businesses. It will also [enable us to add] a sales team to our existing production team in Asia. We are in three regions -- the U.S., Europe, the Middle East and Africa (EMEA) and Asia -- and look to have both sales and production teams in each of these regions."


Waibel Elected a Fellow of the International Speech Communication Association

CMU School of Computer Science

Alex Waibel, a professor in Carnegie Mellon University's Language Technologies Institute, has been elected a fellow of the International Speech Communication Association (ISCA). The ISCA recognized Waibel for his pioneering contributions in multilingual and multimodal spoken language processing and translation. Waibel, also faculty at the Karlsruhe Institute of Technology in Germany, has worked on speech and machine translation for decades, developing systems that now can translate speech in real time. Waibel demonstrated the first speech translation systems in the 1990s and 2000s. By 2020, he had developed a system that outperformed humans in recognizing conversational speech on a public benchmark.



Towards Better Chinese-centric Neural Machine Translation for Low-resource Languages

arXiv.org Artificial Intelligence

The last decade has witnessed enormous improvements in science and technology, stimulating the growing demand for economic and cultural exchanges in various countries. Building a neural machine translation (NMT) system has become an urgent trend, especially in the low-resource setting. However, recent work tends to study NMT systems for low-resource languages centered on English, while few works focus on low-resource NMT systems centered on other languages such as Chinese. To address this gap, the low-resource multilingual translation challenge of the 2021 iFLYTEK AI Developer Competition provides Chinese-centric multilingual low-resource NMT tasks, where participants are required to build NMT systems based on the provided low-resource samples. In this paper, we present the winning competition system, which leverages monolingual word embeddings data enhancement, bilingual curriculum learning, and contrastive re-ranking. In addition, a new Incomplete-Trust (In-trust) loss function is proposed to replace the traditional cross-entropy loss when training. The experimental results demonstrate that the implementation of these ideas leads to better performance than other state-of-the-art methods. All the experimental code is released at: https://github.com/WENGSYX/


Spinning Language Models: Risks of Propaganda-As-A-Service and Countermeasures

arXiv.org Artificial Intelligence

We investigate a new threat to neural sequence-to-sequence (seq2seq) models: training-time attacks that cause models to "spin" their outputs so as to support an adversary-chosen sentiment or point of view -- but only when the input contains adversary-chosen trigger words. For example, a spinned summarization model outputs positive summaries of any text that mentions the name of some individual or organization. Model spinning introduces a "meta-backdoor" into a model. Whereas conventional backdoors cause models to produce incorrect outputs on inputs with the trigger, outputs of spinned models preserve context and maintain standard accuracy metrics, yet also satisfy a meta-task chosen by the adversary. Model spinning enables propaganda-as-a-service, where propaganda is defined as biased speech. An adversary can create customized language models that produce desired spins for chosen triggers, then deploy these models to generate disinformation (a platform attack), or else inject them into ML training pipelines (a supply-chain attack), transferring malicious functionality to downstream models trained by victims. To demonstrate the feasibility of model spinning, we develop a new backdooring technique. It stacks an adversarial meta-task onto a seq2seq model, backpropagates the desired meta-task output to points in the word-embedding space we call "pseudo-words," and uses pseudo-words to shift the entire output distribution of the seq2seq model. We evaluate this attack on language generation, summarization, and translation models with different triggers and meta-tasks such as sentiment, toxicity, and entailment. Spinned models largely maintain their accuracy metrics (ROUGE and BLEU) while shifting their outputs to satisfy the adversary's meta-task. We also show that, in the case of a supply-chain attack, the spin functionality transfers to downstream models.


Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning

Journal of Artificial Intelligence Research

Developing artificial learning systems that can understand and generate natural language has been one of the long-standing goals of artificial intelligence. Recent decades have witnessed impressive progress on both of these problems, giving rise to a new family of approaches. In particular, advances in deep learning over the past couple of years have led to neural approaches to natural language generation (NLG). These methods combine generative language learning techniques with neural-network-based frameworks. With a wide range of applications in natural language processing, neural NLG (NNLG) is a new and fast-growing field of research. In this state-of-the-art report, we investigate the recent developments and applications of NNLG in its full extent from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability and learning strategies. We summarize the fundamental building blocks of NNLG approaches from these aspects and provide detailed reviews of commonly used preprocessing steps and basic neural architectures. This report also focuses on the seminal applications of these NNLG models such as machine translation, description generation, automatic speech recognition, abstractive summarization, text simplification, question answering and generation, and dialogue generation. Finally, we conclude with a thorough discussion of the described frameworks by pointing out some open research directions.


Why a Cognitive AI Engine Is the Next Step in Accessibility and Inclusion

#artificialintelligence

To foster the next level of accessibility and inclusion, it's time to start investing our efforts into developing more sophisticated cognitive AI machines. Developing more sophisticated forms of cognitive AI is the key to expanding global accessibility and broadening the scope of inclusion. In fact, we already see unprecedented language coverage. Flint Capital notes that recent research shows the number of machine translation language pairs has soared from 16,000 to about 100,000 in a single year. On top of this, Flint Capital also notes that the global cognitive computing market is projected to surge to $72.26 billion by 2027. We already see huge gains with the rapid development of new AI tech that pushes the existing limits of voice synthesis and speech recognition.