Machine Translation
Customizing your machine translation using Amazon Translate Active Custom Translation
When translating the English phrase "How are you?" to Spanish, would you prefer to use "¿Cómo estás?" or "¿Cómo está usted?" instead? Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation. Today, we're excited to introduce Active Custom Translation (ACT), a feature that gives you more control over your machine translation output. You can now influence which machine translation output you get, choosing between "¿Cómo estás?" and "¿Cómo está usted?". To make ACT work, simply provide your translation examples in TMX, TSV, or CSV format to create parallel data (PD), and Amazon Translate uses your PD along with your batch translation job to customize the translation output at runtime.
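As a rough illustration of what such translation examples might look like in CSV form, the sketch below builds a small parallel data file in memory. The helper name `build_parallel_data_csv` is hypothetical, and the layout (a header row of language codes followed by one example pair per row) is an assumption that should be checked against the Amazon Translate documentation before use.

```python
import csv
import io


def build_parallel_data_csv(source_lang, target_lang, examples):
    """Return CSV text with a language-code header and one example pair per row.

    The header-row layout is an assumption about the expected format,
    not something confirmed by the announcement itself.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([source_lang, target_lang])  # assumed header: language codes
    for source_text, target_text in examples:
        writer.writerow([source_text, target_text])
    return buffer.getvalue()


# Example pairs that steer the output toward the formal register.
examples = [
    ("How are you?", "¿Cómo está usted?"),
    ("Thank you", "Gracias"),
]
csv_text = build_parallel_data_csv("en", "es", examples)
print(csv_text)
```

The resulting text could then be saved to a file and supplied as the parallel data resource for a batch translation job.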
Facilitating the Communication of Politeness through Fine-Grained Paraphrasing
Fu, Liye, Fussell, Susan R., Danescu-Niculescu-Mizil, Cristian
Aided by technology, people are increasingly able to communicate across geographical, cultural, and language barriers. This ability also results in new challenges, as interlocutors need to adapt their communication approaches to increasingly diverse circumstances. In this work, we take the first steps towards automatically assisting people in adjusting their language to a specific communication circumstance. As a case study, we focus on facilitating the accurate transmission of pragmatic intentions and introduce a methodology for suggesting paraphrases that achieve the intended level of politeness under a given communication circumstance. We demonstrate the feasibility of this approach by evaluating our method in two realistic communication scenarios and show that it can reduce the potential for misalignment between the speaker's intentions and the listener's perceptions in both cases.
Gnani.ai launches its new speech recognition technology for Indian defense - TechGraph
"These end-to-end voice translation system uses Automatic Speech Recognition (ASR), Machine Translation and Speech-to-Text to convert Mandarin to English and is designed to help armed forces, intelligence agencies and local law enforcement authorities in improving communication systems and giving substantial leeway to the Indian defense forces," the company in its statement said. The solution has a wide range of applications that includes cross border intelligence, voice surveillance, monitoring telephone/internet conversations, intercepting Radio/Satellite communication, and to bridge interactions during border meetings & joint exercises. Its unique features include noise reduction, dialect/accent detection, and support for all audio file formats. Speaking on the launch, Ananth Nagaraj, Co-founder & CTO, Gnani.ai said, "AI-based Speech Recognition technology is a necessity and is quickly making its way in becoming part of modern warfare. We believe AI has the potential to transform and improve the communication systems and will help strengthen Indian Armed forces." "Understanding linguistic nuances such as phoneme and dialects is a challenge especially when it comes to Mandarin.
Unsupervised Word Translation Pairing using Refinement based Point Set Registration
Oprea, Silviu, Dutta, Sourav, Assem, Haytham
Cross-lingual alignment of word embeddings plays an important role in knowledge transfer across languages, for improving machine translation and other multi-lingual applications. Current unsupervised approaches rely on similarities in the geometric structure of word embedding spaces across languages to learn structure-preserving linear transformations using adversarial networks and refinement strategies. However, such techniques, in practice, tend to suffer from instability and convergence issues, requiring tedious fine-tuning for precise parameter setting. This paper proposes BioSpere, a novel framework for unsupervised mapping of bi-lingual word embeddings onto a shared vector space, by combining adversarial initialization and a refinement procedure with a point set registration algorithm used in image processing. We show that our framework alleviates the shortcomings of existing methodologies, and is relatively invariant to variable adversarial learning performance, demonstrating robustness in terms of parameter choices and training losses. Experimental evaluation on the parallel dictionary induction task demonstrates state-of-the-art results for our framework on diverse language pairs.
More accurate than Google Translate? Meet the Slovenian AI startup offering quality language translations, coming to UK soon - UKTN (UK Tech News)
Speaking to UKTN, Marko Hozjan, co-founder and CEO of TAIA, explains, "TAIA helps businesses translate their content more efficiently by providing professional translators with AI assistance. Files are automatically analysed and a price quote with delivery times is available in under a minute. Users can select between a range of services and delivery times to order a translation service that best fits their needs and budget. Once the project is ordered, it's automatically translated using Neural Machine Translation and prefilled with existing translations from customers' unique Translation memory. This way your projects get translated faster and more consistently with every order. Users can monitor the progress of their project in the convenient web application and easily manage all their translation needs in one place, keeping their data secure and their costs optimised."
AGenT Zero: Zero-shot Automatic Multiple-Choice Question Generation for Skill Assessments
Li, Eric, Su, Jingyi, Sheng, Hao, Wai, Lawrence
Multiple-choice questions (MCQs) offer the most promising avenue for skill evaluation in the era of virtual education and job recruiting, where traditional performance-based alternatives such as projects and essays have become less viable, and grading resources are constrained. The automated generation of MCQs would allow assessment creation at scale. Recent advances in natural language processing have given rise to many complex question generation methods. However, the few methods that produce deployable results in specific domains require a large amount of domain-specific training data that can be very costly to acquire. Our work provides an initial foray into MCQ generation under high data-acquisition cost scenarios by strategically emphasizing paraphrasing the question context (compared to the task). In addition to maintaining semantic similarity between the question-answer pairs, our pipeline, which we call AGenT Zero, consists of only pre-trained models and requires no fine-tuning, minimizing data acquisition costs for question generation. AGenT Zero successfully outperforms other pre-trained methods in fluency and semantic similarity. Additionally, with some small changes, our assessment pipeline can be generalized to a broader question and answer space, including short answer or fill in the blank questions.
Model Compression via Pruning
To obtain fast and accurate inference on edge devices, a model has to be optimized for real-time inference. Fine-tuned state-of-the-art models like VGG16/19 and ResNet50 have about 138 million and 23 million parameters respectively, and inference is often expensive on resource-constrained devices. Previously I've talked about one model compression technique called "Knowledge Distillation", which uses a smaller student network to mimic the performance of a larger teacher network (the student and teacher networks have different architectures). Today, the focus will be on "Pruning", a model compression technique that allows us to compress the model to a smaller size with zero or marginal loss of accuracy. In short, pruning eliminates the weights with low magnitude (those that do not contribute much to the final model performance).
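The idea of eliminating low-magnitude weights can be sketched in a few lines of plain Python. This is a minimal illustration, not the article's own implementation: the function name `magnitude_prune` and the `sparsity` parameter are assumptions introduced here, and a real model would apply the same rule to weight tensors rather than a flat list.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_indices = set(order[:n_prune])
    return [0.0 if i in pruned_indices else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

With `sparsity=0.5`, the three smallest-magnitude weights are set to zero while the larger ones are left unchanged. In practice a pruned model is usually fine-tuned afterwards to recover any lost accuracy.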
Deep Dive in Datasets for Machine translation in NLP Using TensorFlow and PyTorch
With the advancement of machine translation, there has been a recent shift towards large-scale empirical techniques that have led to very large improvements in translation quality. Machine translation is the task of automatically converting one natural language into another while preserving the meaning of the input text. Ongoing research on image description presents a considerable challenge at the intersection of natural language processing and computer vision. To address this, multimodal machine translation incorporates information from other modalities, mostly static images, to improve translation quality. Here, we will cover some of the most popular datasets used in machine translation.
Artificial Intelligence's Role in the Field of Intellectual Property
Artificial intelligence (AI) has become a digital frontier that will have a profound impact on the world. It will have immense technological, economic, and social consequences and will transform the way humans work, live, and produce and distribute goods and services. Although it is too early to say exactly how, it is clear that AI will affect traditional intellectual property (IP) concepts. Commercial AI-generated music and AI-created inventions are not far off, and AI is expected to redefine the concepts of 'composer', 'author', and 'inventor'. But how that will happen is not clear yet.