AITopics

2007.04205

Country:

Europe > Austria > Vienna (0.14)
Europe > Finland > Pirkanmaa > Tampere (0.04)

Genre: Research Report > New Finding (0.94)

Industry: Law > Litigation (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

#artificialintelligence Jul-2-2020, 20:30:29 GMT

IBM Research at ACL 2020

The 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), the premier annual conference on AI and language, takes place July 5-10. As is the case with most events currently, ACL will be virtual this year due to COVID-19. At IBM Research AI, we’re excited to share with you, wherever you might be in the world, all the work we’ll have at ACL 2020 designed to advance AI for the enterprise. The ability of AI to master language has been one of IBM Research AI’s key areas of focus for years. The field of Natural Language Processing (NLP) is constantly evolving in an effort to better equip AI to communicate the way we humans do. It’s an incredibly challenging area of research. An AI must identify, decipher, and navigate natural language barriers such as slang, idioms, acronyms, and different languages, and must extract meaning from multi-format documents, to name a few. To tackle these challenges, IBM released earlier this year a new, four-part mastering language taxonomy…

artificial intelligence, machine translation, natural language, (14 more...)

Industry: Information Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.32)

#artificialintelligence Jun-30-2020, 15:45:56 GMT

An AI Researcher's Exploration of 200 Machine Learning Tools

To better understand the landscape of available tools for machine learning production, I decided to look up every AI/ML tool I could find. After filtering out application companies (e.g. companies that use ML to provide business analytics), tools that aren't being actively developed, and tools that nobody uses, I got 202 tools. Please let me know if there are tools you think I should include but that aren't on the list yet! I categorize the tools based on which step of the workflow each supports. I don't include Project setup since it requires project management tools, not ML tools.

artificial intelligence, machine learning, natural language, (18 more...)

Genre: Workflow (0.89)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.47)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.47)

#artificialintelligence Jun-30-2020, 08:45:40 GMT

Java To Python And Back, AI That Translates Programming Languages

The Commonwealth Bank of Australia spent around $750 million and 5 years of work to convert its platform from COBOL to Java. Migrating an existing codebase to a modern or more efficient language like Java or C requires expertise in both the source and target languages, and is often costly. Usually, a transcompiler is deployed that converts source code from one high-level programming language (such as C or Python) to another. Transcompilers are primarily used for interoperability, and to port codebases written in an obsolete or deprecated language (e.g. …). They typically rely on handcrafted rewrite rules applied to the abstract syntax tree of the source code.
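To make the idea of a handcrafted rewrite rule on an abstract syntax tree concrete, here is a minimal sketch using Python's standard `ast` module. It is a same-language toy, not a real transcompiler: the rule (`name ** 2` becomes `name * name`) and the names `SquareToProduct` and `apply_rule` are hypothetical and chosen only for illustration.

```python
import ast


class SquareToProduct(ast.NodeTransformer):
    """Hypothetical handcrafted rule: rewrite `name ** 2` as `name * name`."""

    def visit_BinOp(self, node):
        # Rewrite children first so nested expressions are handled bottom-up.
        self.generic_visit(node)
        is_square = (
            isinstance(node.op, ast.Pow)
            and isinstance(node.left, ast.Name)
            and isinstance(node.right, ast.Constant)
            and type(node.right.value) is int
            and node.right.value == 2
        )
        if not is_square:
            return node
        product = ast.BinOp(
            left=ast.Name(id=node.left.id, ctx=ast.Load()),
            op=ast.Mult(),
            right=ast.Name(id=node.left.id, ctx=ast.Load()),
        )
        return ast.copy_location(product, node)


def apply_rule(source: str) -> str:
    """Parse source into an AST, apply the rule, and emit source again."""
    tree = SquareToProduct().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # ast.unparse requires Python 3.9+


print(apply_rule("area = side ** 2 + 1"))
```

A rule-based translator between two different languages would need many such rules plus a printer for the target language, which is part of why the approach is costly to build by hand.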

natural language, programming language, python, (11 more...)

Country: Oceania > Australia (0.26)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.78)

arXiv.org Artificial Intelligence Jun-30-2020

Correction of Faulty Background Knowledge based on Condition Aware and Revise Transformer for Question Answering

Zhao, Xinyan, Feng, Xiao, Zhong, Haoming, Yao, Jun, Chen, Huanhuan

The study of question answering has received increasing attention in recent years. This work focuses on providing an answer that is compatible with both user intent and conditioning information corresponding to the question, such as delivery status and stock information in e-commerce. However, these conditions may be wrong or incomplete in real-world applications. Although existing question answering systems have considered external information, such as categorical attributes and triples in a knowledge base, they all assume that the external information is correct and complete. To alleviate the effect of defective condition values, this paper proposes the condition aware and revise Transformer (CAR-Transformer). CAR-Transformer (1) revises each condition value based on the whole conversation and the original condition values, and (2) encodes the revised conditions and uses the condition embedding to select an answer. Experimental results on a real-world customer service dataset demonstrate that the CAR-Transformer can still select an appropriate reply when the conditions corresponding to the question contain wrong or missing values, and that it substantially outperforms baseline models on automatic and human evaluations. The proposed CAR-Transformer can be extended to other NLP tasks that need to consider conditioning information.

condition value, machine learning, question answering, (21 more...)

arXiv.org Artificial Intelligence

2006.16722

Country:

Asia > China > Anhui Province > Hefei (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Tennessee (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

arXiv.org Machine Learning Jun-30-2020

Adversarial Mutual Information for Text Generation

Pan, Boyuan, Yang, Yazheng, Liang, Kaizhao, Kailkhura, Bhavya, Jin, Zhongming, Hua, Xian-Sheng, Cai, Deng, Li, Bo

Recent advances in maximizing mutual information (MI) between the source and target have demonstrated its effectiveness in text generation. However, previous works paid little attention to modeling the backward network of MI (i.e., dependency from the target to the source), which is crucial to the tightness of the variational information maximization lower bound. In this paper, we propose Adversarial Mutual Information (AMI): a text generation framework formulated as a novel saddle point (min-max) optimization that aims to identify joint interactions between the source and target. Within this framework, the forward and backward networks are able to iteratively promote or demote each other's generated instances by comparing the real and synthetic data distributions. We also develop a latent noise sampling strategy that leverages random variations in the high-level semantic space to enhance long-term dependency in the generation process. Extensive experiments on different text generation tasks demonstrate that the proposed AMI framework can significantly outperform several strong baselines, and we also show that AMI has the potential to lead to a tighter lower bound of maximum mutual information for the variational information maximization problem.
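For context, the variational information maximization lower bound referred to above is commonly written in the Barber and Agakov form shown below. The notation is an assumption made here, not taken from the abstract: S is the source, T the target, and q_φ(s | t) is a backward model approximating the true posterior p(s | t).

```latex
I(S;T) \;=\; H(S) - H(S \mid T)
\;\ge\; H(S) + \mathbb{E}_{p(s,t)}\!\left[\log q_\phi(s \mid t)\right],
\qquad
\text{gap} \;=\; \mathbb{E}_{p(t)}\!\left[\mathrm{KL}\!\left(p(s \mid t)\,\|\,q_\phi(s \mid t)\right)\right] \;\ge\; 0 .
```

The gap term shows why the backward model matters: the bound is tight exactly when q_φ(s | t) matches p(s | t).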

artificial intelligence, machine learning, natural language, (16 more...)

2007.00067

Country:

Europe > Austria > Vienna (0.14)
Asia > China (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.71)

arXiv.org Machine Learning Jun-30-2020

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding

Lepikhin, Dmitry, Lee, HyoukJoong, Xu, Yuanzhong, Chen, Dehao, Firat, Orhan, Huang, Yanping, Krikun, Maxim, Shazeer, Noam, Chen, Zhifeng

Neural network scaling has been critical for improving the model quality in many real-world machine learning applications with vast amounts of training data and compute. Although this trend of scaling is affirmed to be a sure-fire approach for better model quality, there are challenges on the path such as the computation cost, ease of programming, and efficient implementation on parallel devices. GShard is a module composed of a set of lightweight annotation APIs and an extension to the XLA compiler. It provides an elegant way to express a wide range of parallel computation patterns with minimal changes to the existing model code. GShard enabled us to scale up a multilingual neural machine translation Transformer model with Sparsely-Gated Mixture-of-Experts beyond 600 billion parameters using automatic sharding. We demonstrate that such a giant model can efficiently be trained on 2048 TPU v3 accelerators in 4 days to achieve far superior quality for translation from 100 languages to English compared to the prior art.
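The conditional computation behind a sparsely-gated mixture-of-experts layer can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not GShard's API: the function names, the scalar "experts", and the gate logits passed in directly (rather than computed from the token by a learned gating network) are all hypothetical, and expert capacity limits, load balancing, and sharding across devices are omitted.

```python
import math


def top_k_gate(logits, k=2):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    # Softmax over all experts (shifted by the max for numerical stability).
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the k most probable experts and renormalize their weights.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in chosen)
    return {i: probs[i] / mass for i in chosen}


def moe_layer(token, gate_logits, experts, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    weights = top_k_gate(gate_logits, k)
    return sum(w * experts[i](token) for i, w in weights.items())


# Hypothetical toy experts: each one is just a scalar function.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
gates = top_k_gate([0.1, 2.0, -1.0, 2.0])
print(sorted(gates))  # indices of the experts that will run for this token
print(moe_layer(3.0, [0.1, 2.0, -1.0, 2.0], experts))
```

Because only k experts run per token, the parameter count can grow with the number of experts while the compute per token stays roughly constant.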

artificial intelligence, machine learning, natural language, (19 more...)

2006.16668

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sun, Zhiqing, Yang, Yiming

An EM Approach to Non-autoregressive Conditional Sequence Generation

arXiv.org Machine Learning Jun-29-2020

Autoregressive (AR) models have been the dominant approach to conditional sequence generation, but they suffer from high inference latency. Non-autoregressive (NAR) models have recently been proposed to reduce the latency by generating all output tokens in parallel, but they achieve only inferior accuracy compared to their autoregressive counterparts, primarily due to a difficulty in dealing with the multi-modality in sequence generation. This paper proposes a new approach that jointly optimizes both AR and NAR models in a unified Expectation-Maximization (EM) framework. In the E-step, an AR model learns to approximate the regularized posterior of the NAR model. In the M-step, the NAR model is updated on the new posterior and selects the training examples for the next AR model. This iterative process can effectively guide the system to remove the multi-modality in the output sequences. To our knowledge, this is the first EM approach to NAR sequence generation. We evaluate our method on the task of machine translation. Experimental results on benchmark data sets show that the proposed approach achieves competitive, if not better, performance with existing NAR models and significantly reduces the inference latency.

arxiv preprint arxiv, nar model, translation, (10 more...)

2006.16378

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > Austria > Vienna (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.05)
Asia > Thailand > Krabi > Krabi (0.05)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.86)

Cordonnier, Jean-Baptiste, Loukas, Andreas, Jaggi, Martin

Multi-Head Attention: Collaborate Instead of Concatenate

arXiv.org Machine Learning Jun-29-2020

Attention layers are widely used in natural language processing (NLP) and are beginning to influence computer vision architectures. However, they suffer from over-parameterization. For instance, it was shown that the majority of attention heads could be pruned without impacting accuracy. This work aims to enhance current understanding of how multiple heads interact. Motivated by the observation that trained attention heads share common key/query projections, we propose a collaborative multi-head attention layer that enables heads to learn shared projections. Our scheme reduces the computational cost and number of parameters in an attention layer and can be used as a drop-in replacement in any transformer architecture. For instance, by allowing heads to collaborate on a neural machine translation task, we can reduce the key dimension by a factor of eight without any loss in performance. We also show that it is possible to re-parametrize a pre-trained multi-head attention layer into our collaborative attention layer. Even without retraining, collaborative multi-head attention manages to reduce the size of the key and query projections by half without sacrificing accuracy.
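One way to picture shared key/query projections is to let every head use the same two projection matrices and give each head a mixing vector that reweights the shared dimensions. The sketch below is an assumption-laden toy in plain Python (the function names, the tiny matrices, and the omission of softmax and scaling are all choices made here, not details from the abstract). It checks that ordinary concatenated heads are the special case in which each mixing vector selects only that head's own block of columns.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def hstack(mats):
    """Concatenate matrices with equal row counts along the column axis."""
    return [[v for m in mats for v in m[r]] for r in range(len(mats[0]))]


def standard_scores(x, w_q, w_k):
    """Unnormalized attention scores for one head with private projections."""
    return matmul(matmul(x, w_q), transpose(matmul(x, w_k)))


def collaborative_scores(x, shared_q, shared_k, mixing):
    """Scores for one head that reweights shared projections by `mixing`."""
    queries = [[v * m for v, m in zip(row, mixing)] for row in matmul(x, shared_q)]
    return matmul(queries, transpose(matmul(x, shared_k)))


# Toy sizes (all values hypothetical): 2 tokens, input dim 2, 2 heads, key dim 1.
x = [[1.0, 2.0], [3.0, -1.0]]
head_q = [[[1.0], [0.0]], [[0.5], [2.0]]]
head_k = [[[2.0], [1.0]], [[-1.0], [1.0]]]

# Concatenation is the special case where each mixing vector selects one block.
shared_q, shared_k = hstack(head_q), hstack(head_k)
mixing = [[1.0, 0.0], [0.0, 1.0]]

for h in range(2):
    same = standard_scores(x, head_q[h], head_k[h]) == collaborative_scores(
        x, shared_q, shared_k, mixing[h])
    print(f"head {h}: collaborative scores match standard scores -> {same}")
```

Letting the mixing vectors be dense instead of block-selective is what allows the shared dimension to be smaller than the number of heads times the per-head key dimension, which is where the parameter savings would come from.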

artificial intelligence, machine learning, natural language, (18 more...)

2006.16362

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

#artificialintelligence Jun-28-2020, 11:45:05 GMT

In English, Machine Translation Makes You Sound Like a Man in His Middle Age

24/06/2020. Three Bocconi scholars found an algorithmic bias in the systems of Google, Bing, and DeepL when translating from several European languages into English. Imagine a child raised in a village inhabited only by middle-aged men. For the first ten years of her life, she only hears males in their 60s talking of work, books, sports, health, and money. What kind of weird language do you think she will speak when she leaves the village? Something similar happens to the most common machine translation systems, according to a new study by Dirk Hovy, an Associate Professor of Computer Science at Bocconi, and two Postdoctoral Researchers in his lab, Federico Bianchi and Tommaso Fornaciari. To train a translation system based on machine learning, you feed it large amounts of text and let it learn by experience.

artificial intelligence, natural language, translation, (14 more...)

Genre: Research Report > New Finding (0.37)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)