Goto

Collaborating Authors

 Sumita, Eiichiro


SelfSeg: A Self-supervised Sub-word Segmentation Method for Neural Machine Translation

arXiv.org Artificial Intelligence

Sub-word segmentation is an essential pre-processing step for Neural Machine Translation (NMT). Existing work has shown that neural sub-word segmenters are better than Byte-Pair Encoding (BPE); however, they are inefficient, as they require parallel corpora, days to train, and hours to decode. This paper introduces SelfSeg, a self-supervised neural sub-word segmentation method that is much faster to train and decode and requires only monolingual dictionaries instead of parallel corpora. SelfSeg takes as input a word in the form of a partially masked character sequence, optimizes the word generation probability, and generates the segmentation with the maximum posterior probability, which is calculated using a dynamic programming algorithm. The training time of SelfSeg depends on word frequencies, and we explore several word frequency normalization strategies to accelerate the training phase. Additionally, we propose a regularization mechanism that allows the segmenter to generate various segmentations for one word. To show the effectiveness of our approach, we conduct MT experiments in low-, middle- and high-resource scenarios, comparing the performance of different segmentation methods. The experimental results demonstrate that on the low-resource ALT dataset, our method achieves more than a 1.2 BLEU score improvement over BPE and SentencePiece, and a 1.1 BLEU score improvement over Dynamic Programming Encoding (DPE) and Vocabulary Learning via Optimal Transport (VOLT) on average. The regularization method achieves approximately a 4.3 BLEU score improvement over BPE and a 1.2 BLEU score improvement over BPE-dropout, the regularized version of BPE. We also observe significant improvements on the IWSLT15 Vi->En, WMT16 Ro->En and WMT15 Fi->En datasets, and competitive results on the WMT14 De->En and WMT14 Fr->En datasets.
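
The decoding step described above, picking the segmentation with the maximum posterior probability via dynamic programming, can be illustrated with a short Viterbi-style sketch. The scoring function `subword_logprob` is a placeholder for the probabilities a trained SelfSeg model would supply; the function name, the toy lexicon, and the maximum piece length are assumptions for illustration, not part of the paper.

```python
import math

def best_segmentation(word, subword_logprob, max_len=8):
    """Viterbi-style dynamic program: split `word` into subwords so that
    the sum of per-subword log-probabilities is maximal.
    `subword_logprob(piece)` stands in for the scores a trained
    segmenter (e.g. a masked word-generation model) would provide."""
    n = len(word)
    best = [-math.inf] * (n + 1)   # best[i] = best score for word[:i]
    back = [0] * (n + 1)           # back[i] = start index of the last subword
    best[0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            score = best[j] + subword_logprob(word[j:i])
            if score > best[i]:
                best[i], back[i] = score, j
    # Recover the argmax segmentation by walking the backpointers.
    pieces, i = [], n
    while i > 0:
        pieces.append(word[back[i]:i])
        i = back[i]
    return list(reversed(pieces)), best[n]

# Toy scorer that prefers longer known pieces; real scores come from the segmenter.
lexicon = {"self": -1.0, "seg": -1.2, "s": -3.0, "e": -3.0, "l": -3.0, "f": -3.0, "g": -3.0}
print(best_segmentation("selfseg", lambda p: lexicon.get(p, -10.0)))
# (['self', 'seg'], -2.2)
```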


Universal Multimodal Representation for Language Understanding

arXiv.org Artificial Intelligence

Representation learning is the foundation of natural language processing (NLP). This work presents new methods that employ visual information as auxiliary signals for general NLP tasks. For each sentence, we first retrieve a flexible number of images, either from a light topic-image lookup table extracted over existing sentence-image pairs or from a shared cross-modal embedding space pre-trained on off-the-shelf text-image pairs. Then, the text and images are encoded by a Transformer encoder and a convolutional neural network, respectively. The two sequences of representations are further fused by an attention layer so that the two modalities can interact. In this study, the retrieval process is controllable and flexible. The universal visual representation overcomes the lack of large-scale bilingual sentence-image pairs, so our method can be easily applied to text-only tasks without manually annotated multimodal parallel corpora. We apply the proposed method to a wide range of natural language generation and understanding tasks, including neural machine translation, natural language inference, and semantic similarity. Experimental results show that our method is generally effective for different tasks and languages. Analysis indicates that the visual signals enrich the textual representations of content words, provide fine-grained grounding information about the relationship between concepts and events, and potentially contribute to disambiguation.
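
As a rough illustration of the fusion step, the sketch below lets text token states attend over retrieved image features with a single cross-attention layer followed by a residual connection. The class name, dimensions, and layer choices are assumptions; the paper's actual fusion layer may be configured differently.

```python
import torch
import torch.nn as nn

class VisualFusion(nn.Module):
    """Minimal sketch of attention-based fusion: text token states attend over
    a sequence of retrieved image features. Sizes are illustrative only."""
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, text_states, image_feats):
        # text_states: (batch, n_tokens, d_model)  from a Transformer encoder
        # image_feats: (batch, n_images, d_model)  from a CNN, projected to d_model
        fused, _ = self.attn(query=text_states, key=image_feats, value=image_feats)
        return self.norm(text_states + fused)   # residual fusion

text = torch.randn(2, 10, 512)   # toy text encodings
imgs = torch.randn(2, 5, 512)    # toy features for 5 retrieved images
print(VisualFusion()(text, imgs).shape)   # torch.Size([2, 10, 512])
```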


Language Model Pre-training on True Negatives

arXiv.org Artificial Intelligence

Discriminative pre-trained language models (PLMs) learn to predict original texts from intentionally corrupted ones. Taking the former as positive and the latter as negative samples, such a PLM can be trained effectively for contextualized representation. However, training these PLMs relies heavily on the quality of the automatically constructed samples. Existing PLMs simply treat all corrupted texts as equally negative without any examination, so the resulting models inevitably suffer from a false-negative issue: training is carried out on pseudo-negative data, which reduces both efficiency and robustness. In this work, after defining the long-ignored false-negative issue in discriminative PLMs, we design enhanced pre-training methods that counteract false-negative predictions and encourage pre-training on true negatives by correcting the harmful gradient updates caused by false-negative predictions. Experimental results on the GLUE and SQuAD benchmarks show that our counter-false-negative pre-training methods indeed bring better performance together with stronger robustness.
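
One way to picture the idea of pre-training on true negatives is to drop or down-weight the loss at positions that are likely false negatives in an ELECTRA-style replaced-token-detection objective. The sketch below masks out replaced positions whose sampled token the generator itself finds highly plausible; the threshold rule, function names, and tensors are illustrative assumptions, not the paper's exact gradient-correction method.

```python
import torch
import torch.nn.functional as F

def rtd_loss_with_fn_filter(disc_logits, gen_probs_of_sample, corrupted_ids,
                            original_ids, plaus_threshold=0.3):
    """Hedged sketch of counteracting false negatives in a replaced-token-
    detection (RTD) loss: a sampled replacement that the generator finds
    highly plausible in context may be acceptable rather than truly negative,
    so its "replaced" loss term is dropped."""
    labels = (corrupted_ids != original_ids).float()           # 1 = replaced
    per_tok = F.binary_cross_entropy_with_logits(disc_logits, labels,
                                                 reduction="none")
    # Suspected false negatives: replaced positions whose replacement the
    # generator considers very likely (i.e. plausibly correct in context).
    suspected_fn = (labels == 1) & (gen_probs_of_sample > plaus_threshold)
    keep = (~suspected_fn).float()
    return (per_tok * keep).sum() / keep.sum().clamp(min=1.0)

# Toy call with random tensors, just to show the shapes involved.
logits = torch.randn(2, 6)
gen_p  = torch.rand(2, 6)
corr   = torch.randint(0, 100, (2, 6))
orig   = corr.clone(); orig[:, ::2] += 1      # half the positions "replaced"
print(rtd_loss_with_fn_filter(logits, gen_p, corr, orig))
```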


Extending the Subwording Model of Multilingual Pretrained Models for New Languages

arXiv.org Artificial Intelligence

Multilingual pretrained models are effective for machine translation and cross-lingual processing because they contain multiple languages in one model. However, their tokenizers are fixed before pretraining, so it is difficult to change the vocabulary afterward. When we extend pretrained models to new languages, we must modify their tokenizers at the same time. In this paper, we add new subwords to the SentencePiece tokenizer in order to apply a multilingual pretrained model to new languages (Inuktitut in this paper). In our experiments, we segmented Inuktitut sentences into subwords without changing the segmentation of the already pretrained languages, and applied the mBART-50 pretrained model to English-Inuktitut translation.
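
A minimal sketch of this workflow, assuming the sentencepiece and HuggingFace transformers APIs: train a small SentencePiece model on monolingual Inuktitut text, register its pieces as additional tokens of the mBART-50 tokenizer, and resize the model's embeddings. The file names and vocabulary size are placeholders, and the paper's actual procedure for merging subwords may differ.

```python
import sentencepiece as spm
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

# Train a small SentencePiece model on an assumed monolingual Inuktitut corpus.
spm.SentencePieceTrainer.train(
    input="inuktitut_mono.txt",      # placeholder file name
    model_prefix="iu_spm",
    vocab_size=8000,                 # illustrative size
)
iu_sp = spm.SentencePieceProcessor(model_file="iu_spm.model")
new_pieces = [iu_sp.id_to_piece(i) for i in range(iu_sp.get_piece_size())]

# Add only the pieces that the pretrained tokenizer does not already cover,
# so segmentation of the existing languages is left untouched.
tok = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50")
added = tok.add_tokens([p for p in new_pieces if p not in tok.get_vocab()])

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
model.resize_token_embeddings(len(tok))   # new rows are randomly initialized
print(f"added {added} Inuktitut subwords")
```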


Smoothing Dialogue States for Open Conversational Machine Reading

arXiv.org Artificial Intelligence

Conversational machine reading (CMR) requires machines to communicate with humans through multi-turn interactions between two salient dialogue states: decision making and question generation. In the open CMR setting, the more realistic scenario, the retrieved background knowledge can be noisy, which poses severe challenges for information transmission. Existing studies commonly train independent or pipeline systems for the two subtasks; however, these methods rely on hard-label decisions to activate question generation, which ultimately hinders model performance. In this work, we propose an effective gating strategy that smooths the two dialogue states in a single decoder and bridges decision making and question generation to provide a richer dialogue-state reference. Experiments on the OR-ShARC dataset show the effectiveness of our method, which achieves new state-of-the-art results.
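
The gating strategy can be pictured as a learned soft blend of the two dialogue states rather than a hard switch. The sketch below is a minimal illustration in PyTorch; the module name and dimensions are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class StateSmoothingGate(nn.Module):
    """Illustrative gate that blends the decision-making state with the
    question-generation state, so a single decoder sees a smoothed dialogue
    state instead of a hard-label switch."""
    def __init__(self, d_model=512):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, decision_state, question_state):
        # decision_state, question_state: (batch, d_model)
        g = torch.sigmoid(self.gate(torch.cat([decision_state, question_state], dim=-1)))
        return g * decision_state + (1.0 - g) * question_state

dec = torch.randn(4, 512)
qg = torch.randn(4, 512)
print(StateSmoothingGate()(dec, qg).shape)   # torch.Size([4, 512])
```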


YANMTT: Yet Another Neural Machine Translation Toolkit

arXiv.org Artificial Intelligence

In this paper we present our open-source neural machine translation (NMT) toolkit called "Yet Another Neural Machine Translation Toolkit", abbreviated as YANMTT, which is built on top of the Transformers library. Despite the growing importance of sequence-to-sequence pre-training, there are surprisingly few, if any, well-established toolkits that allow users to easily perform such pre-training. Toolkits such as Fairseq, which do allow pre-training, have very large codebases and are therefore not beginner-friendly. With regard to transfer learning via fine-tuning, most toolkits do not explicitly give the user control over which parts of the pre-trained models are transferred. YANMTT aims to address these issues with a minimal amount of code to pre-train large-scale NMT models, selectively transfer pre-trained parameters and fine-tune them, perform translation, and extract representations and attentions for visualization and analysis. Apart from these core features, our toolkit also provides other advanced functionalities, such as, but not limited to, document-level/multi-source NMT, simultaneous NMT, and model compression via distillation, which we believe are relevant to the purpose behind our toolkit.
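
Selective transfer of pre-trained parameters, one of the core features mentioned above, boils down to copying only a chosen subset of checkpoint tensors into the fine-tuning model. The helper below is a hedged sketch of that idea in plain PyTorch; it is not YANMTT's actual implementation or command-line interface, and the prefix filter is an assumed convention.

```python
import torch

def selectively_transfer(model, ckpt_path, include_prefixes=("encoder.",)):
    """Copy from a pre-trained checkpoint only those tensors whose names start
    with the given prefixes (and whose shapes match), leaving the rest of the
    fine-tuning model at its fresh initialization."""
    pretrained = torch.load(ckpt_path, map_location="cpu")
    own = model.state_dict()
    transferred = {
        name: tensor
        for name, tensor in pretrained.items()
        if name.startswith(include_prefixes)
        and name in own
        and own[name].shape == tensor.shape
    }
    own.update(transferred)
    model.load_state_dict(own)
    return sorted(transferred)      # names of parameters actually copied
```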


Syntax-Directed Attention for Neural Machine Translation

AAAI Conferences

Attention mechanisms, including global attention and local attention, play a key role in neural machine translation (NMT). Global attention attends to all source words for word prediction, whereas local attention selectively looks at source words within a fixed window. However, the alignment weights for the current target word typically decay with linear distance to the left and right of the aligned source position, neglecting syntactic distance constraints. In this paper, we extend local attention with a syntax-distance constraint, focusing on source words that are syntactically related to the predicted target word in order to learn a more effective context vector for translation. Moreover, we further propose a double-context NMT architecture, which combines the global context vector with the syntax-directed context vector to draw more translation-relevant information from the source representation. Experiments on large-scale Chinese-to-English and English-to-German translation tasks show that the proposed approach achieves substantial and significant improvements over the baseline system.
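
The syntax-distance constraint can be sketched as masking the attention scores so that only source words within a small syntactic distance of the aligned position are kept. The function below is a simplified illustration; the precomputed distance matrix, window size, and hard masking rule are assumptions rather than the paper's exact formulation.

```python
import torch

def syntax_directed_weights(scores, aligned_pos, syntax_dist, window=2):
    """Keep only source words whose *syntactic* distance from the aligned
    source position is within a window, then renormalize with softmax."""
    # scores: (batch, src_len) raw attention scores for one decoding step
    # aligned_pos: (batch,) predicted aligned source positions
    # syntax_dist: (src_len, src_len) matrix of tree distances
    dist = syntax_dist[aligned_pos]                  # (batch, src_len)
    mask = dist <= window
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1)

# Toy example with a 5-word source sentence and a made-up tree-distance matrix.
syntax_dist = torch.tensor([[0, 1, 2, 2, 3],
                            [1, 0, 1, 1, 2],
                            [2, 1, 0, 2, 3],
                            [2, 1, 2, 0, 1],
                            [3, 2, 3, 1, 0]])
scores = torch.randn(1, 5)
print(syntax_directed_weights(scores, torch.tensor([2]), syntax_dist))
```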


Deterministic Attention for Sequence-to-Sequence Constituent Parsing

AAAI Conferences

The sequence-to-sequence model has proven extremely successful in constituent parsing. It relies on one key technique, the probabilistic attention mechanism, to automatically select the context for prediction. Despite its successes, the probabilistic attention model does not always select the most important context. For example, the headword and boundary words of a subtree have been shown to be critical when predicting the constituent label of the subtree, but this contextual information becomes increasingly difficult to learn as the length of the sequence increases. In this study, we propose a deterministic attention mechanism that deterministically selects the important context and is not affected by the sequence length, and we implement two different instances of this framework. When combined with a novel bottom-up linearization method, our parser demonstrates better performance than that achieved by the sequence-to-sequence parser with a probabilistic attention mechanism.
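
A deterministic attention instance in the spirit of the abstract can be as simple as building the prediction context directly from the boundary words of a span, with no learned weights involved. The sketch below shows that idea; the exact composition used in the paper may differ.

```python
import torch

def boundary_context(enc_states, span_start, span_end):
    """Deterministic context: instead of soft attention weights, the context
    for predicting a constituent label is built directly from the first and
    last encoder states of the corresponding span."""
    # enc_states: (src_len, d_model); span indices are inclusive
    left = enc_states[span_start]
    right = enc_states[span_end]
    return torch.cat([left, right], dim=-1)   # (2 * d_model,)

enc = torch.randn(12, 256)                    # toy encoder outputs
print(boundary_context(enc, 3, 7).shape)      # torch.Size([512])
```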


Agreement on Target-Bidirectional LSTMs for Sequence-to-Sequence Learning

AAAI Conferences

Recurrent neural networks, particularly the long short-term memory networks, are extremely appealing for sequence-to-sequence learning tasks. Despite their great success, they typically suffer from a fundamental shortcoming: they are prone to generate unbalanced targets with good prefixes but bad suffixes, and thus performance suffers when dealing with long sequences. We propose a simple yet effective approach to overcome this shortcoming. Our approach relies on the agreement between a pair of target-directional LSTMs, which generates more balanced targets. In addition, we develop two efficient approximate search methods for agreement that are empirically shown to be almost optimal in terms of sequence-level losses. Extensive experiments were performed on two standard sequence-to-sequence transduction tasks: machine transliteration and grapheme-to-phoneme transformation. The results show that the proposed approach achieves consistent and substantial improvements, compared to six state-of-the-art systems. In particular, our approach outperforms the best reported error rates by a margin (up to 9% relative gains) on the grapheme-to-phoneme task.
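
The agreement criterion can be illustrated by rescoring a candidate list with both directional models and keeping the candidate they jointly prefer, i.e. the one with the best combined score. The snippet below shows only this criterion with toy scores; the paper's approximate joint search methods are more elaborate, and the scorers here are stand-ins for the two directional models.

```python
def agreement_rerank(candidates, score_l2r, score_r2l):
    """Select the candidate target sequence with the best combined score
    under a left-to-right and a right-to-left model (the agreement idea)."""
    return max(candidates, key=lambda y: score_l2r(y) + score_r2l(y))

# Toy transliteration-style candidates with made-up log-probabilities.
cands = ["tokio", "tokyo", "tokyou"]
l2r = {"tokio": -2.1, "tokyo": -1.8, "tokyou": -2.5}
r2l = {"tokio": -2.4, "tokyo": -1.6, "tokyou": -1.9}
print(agreement_rerank(cands, l2r.get, r2l.get))   # "tokyo"
```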


Speaking Louder than Words with Pictures Across Languages

AI Magazine

In this article, we investigate the possibility of cross-language communication using a synergy of words and pictures on mobile devices. Communicating with pictures alone is in itself a very powerful strategy, but it is limited in expressiveness. On the other hand, words can express everything you could wish to say, but they are cumbersome to work with on mobile devices and need to be translated for their meaning to be understood. Automatic translations can contain errors that pervert the communication process, and this may undermine users’ confidence when expressing themselves across language barriers. Our idea is to create a user interface for cross-language communication that uses pictures as the primary mode of input and words to express the detailed meaning. This interface creates a visual process of communication that occurs on two heterogeneous channels that can support each other. We implemented this user interface as an application on the Apple iPad tablet and performed a set of experiments to determine its usefulness as a translation aid for travellers.