AITopics

2402.11968

Country:

Europe > Luxembourg (0.14)
Europe > Switzerland (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(21 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.70)

Ansar, Wazib, Goswami, Saptarsi, Chakrabarti, Amlan

TexIm FAST: Text-to-Image Representation for Semantic Similarity Evaluation using Transformers

One of the principal objectives of Natural Language Processing (NLP) is to generate meaningful representations from text. Improving the informativeness of the representations has led to a tremendous rise in their dimensionality and memory footprint. This has a cascading effect, amplifying the complexity of the downstream model by increasing its parameters. The available techniques cannot be applied to cross-modal applications such as text-to-image. To ameliorate these issues, this paper proposes TexIm FAST, a novel Text-to-Image methodology for generating fixed-length representations through a self-supervised Variational Auto-Encoder (VAE) for semantic evaluation applying transformers. The pictorial representations allow oblivious inference while retaining the linguistic intricacies, and are potent in cross-modal applications. TexIm FAST deals with variable-length sequences and generates fixed-length representations with over 75% reduced memory footprint. It enhances the efficiency of models for downstream tasks by reducing their parameters. The efficacy of TexIm FAST has been extensively analyzed for the task of Semantic Textual Similarity (STS) on the MSRPC, CNN/Daily Mail, and XSum datasets. The results demonstrate a 6% improvement in accuracy compared to the baseline and showcase its exceptional ability to compare sequences of disparate lengths, such as a text with its summary.

representation, sequence, texim fast, (15 more...)

2406.04438

Country:

Asia > India > West Bengal > Kolkata (0.14)
North America > United States (0.14)
Asia > Singapore (0.04)
(3 more...)

Genre:

Research Report (0.70)
Overview (0.68)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation

Sperber, Matthias, Bojar, Ondřej, Haddow, Barry, Javorský, Dávid, Ma, Xutai, Negri, Matteo, Niehues, Jan, Polák, Peter, Salesky, Elizabeth, Sudoh, Katsuhito, Turchi, Marco

Human evaluation is a critical component in machine translation system development and has received much attention in text translation research. However, little prior work exists on the topic of human evaluation for speech translation, which adds challenges such as noisy data and segmentation mismatches. We take first steps to fill this gap by conducting a comprehensive human evaluation of the results of several shared tasks from the last International Workshop on Spoken Language Translation (IWSLT 2023). We propose an effective evaluation strategy based on automatic resegmentation and direct assessment with segment context. Our analysis revealed that: 1) the proposed evaluation strategy is robust and its scores are well correlated with other types of human judgements; 2) automatic metrics are usually, but not always, well correlated with direct assessment scores; and 3) COMET is a slightly stronger automatic metric than chrF, despite the segmentation noise introduced by the resegmentation step. We release the collected human-annotated data in order to encourage further investigation.

correlation, evaluation, translation, (14 more...)

2406.03881

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > Canada > Ontario > Toronto (0.04)
(7 more...)

Genre: Research Report > New Finding (0.30)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Singh, Anushka, Sai, Ananya B., Dabre, Raj, Puduppully, Ratish, Kunchukuttan, Anoop, Khapra, Mitesh M

How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages?

While machine translation evaluation has been studied primarily for high-resource languages, there has been recent interest in evaluation for low-resource languages due to the increasing availability of data and models. In this paper, we study a zero-shot evaluation setting for low-resource Indian languages, namely Assamese, Kannada, Maithili, and Punjabi. We collect sufficient Multi-Dimensional Quality Metrics (MQM) and Direct Assessment (DA) annotations to create test sets and meta-evaluate a plethora of automatic evaluation metrics. We observe that even for learned metrics, which are known to exhibit zero-shot performance, the Kendall Tau and Pearson correlations with human annotations are only as high as 0.32 and 0.45. Synthetic data approaches show mixed results and overall do not help close the gap by much for these languages. This indicates that there is still a long way to go for low-resource evaluation.
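The meta-evaluation mentioned in the abstract above compares automatic metric scores against human annotations using Kendall Tau and Pearson correlations. A minimal sketch of both statistics follows, assuming the tau-a variant (ties count as neither concordant nor discordant) and hypothetical toy scores; the paper's exact variant and data are not given here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs; tied pairs add nothing."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy scores for five segments, not data from the paper.
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
metric_scores = [1.0, 3.0, 2.0, 5.0, 4.0]

print(round(kendall_tau_a(human_scores, metric_scores), 2))
print(round(pearson(human_scores, metric_scores), 2))
```

Kendall Tau only looks at the ordering of pairs, while Pearson is sensitive to the actual score values, which is one reason the two can disagree for the same metric.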

computational linguistic, evaluation, synthetic data, (15 more...)

2406.03893

Country:

Asia > India (0.15)
Asia > Singapore (0.04)
North America > United States > Pennsylvania (0.04)
(9 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.90)

Wicks, Rachel, Post, Matt, Koehn, Philipp

Recovering document annotations for sentence-level bitext

Data availability limits the scope of any given task. In machine translation, historical models were incapable of handling longer contexts, so the lack of document-level datasets was less noticeable. Now, despite the emergence of long-sequence methods, we remain within a sentence-level paradigm and without data to adequately approach context-aware machine translation. Most large-scale datasets have been processed through a pipeline that discards document-level metadata. In this work, we reconstruct document-level information for three large datasets (ParaCrawl, News Commentary, and Europarl) in German, French, Spanish, Italian, Polish, and Portuguese (paired with English). We then introduce a document-level filtering technique as an alternative to traditional bitext filtering. We present this filtering with analysis to show that this method prefers context-consistent translations rather than those that may have been sentence-level machine translated. Lastly, we train models on these longer contexts and demonstrate improvement in document-level translation without degradation of sentence-level translation. We release our dataset, ParaDocs, and resulting models as a resource to the community.

computational linguistic, machine translation, translation, (13 more...)

2406.03869

Country:

Asia > Singapore (0.05)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(16 more...)

Genre: Research Report > Experimental Study (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Deng, Keqi, Woodland, Philip C.

Label-Synchronous Neural Transducer for E2E Simultaneous Speech Translation

While the neural transducer is popular for online speech recognition, simultaneous speech translation (SST) requires both streaming and re-ordering capabilities. This paper presents the LS-Transducer-SST, a label-synchronous neural transducer for SST, which naturally possesses these two properties. The LS-Transducer-SST dynamically decides when to emit translation tokens based on an Auto-regressive Integrate-and-Fire (AIF) mechanism. A latency-controllable AIF is also proposed, which can control the quality-latency trade-off either during decoding only or during both decoding and training. The LS-Transducer-SST can naturally utilise monolingual text-only data via its prediction network, which helps alleviate the key issue of data sparsity for E2E SST. During decoding, a chunk-based incremental joint decoding technique is designed to refine and expand the search space. Experiments on the Fisher-CallHome Spanish (Es-En) and MuST-C En-De data show that the LS-Transducer-SST gives a better quality-latency trade-off than existing popular methods. For example, the LS-Transducer-SST gives a 3.1/2.9 point BLEU increase (Es-En/En-De) relative to CAAT at a similar latency and a 1.4 s reduction in average lagging latency with similar BLEU scores relative to Wait-k.
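The abstract above compares against the Wait-k baseline using average lagging latency. As a rough illustration, the sketch below implements a token-level wait-k read schedule and the text-based Average Lagging measure, AL = (1/τ) Σ_{t=1..τ} [g(t) − (t−1)/r] with r = |y|/|x|. This is an assumption for illustration only: the paper reports latency in seconds on speech input, and the function names here are hypothetical.

```python
def wait_k_policy(k, src_len, tgt_len):
    """g(t): number of source tokens read before emitting target token t (1-indexed)."""
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]


def average_lagging(g, src_len, tgt_len):
    """Token-level Average Lagging for a read schedule g of length tgt_len."""
    r = tgt_len / src_len  # target-to-source length ratio
    # tau: first target step at which the whole source has been read;
    # falls back to the last step if the source is never fully read.
    tau = next((t for t, read in enumerate(g, start=1) if read >= src_len), tgt_len)
    return sum(g[t - 1] - (t - 1) / r for t in range(1, tau + 1)) / tau


schedule = wait_k_policy(k=3, src_len=6, tgt_len=6)
print(schedule)
print(average_lagging(schedule, src_len=6, tgt_len=6))
```

For equal source and target lengths, a wait-k schedule lags by exactly k tokens under this measure, which makes it a convenient sanity check.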

latency, prediction network, translation, (12 more...)

2406.04541

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.04)
(15 more...)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Pre-trained Transformer Uncovers Meaningful Patterns in Human Mobility Data

Najjar, Alameen

We empirically demonstrate that a transformer pre-trained on country-scale unlabeled human mobility data learns embeddings capable, through fine-tuning, of developing a deep understanding of the target geography and its corresponding mobility patterns. Utilizing an adaptation framework, we evaluate the performance of our pre-trained embeddings in encapsulating a broad spectrum of concepts directly and indirectly related to human mobility. This includes basic notions, such as geographic location and distance, and extends to more complex constructs, such as administrative divisions and land cover. Our extensive empirical analysis reveals a substantial performance boost gained from pre-training, reaching up to 38% in tasks such as tree-cover regression. We attribute this result to the ability of the pre-training to uncover meaningful patterns hidden in the raw data, beneficial for modeling relevant high-level concepts. The pre-trained embeddings emerge as robust representations of regions and trajectories, potentially valuable for a wide range of downstream applications.

Figure 1: A transformer pre-trained from scratch on country-scale unlabeled human mobility data is adapted to model a variety of high-level concepts manifesting at different levels

bert pre-trained, pre-trained, trajectory, (12 more...)

2406.04029

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.14)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Guo, Shoutao, Zhang, Shaolei, Feng, Yang

Decoder-only Streaming Transformer for Simultaneous Translation

Simultaneous Machine Translation (SiMT) generates translation while reading source tokens, essentially producing the target prefix based on the source prefix. To achieve good performance, it leverages the relationship between source and target prefixes to extract a policy to guide the generation of translations. Although existing SiMT methods primarily focus on the Encoder-Decoder architecture, we explore the potential of the Decoder-only architecture, owing to its superior performance in various tasks and its inherent compatibility with SiMT. However, directly applying the Decoder-only architecture to SiMT poses challenges in terms of training and inference. To alleviate the above problems, we propose the first Decoder-only SiMT model, named Decoder-only Streaming Transformer (DST). Specifically, DST separately encodes the positions of the source and target prefixes, ensuring that the position of the target prefix remains unaffected by the expansion of the source prefix. Furthermore, we propose a Streaming Self-Attention (SSA) mechanism tailored for the Decoder-only architecture. It is capable of obtaining a translation policy by assessing the sufficiency of input source information and integrating with the soft-attention mechanism to generate translations. Experiments demonstrate that our approach achieves state-of-the-art performance on three translation tasks.

architecture, computational linguistic, translation, (15 more...)

2406.03878

Country:

Asia > Singapore (0.04)
North America > Dominican Republic (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
(17 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Kim, Taehyeon, Suresh, Ananda Theertha, Papineni, Kishore, Riley, Michael, Kumar, Sanjiv, Benton, Adrian

Exploring and Improving Drafts in Blockwise Parallel Decoding

arXiv.org Artificial Intelligence Jun-5-2024

Despite the remarkable strides made by autoregressive language models, their potential is often hampered by the slow inference speeds inherent in sequential token generation. Blockwise parallel decoding (BPD) was proposed by Stern et al. [38] as a method to improve inference speed of language models by simultaneously predicting multiple future tokens, termed block drafts, which are subsequently verified and conditionally accepted by the autoregressive model. This paper contributes to the understanding and improvement of block drafts in two ways. First, we analyze the token distributions produced by multiple prediction heads. Secondly, we leverage this analysis to develop algorithms to improve BPD inference speed by refining the block drafts using n-gram and neural language models. Experiments demonstrate that refined block drafts yield a +5-21% increase in block efficiency (i.e., the number of accepted tokens from the block draft) across diverse datasets.
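The verification step described in the abstract above, where block drafts are "verified and conditionally accepted by the autoregressive model", can be sketched as accepting the longest draft prefix that agrees with the base model's greedy choices. The code below is an illustrative sketch under stated assumptions: `toy_next_token` is a hypothetical stand-in for the model, and the loop checks tokens one by one for clarity, whereas an actual implementation would verify the whole draft in one parallel pass.

```python
def verify_draft(prefix, draft, next_token):
    """Return the longest prefix of `draft` matching the base model's greedy choices."""
    accepted = []
    for token in draft:
        # Stop at the first draft token the base model would not have produced.
        if next_token(prefix + accepted) != token:
            break
        accepted.append(token)
    return accepted


def toy_next_token(prefix):
    """Hypothetical base model: predicts the previous token plus one, modulo 10."""
    return (prefix[-1] + 1) % 10


# The third draft token (9) disagrees with the toy model, so only two are accepted.
accepted = verify_draft([1, 2], [3, 4, 9, 6], toy_next_token)
print(accepted, len(accepted))
```

Averaging the number of accepted tokens over decoding steps gives a block-efficiency style figure; the paper's exact accounting may differ from this simplification.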

arxiv preprint arxiv, block efficiency, lattice, (11 more...)

2404.09221

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Czechia (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)

arXiv.org Artificial Intelligence Jun-5-2024

LCS: A Language Converter Strategy for Zero-Shot Neural Machine Translation

Sun, Zengkui, Liu, Yijin, Meng, Fandong, Xu, Jinan, Chen, Yufeng, Zhou, Jie

Multilingual neural machine translation models generally distinguish translation directions by the language tag (LT) in front of the source or target sentences. However, current LT strategies cannot indicate the desired target language as expected on zero-shot translation, i.e., the off-target issue. Our analysis reveals that the indication of the target language is sensitive to the placement of the target LT. For example, when placing the target LT on the decoder side, the indication would rapidly degrade along with decoding steps, while placing the target LT on the encoder side would lead to copying or paraphrasing the source input. To address the above issues, we propose a simple yet effective strategy named Language Converter Strategy (LCS). By introducing the target language embedding into the top encoder layers, LCS mitigates confusion in the encoder and ensures stable language indication for the decoder. Experimental results on MultiUN, TED, and OPUS-100 datasets demonstrate that LCS could significantly mitigate the off-target issue, with language accuracy up to 95.28%, 96.21%, and 85.35%, while outperforming the vanilla LT strategy by 3.07, 3.3, and 7.93 BLEU scores on zero-shot translation, respectively.
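The two baseline language tag placements contrasted in the abstract above (encoder side versus decoder side) can be illustrated with a small sketch. The tag format `<2de>` and the function names are assumptions for illustration; this shows only the vanilla LT placements, not the proposed LCS, which instead injects a target language embedding into the top encoder layers.

```python
def tag_encoder_side(src_tokens, tgt_tokens, tgt_lang):
    """Place the target language tag in front of the source sentence."""
    return [f"<2{tgt_lang}>"] + src_tokens, tgt_tokens


def tag_decoder_side(src_tokens, tgt_tokens, tgt_lang):
    """Place the target language tag in front of the target sentence."""
    return src_tokens, [f"<2{tgt_lang}>"] + tgt_tokens


# Hypothetical English-to-German training pair.
src = ["good", "morning"]
tgt = ["guten", "Morgen"]
print(tag_encoder_side(src, tgt, "de"))
print(tag_decoder_side(src, tgt, "de"))
```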

computational linguistic, translation, zero-shot translation, (16 more...)

2406.02876

Country:

North America > Dominican Republic (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)