AITopics

2403.01479

Country:

North America > United States > Washington > King County > Seattle (0.04)
Asia > South Korea (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Koraş, Osman Alperen, Schlötterer, Jörg, Seifert, Christin

A Second Look on BASS -- Boosting Abstractive Summarization with Unified Semantic Graphs -- A Replication Study

arXiv.org Artificial IntelligenceMar-25-2024

We present a detailed replication study of the BASS framework, an abstractive summarization system based on the notion of Unified Semantic Graphs. Our investigation includes challenges in replicating key components and an ablation study to systematically isolate error sources rooted in replicating novel components. Our findings reveal discrepancies in performance compared to the original work. We highlight the significance of paying careful attention even to reasonably omitted details for replicating advanced frameworks like BASS, and emphasize key practices for writing replicable papers.

bass, computational linguistic, information, (15 more...)

2403.0293

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(3 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.71)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.47)

arXiv.org Artificial IntelligenceMar-23-2024

User-Side Realization

Sato, Ryoma

Users are dissatisfied with services. Since the service is not tailor-made for a user, it is natural for dissatisfaction to arise. The problem is, that even if users are dissatisfied, they often do not have the means to resolve their dissatisfaction. The user cannot alter the source code of the service, nor can they force the service provider to change. The user has no choice but to remain dissatisfied or quit the service. User-side realization offers proactive solutions to this problem by providing general algorithms to deal with common problems on the user's side. These algorithms run on the user's side and solve the problems without having the service provider change the service itself.

2403.15757

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Middle East > Republic of Türkiye (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(5 more...)

Genre:

Overview (0.92)
Research Report > New Finding (0.67)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Information Technology > Services (1.00)
(4 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Information Management > Search (1.00)
(15 more...)

arXiv.org Artificial IntelligenceMar-22-2024

Towards Knowledge-Grounded Natural Language Understanding and Generation

Whitehouse, Chenxi

This thesis investigates how natural language understanding and generation with transformer models can benefit from grounding the models with knowledge representations. Currently, the most prevailing paradigm for training language models is through pre-training on abundant raw text data and fine-tuning on downstream tasks. Although language models continue to advance, especially the recent trend of Large Language Models (LLMs) such as ChatGPT, there seem to be limits to what can be achieved with text data alone and it is desirable to study the impact of applying and integrating rich forms of knowledge representation to improve model performance. The most widely used form of knowledge for language modelling is structured knowledge in the form of triples consisting of entities and their relationships, often in English. This thesis explores beyond this conventional approach and aims to address several key questions: Can knowledge of entities extend its benefits beyond entity-centric tasks such as entity linking? How can we faithfully and effectively extract such structured knowledge from raw text, especially noisy web text? How do other types of knowledge, beyond structured knowledge, contribute to improving NLP tasks?

generative information extraction, multilingual entity knowledge, multimodal encoder-decoder model, (17 more...)

2403.15364

Country:

Europe > United Kingdom > England > Greater London > London (0.27)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(40 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(4 more...)

Reversible Jump Attack to Textual Classifiers with Modification Reduction

Ni, Mingze, Sun, Zhensu, Liu, Wei

Recent studies on adversarial examples expose vulnerabilities of natural language processing (NLP) models. Existing techniques for generating adversarial examples are typically driven by deterministic hierarchical rules that are agnostic to the optimal adversarial examples, a strategy that often results in adversarial samples with a suboptimal balance between magnitudes of changes and attack successes. To this end, in this research we propose two algorithms, Reversible Jump Attack (RJA) and Metropolis-Hasting Modification Reduction (MMR), to generate highly effective adversarial examples and to improve the imperceptibility of the examples, respectively. RJA utilizes a novel randomization mechanism to enlarge the search space and efficiently adapts to a number of perturbed words for adversarial examples. With these generated adversarial examples, MMR applies the Metropolis-Hasting sampler to enhance the imperceptibility of adversarial examples. Extensive experiments demonstrate that RJA-MMR outperforms current state-of-the-art methods in attack performance, imperceptibility, fluency and grammar correctness.

adversarial example, reversible jump attack, springer nature 2021, (12 more...)

2403.14731

Country:

Asia > China > Shanghai > Shanghai (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)
(5 more...)

Ward, Nigel G., Marco, Divette

A Collection of Pragmatic-Similarity Judgments over Spoken Dialog Utterances

Automatic measures of similarity between utterances are invaluable for training speech synthesizers, evaluating machine translation, and assessing learner productions. While there exist measures for semantic similarity and prosodic similarity, there are as yet none for pragmatic similarity. To enable the training of such measures, we developed the first collection of human judgments of pragmatic similarity between utterance pairs. Each pair consisting of an utterance extracted from a recorded dialog and a re-enactment of that utterance. Re-enactments were done under various conditions designed to create a variety of degrees of similarity. Each pair was rated on a continuous scale by 6 to 9 judges. The average inter-judge correlation was as high as 0.72 for English and 0.66 for Spanish.

2403.14808

Country: North America > United States > Texas > El Paso County > El Paso (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.66)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.46)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.41)

Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning

Zan, Changtong, Ding, Liang, Shen, Li, Zhen, Yibing, Liu, Weifeng, Tao, Dacheng

Translation-tailored Large language models (LLMs) exhibit remarkable translation capabilities, even competing with supervised-trained commercial translation systems. However, off-target translation remains an unsolved problem, especially for low-resource languages, hindering us from developing accurate LLMs-based translation models. To mitigate the off-target translation problem and enhance the performance of LLMs on translation, recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs by feeding few-shot demonstrations. However, these methods essentially do not improve LLM's ability to follow translation instructions, especially the language direction information. In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs. Specifically, we first tune LLMs with the maximum likelihood estimation loss on the translation dataset to elicit the basic translation capabilities. In the second stage, we construct instruction-conflicting samples by randomly replacing the translation directions with a wrong one within the instruction, and then introduce an extra unlikelihood loss to learn those samples. Experiments on IWSLT and WMT benchmarks upon the LLaMA model spanning 16 zero-shot directions show that, compared to the competitive baseline -- translation-finetuned LLama, our method could effectively reduce the off-target translation ratio (averagely -53.3\%), thus improving translation quality with average +5.7 SacreBLEU and +16.4 BLEURT. Analysis shows that our method could preserve the model's general task performance on AlpacaEval. Code and models will be released at \url{https://github.com/alphadl/LanguageAware_Tuning}.

arxiv preprint, instruction-conflicting sample, translation, (15 more...)

2403.14399

Country: Asia > China (0.05)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Rice, Enora, Marashian, Ali, Gessler, Luke, Palmer, Alexis, von der Wense, Katharina

TAMS: Translation-Assisted Morphological Segmentation

Canonical morphological segmentation is the process of analyzing words into the standard (aka underlying) forms of their constituent morphemes. This is a core task in language documentation, and NLP systems have the potential to dramatically speed up this process. But in typical language documentation settings, training data for canonical morpheme segmentation is scarce, making it difficult to train high quality models. However, translation data is often much more abundant, and, in this work, we present a method that attempts to leverage this data in the canonical segmentation task. We propose a character-level sequence-to-sequence model that incorporates representations of translations obtained from pretrained high-resource monolingual language models as an additional signal. Our model outperforms the baseline in a super-low resource setting but yields mixed results on training splits with more data. While further work is needed to make translations useful in higher-resource settings, our model shows promise in severely resource-constrained settings.

computational linguistic, segmentation, translation, (16 more...)

2403.1484

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > Canada > Ontario > Toronto (0.04)
(10 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.33)

From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation

Zhao, Haofei, Liu, Yilun, Tao, Shimin, Meng, Weibin, Chen, Yimeng, Geng, Xiang, Su, Chang, Zhang, Min, Yang, Hao

Machine Translation Quality Estimation (MTQE) is the task of estimating the quality of machine-translated text in real time without the need for reference translations, which is of great importance for the development of MT. After two decades of evolution, QE has yielded a wealth of results. This article provides a comprehensive overview of QE datasets, annotation methods, shared tasks, methodologies, challenges, and future research directions. It begins with an introduction to the background and significance of QE, followed by an explanation of the concepts and evaluation metrics for word-level QE, sentence-level QE, document-level QE, and explainable QE. The paper categorizes the methods developed throughout the history of QE into those based on handcrafted features, deep learning, and Large Language Models (LLMs), with a further division of deep learning-based methods into classic deep learning and those incorporating pre-trained language models (LMs). Additionally, the article details the advantages and limitations of each method and offers a straightforward comparison of different approaches. Finally, the paper discusses the current challenges in QE research and provides an outlook on future research directions.

estimation, quality estimation, translation, (14 more...)

2403.14118

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zhou, Fan, Vandeghinste, Vincent

Prediction of Translation Techniques for the Translation Process

Machine translation (MT) encompasses a variety of methodologies aimed at enhancing the accuracy of translations. In contrast, the process of human-generated translation relies on a wide range of translation techniques, which are crucial for ensuring linguistic adequacy and fluency. This study suggests that these translation techniques could further optimize machine translation if they are automatically identified before being applied to guide the translation process effectively. The study differentiates between two scenarios of the translation process: from-scratch translation and post-editing. For each scenario, a specific set of experiments has been designed to forecast the most appropriate translation techniques. The findings indicate that the predictive accuracy for from-scratch translation reaches 82%, while the post-editing process exhibits even greater potential, achieving an accuracy rate of 93%.

machine translation, translation, translation technique, (15 more...)

2403.14454

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)