wmt
1325cdae3b6f0f91a1b629307bf2d498-Supplemental.pdf
C.1 Dataset description For the WMT'16 English-German experiment, we used the same preprocessed data provided by [31], including the same validation (newstest2013) and test (newstest2014) splits. The data volume for the train, validation, and test splits is 4,500,966, 3,000, and 3,003 sentence pairs respectively. When using LayerDrop we use a 50% dropout probability. Similarly, we use beam search with beam size 5 and length penalty 1.0 for decoding. First, we show that adding the auxiliary loss L_K discretizes the samples and achieves the pruning purpose by enforcing sparsity of the resulting model.
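The snippet above only gives the decoding hyperparameters (beam size 5, length penalty 1.0), not the decoding stack itself. As a minimal sketch of how a length penalty interacts with beam scores, the code below assumes the common GNMT-style penalty ((5 + |Y|) / 6)^alpha, which is one frequently used formulation and not necessarily the one used in that work; `length_penalty` and `rescore_beam` are hypothetical helper names.

```python
def length_penalty(length: int, alpha: float = 1.0) -> float:
    # Assumed GNMT-style length penalty: ((5 + |Y|) / 6) ** alpha.
    # With alpha = 1.0, longer hypotheses are penalized roughly linearly.
    return ((5.0 + length) / 6.0) ** alpha

def rescore_beam(hypotheses, alpha: float = 1.0):
    # hypotheses: list of (tokens, sum_log_prob) pairs from the beam.
    # Returns hypotheses sorted by length-normalized score, best first.
    scored = [
        (tokens, logp / length_penalty(len(tokens), alpha))
        for tokens, logp in hypotheses
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy usage: a short hypothesis vs. a longer one from a beam of size 5.
beam = [
    (["das", "ist", "gut"], -2.1),
    (["das", "ist", "sehr", "gut", "."], -3.0),
]
print(rescore_beam(beam, alpha=1.0))
```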
LoRA on the Go: Instance-level Dynamic LoRA Selection and Merging
Lee, Seungeon, Das, Soumi, Gupta, Manish, Gummadi, Krishna P.
Low-Rank Adaptation (LoRA) has emerged as a parameter-efficient approach for fine-tuning large language models. However, conventional LoRA adapters are typically trained for a single task, limiting their applicability in real-world settings where inputs may span diverse and unpredictable domains. At inference time, existing approaches combine multiple LoRAs to improve performance on diverse tasks, but they usually require labeled data or additional task-specific training, which is expensive at scale. In this work, we introduce LoRA on the Go (LoGo), a training-free framework that dynamically selects and merges adapters at the instance level without any additional requirements. LoGo leverages signals extracted from a single forward pass through LoRA adapters to identify the most relevant adapters and determine their contributions on-the-fly. Across 5 NLP benchmarks, 27 datasets, and 3 model families, LoGo outperforms training-based baselines on some tasks by a margin of up to 3.6% while remaining competitive on other tasks and maintaining inference throughput, highlighting its effectiveness and practicality.
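The abstract does not specify which forward-pass signal LoGo actually uses. The sketch below assumes, purely for illustration, a confidence signal (negative predictive entropy of each adapter's output distribution) and a softmax-weighted merge of the selected adapters' low-rank updates; `relevance_from_logits` and `select_and_merge` are hypothetical helpers, not the paper's recipe.

```python
import numpy as np

def relevance_from_logits(logits: np.ndarray) -> float:
    # Assumed per-instance signal: negative predictive entropy of the
    # adapter's output distribution (higher = more confident/relevant).
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return float(np.sum(probs * np.log(probs + 1e-12)))  # = -entropy

def select_and_merge(adapter_logits: dict, adapter_deltas: dict, top_k: int = 2):
    # adapter_logits: name -> logits from one forward pass with that adapter.
    # adapter_deltas: name -> low-rank weight update (B @ A) of that adapter.
    # Returns a single merged weight update for this input instance.
    scores = {name: relevance_from_logits(l) for name, l in adapter_logits.items()}
    top = sorted(scores, key=scores.get, reverse=True)[:top_k]
    # Softmax over the selected adapters' scores gives merging coefficients.
    raw = np.array([scores[n] for n in top])
    weights = np.exp(raw - raw.max())
    weights /= weights.sum()
    return sum(w * adapter_deltas[n] for w, n in zip(weights, top))

# Toy usage: three task adapters, one input instance.
rng = np.random.default_rng(0)
logits = {t: rng.normal(size=32) for t in ("qa", "nli", "summ")}
deltas = {t: rng.normal(size=(8, 8)) for t in ("qa", "nli", "summ")}
print(select_and_merge(logits, deltas, top_k=2).shape)  # (8, 8)
```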
A of Main Results
B.1 Additional Variants We also conducted ablations on several variants of GAPX. Specifically, GAPX(neg-log) modifies Eqn. 5 and Eqn. C.2 Interpreting the Results Figure 6: An example from QQP illustrating how to interpret the result of our method, by OODP. For GAP, we can use the score defined in Eqn. 4, split on each word, namely: In all three models, higher scores represent a higher chance of being non-paraphrases. For GAP, the threshold is 0, while for OODP the threshold is 3. Its reliance on the word 'a' might be due to the error. The metrics are calculated as follows: 1. We use the RoBERTa model described in Section 4.2
Don't Sweat the Small Stuff: Segment-Level Meta-Evaluation Based on Pairwise Difference Correlation
DiIanni, Colten, Deutsch, Daniel
This paper introduces Pairwise Difference Pearson (PDP), a novel segment-level meta-evaluation metric for Machine Translation (MT) that addresses limitations in previous Pearson's $\rho$-based and Kendall's $\tau$-based meta-evaluation approaches. PDP is a correlation-based metric that utilizes pairwise differences rather than raw scores. It draws on information from all segments for a more robust understanding of score distributions and uses segment-wise pairwise differences to refine Global Pearson to intra-segment score comparisons. Analysis on the WMT'24 shared task shows PDP properly ranks sentinel evaluation metrics and better aligns with human error weightings than previous work. Noise injection analysis demonstrates PDP's robustness to random noise, segment bias, and system bias while highlighting its sensitivity to extreme outliers.
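The abstract states the idea behind PDP (Pearson correlation computed over segment-wise pairwise score differences) but not the exact formula. The following is a minimal sketch of that idea, assuming differences are taken between systems within each segment and pooled across segments before correlating; `pairwise_difference_pearson` is a hypothetical function, not the authors' reference implementation.

```python
import numpy as np

def pairwise_difference_pearson(metric: np.ndarray, human: np.ndarray) -> float:
    # metric, human: (n_segments, n_systems) score matrices.
    # For every segment, take the score difference for every pair of systems,
    # pool these differences across segments, and correlate metric vs. human
    # differences with Pearson's r.
    n_seg, n_sys = metric.shape
    m_diffs, h_diffs = [], []
    for s in range(n_seg):
        for i in range(n_sys):
            for j in range(i + 1, n_sys):
                m_diffs.append(metric[s, i] - metric[s, j])
                h_diffs.append(human[s, i] - human[s, j])
    return float(np.corrcoef(np.asarray(m_diffs), np.asarray(h_diffs))[0, 1])

# Toy example: 4 segments scored for 3 systems, metric = noisy human scores.
rng = np.random.default_rng(1)
human = rng.normal(size=(4, 3))
metric = human + 0.3 * rng.normal(size=(4, 3))
print(round(pairwise_difference_pearson(metric, human), 3))
```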
Overestimation in LLM Evaluation: A Controlled Large-Scale Study on Data Contamination's Impact on Machine Translation
Kocyigit, Muhammed Yusuf, Briakou, Eleftheria, Deutsch, Daniel, Luo, Jiaming, Cherry, Colin, Freitag, Markus
Data contamination -- the accidental consumption of evaluation examples within the pre-training data -- can undermine the validity of evaluation benchmarks. In this paper, we present a rigorous analysis of the effects of contamination on language models at 1B and 8B scales on the machine translation task. Starting from a carefully decontaminated train-test split, we systematically introduce contamination at various stages, scales, and data formats to isolate its effect and measure its impact on performance metrics. Our experiments reveal that contamination with both source and target substantially inflates BLEU scores, and this inflation is 2.5 times larger (up to 30 BLEU points) for 8B compared to 1B models. In contrast, source-only and target-only contamination generally produce smaller, less consistent over-estimations. Finally, we study how the temporal distribution and frequency of contaminated samples influence performance over-estimation across languages with varying degrees of data resources.
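The paper's decontamination procedure is not detailed in the abstract. As a rough sketch of the kind of n-gram overlap check commonly used to build a decontaminated train-test split, the code below assumes an 8-gram window and a 50% overlap threshold chosen purely for illustration; `is_contaminated` is a hypothetical helper, not the authors' pipeline.

```python
def ngrams(text: str, n: int = 8):
    # Set of whitespace-tokenized n-grams of the text.
    toks = text.split()
    return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def is_contaminated(train_doc: str, test_examples, n: int = 8, threshold: float = 0.5) -> bool:
    # Flags a pre-training document if it shares a large fraction of n-grams
    # with any test example (source, target, or both concatenated).
    doc_grams = ngrams(train_doc, n)
    if not doc_grams:
        return False
    for ex in test_examples:
        ex_grams = ngrams(ex, n)
        if ex_grams and len(ex_grams & doc_grams) / len(ex_grams) >= threshold:
            return True
    return False

test_set = ["the cat sat on the mat and then it slept all afternoon in the sun"]
clean_doc = "a completely unrelated sentence about model training pipelines"
leaky_doc = "notes: the cat sat on the mat and then it slept all afternoon in the sun today"
print(is_contaminated(clean_doc, test_set), is_contaminated(leaky_doc, test_set))
```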
From Jack of All Trades to Master of One: Specializing LLM-based Autoraters to a Test Set
Finkelstein, Mara, Deutsch, Dan, Riley, Parker, Juraska, Juraj, Kovacs, Geza, Freitag, Markus
As LLMs continue to become more powerful and versatile, human evaluation has quickly become intractable at scale and reliance on automatic metrics has become the norm. Recently, it has been shown that LLMs are themselves state-of-the-art evaluators for many tasks. These Autoraters are typically designed so that they generalize to new systems and test sets. In practice, however, evaluation is performed on a small set of fixed, canonical test sets, which are carefully curated to measure certain capabilities of interest and are not changed frequently. In this work, we design a method which specializes a prompted Autorater to a given test set, by leveraging historical ratings on the test set to construct in-context learning (ICL) examples. We evaluate our Specialist method on the task of fine-grained machine translation evaluation, and show that it dramatically outperforms the state-of-the-art XCOMET metric by 54% and 119% on the WMT'23 and WMT'24 test sets, respectively. We perform extensive analyses to understand the representations learned by our Specialist metrics, and how variability in rater behavior affects their performance. We also verify the generalizability and robustness of our Specialist method for designing automatic metrics across different numbers of ICL examples, LLM backbones, systems to evaluate, and evaluation tasks.
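The abstract describes the Specialist recipe only at a high level: historical ratings on a test set are turned into in-context learning examples for a prompted Autorater. The sketch below shows one plausible way to assemble such a prompt; the example-selection rule (most recent k), the scalar 0-100 rating format, and the wording are all assumptions, and `build_specialist_prompt` is a hypothetical helper rather than the paper's actual setup for fine-grained MT evaluation.

```python
def build_specialist_prompt(history, source, hypothesis, k=4):
    # history: list of dicts with 'source', 'hypothesis', 'rating' collected
    # from past evaluations on the same test set.
    lines = ["You are an expert machine translation rater.",
             "Rate the translation quality from 0 (worst) to 100 (best).", ""]
    for ex in history[-k:]:  # assumed selection rule: most recent k examples
        lines += [f"Source: {ex['source']}",
                  f"Translation: {ex['hypothesis']}",
                  f"Rating: {ex['rating']}", ""]
    lines += [f"Source: {source}", f"Translation: {hypothesis}", "Rating:"]
    return "\n".join(lines)

history = [
    {"source": "Guten Morgen.", "hypothesis": "Good morning.", "rating": 95},
    {"source": "Wie geht es dir?", "hypothesis": "How goes it you?", "rating": 40},
]
print(build_specialist_prompt(history, "Danke schoen.", "Thank you very much."))
```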