AITopics | simple strategy

Data Diversification: A Simple Strategy For Neural Machine Translation

Neural Information Processing Systems, Dec-24-2025, 04:21:45 GMT

We introduce Data Diversification: a simple but effective strategy to boost neural machine translation (NMT) performance. It diversifies the training data by using the predictions of multiple forward and backward models and then merging them with the original dataset on which the final NMT model is trained. Our method is applicable to all NMT models. It does not require extra monolingual data like back-translation, nor does it add more computations and parameters like ensembles of models. Our method achieves state-of-the-art BLEU scores of 30.7 and 43.7 in the WMT'14 English-German and English-French translation tasks, respectively. It also substantially improves on 8 other translation tasks: 4 IWSLT tasks (English-German and English-French) and 4 low-resource translation tasks (English-Nepali and English-Sinhala). We demonstrate that our method is more effective than knowledge distillation and dual learning, it exhibits strong correlation with ensembles of models, and it trades perplexity off for better BLEU score.
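The merging step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `diversify`, `toy_train`, and the parameter `k` are hypothetical names, and the toy "model" is a lookup table standing in for a real NMT system.

```python
def toy_train(pairs, seed):
    """Hypothetical stand-in for NMT training: returns a lookup-table 'model'.

    The seed only tags the output so that different models give different
    predictions, mimicking models trained from different initializations.
    """
    table = dict(pairs)
    return lambda text: f"{table.get(text, text)} <model{seed}>"


def diversify(pairs, train, k=2):
    """Merge the original pairs with predictions of k forward and k backward models."""
    reversed_pairs = [(tgt, src) for src, tgt in pairs]
    augmented = list(pairs)  # the original dataset is kept as-is
    for seed in range(k):
        forward = train(pairs, seed)            # source -> target model
        backward = train(reversed_pairs, seed)  # target -> source model
        # Forward models supply synthetic targets for the real sources.
        augmented += [(src, forward(src)) for src, _ in pairs]
        # Backward models supply synthetic sources for the real targets.
        augmented += [(backward(tgt), tgt) for _, tgt in pairs]
    return augmented


if __name__ == "__main__":
    data = [("hallo", "hello"), ("welt", "world")]
    for pair in diversify(data, toy_train, k=2):
        print(pair)
```

With `k` forward and `k` backward models, the merged set holds `1 + 2k` times as many pairs as the original; the final model would then be trained on that merged set.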

data diversification, neural machine translation, simple strategy, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.78)

Simple strategies for recovering inner products from coarsely quantized random projections

Neural Information Processing Systems, Nov-21-2025, 16:13:03 GMT

Random projections have been increasingly adopted for a diverse set of tasks in machine learning involving dimensionality reduction. One specific line of research on this topic has investigated the use of quantization subsequent to projection with the aim of additional data compression. Motivated by applications in nearest neighbor search and linear learning, we revisit the problem of recovering inner products (respectively cosine similarities) in such setting. We show that even under coarse scalar quantization with 3 to 5 bits per projection, the loss in accuracy tends to range from "negligible" to "moderate". One implication is that in most scenarios of practical interest, there is no need for a sophisticated recovery approach like maximum likelihood estimation as considered in previous work on the subject. What we propose herein also yields considerable improvements in terms of accuracy over the Hamming distance-based approach in Li et al. (ICML 2014) which is comparable in terms of simplicity.
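To make the setting concrete, here is a small sketch of quantized random projections with a plain averaging estimator. It rests on several assumptions that are not taken from the paper: the uniform quantizer with clipping at 3, the function names, and the choice to simply average products of quantized values as if they were unquantized. It illustrates the general idea of a simple recovery rule and is not the authors' exact estimator.

```python
import math
import random


def project(x, matrix):
    """Project x with a k-by-d matrix whose entries are i.i.d. N(0, 1)."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def quantize(z, bits, clip=3.0):
    """Uniform scalar quantizer on [-clip, clip] with 2**bits cells (assumed design).

    Each value is mapped to the midpoint of the cell it falls into.
    """
    levels = 2 ** bits
    step = 2.0 * clip / levels
    z = max(-clip, min(clip - 1e-12, z))  # clip so the top cell stays in range
    cell = math.floor((z + clip) / step)
    return -clip + (cell + 0.5) * step


def estimate_inner_product(qx, qy):
    """Average of products of quantized projections, ignoring the quantization."""
    return sum(a * b for a, b in zip(qx, qy)) / len(qx)


def run_demo(bits=4, d=16, k=2000, seed=0):
    rng = random.Random(seed)
    # Two unit-norm vectors whose true inner product is 0.8.
    x = [1.0] + [0.0] * (d - 1)
    y = [0.8, 0.6] + [0.0] * (d - 2)
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    qx = [quantize(z, bits) for z in project(x, matrix)]
    qy = [quantize(z, bits) for z in project(y, matrix)]
    true_value = sum(a * b for a, b in zip(x, y))
    return true_value, estimate_inner_product(qx, qy)


if __name__ == "__main__":
    true_value, estimate = run_demo()
    print(f"true inner product: {true_value:.3f}, 4-bit estimate: {estimate:.3f}")
```

Changing `bits` in `run_demo` gives a rough feel for how the estimate degrades as the quantization becomes coarser.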

inner product, projection, simple strategy, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.61)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.61)

Review for NeurIPS paper: Data Diversification: A Simple Strategy For Neural Machine Translation

Neural Information Processing Systems, Jan-25-2025, 17:00:21 GMT

Weaknesses: While the described approach is simple and very generally applicable, there are some major issues with the evaluation that need to be addressed. If 1. and 2. are addressed I would be willing to update my scores. The BLEU evaluation is not clearly described for the WMT and IWSLT experiments. Given the major variations observed in BLEU scores due to differences in post-processing or the BLEU evaluation script used, it's hard to fairly compare against previous work without clearly describing the post-processing, tokenization and BLEU evaluation tool used for these experiments. Since the proposed method relies heavily on using backward and forward translated data, these effects are bound to affect the observed BLEU improvements.

data diversification, neural machine translation, simple strategy, (7 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Review for NeurIPS paper: Data Diversification: A Simple Strategy For Neural Machine Translation

Neural Information Processing Systems, Jan-25-2025, 17:00:14 GMT

This work describes a simple approach to synthetically augment the training dataset for neural machine translation. The proposed approach involves training multiple forward and backward MT models and appending their outputs on the original training dataset to the training data. This augmented (or diversified) training dataset can then be used to train the next generation of models. The proposed approach is simple, achieves good results, and the authors do a good job presenting the idea. The paper is quite empirical and the technique fairly specific to NMT, but it is still interesting to see that sometimes simple ideas work well and are thus important / deserve careful consideration.

data diversification, neural machine translation, training dataset, (2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Data Diversification: A Simple Strategy For Neural Machine Translation

Neural Information Processing Systems, Oct-10-2024, 12:37:24 GMT

data diversification, neural machine translation, translation task, (4 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Reviews: Simple strategies for recovering inner products from coarsely quantized random projections

Neural Information Processing Systems, Oct-8-2024, 12:28:51 GMT

Random projections are often used in learning tasks involving dimensionality reduction. The goal of the additional quantization step is data compression that allows for a reduction in space complexity of learning algorithms and more efficient communication in distributed settings.

inner product, projection, random projection, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Ask Me Anything: A simple strategy for prompting language models

#artificialintelligence, Oct-7-2022, 00:10:16 GMT

Large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt that demonstrates how to perform the task and no additional training. Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, and therefore significant effort is dedicated towards designing a painstakingly "perfect prompt" for a task. To mitigate the high degree of effort involved in prompt design, we instead ask whether producing multiple effective, yet imperfect, prompts and aggregating them can lead to a high-quality prompting strategy. Our observations motivate our proposed prompting method, ASK ME ANYTHING (AMA). We first develop an understanding of the effective prompt formats, finding that question-answering (QA) prompts, which encourage open-ended generation ("Who went to the park?") tend to outperform those that restrict the model outputs ("John went to the park. Output True or False."). Our approach recursively uses the LLM itself to transform task inputs to the effective QA format. We apply the collected prompts to obtain several noisy votes for the input's true label. We find that the prompts can have very different accuracies and complex dependencies and thus propose to use weak supervision, a procedure for combining the noisy predictions, to produce the final predictions for the inputs. We evaluate AMA across open-source model families (e.g., EleutherAI, BLOOM, OPT, and T0) and model sizes (125M-175B parameters), demonstrating an average performance lift of 10.2% over the few-shot baseline. This simple strategy enables the open-source GPT-J-6B model to match and exceed the performance of few-shot GPT3-175B on 15 of 20 popular benchmarks. Averaged across these tasks, the GPT-J-6B model outperforms few-shot GPT3-175B. We release our code here: https://github.com/HazyResearch/ama_prompting
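The aggregation step, where several noisy prompt votes are combined into one prediction, might look roughly like the sketch below. Note the substitution: the abstract describes weak supervision that also models dependencies between prompts, whereas this sketch uses a much simpler accuracy-weighted vote with log-odds weights. The function name `aggregate_votes` and the per-prompt accuracy estimates are assumptions made for illustration; they are not taken from the released code.

```python
import math
from collections import defaultdict


def aggregate_votes(votes, accuracies=None):
    """Combine one label per prompt into a single prediction.

    votes:      list of labels, one from each prompt
    accuracies: optional list of estimated prompt accuracies in (0, 1);
                if omitted, this reduces to an unweighted majority vote.
    """
    scores = defaultdict(float)
    for i, label in enumerate(votes):
        if accuracies is None:
            weight = 1.0
        else:
            # Log-odds weighting: more accurate prompts count for more.
            acc = min(max(accuracies[i], 1e-6), 1.0 - 1e-6)
            weight = math.log(acc / (1.0 - acc))
        scores[label] += weight
    return max(scores, key=scores.get)


if __name__ == "__main__":
    votes = ["True", "False", "False"]
    print(aggregate_votes(votes))                     # unweighted majority
    print(aggregate_votes(votes, [0.95, 0.60, 0.60]))  # accuracy-weighted
```

The weighted variant shows why treating prompts unequally matters: one reliable prompt can outweigh two weak ones that happen to agree.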

language model, simple strategy

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Simple strategies for recovering inner products from coarsely quantized random projections

Li, Ping, Slawski, Martin

Neural Information Processing Systems, Feb-14-2020, 15:42:40 GMT

Random projections have been increasingly adopted for a diverse set of tasks in machine learning involving dimensionality reduction. One specific line of research on this topic has investigated the use of quantization subsequent to projection with the aim of additional data compression. Motivated by applications in nearest neighbor search and linear learning, we revisit the problem of recovering inner products (respectively cosine similarities) in such setting. We show that even under coarse scalar quantization with 3 to 5 bits per projection, the loss in accuracy tends to range from "negligible" to "moderate". One implication is that in most scenarios of practical interest, there is no need for a sophisticated recovery approach like maximum likelihood estimation as considered in previous work on the subject.
