AITopics | rerank

rerank

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Flat Object Retrieval Benchmark for Universal Image Embedding Supplementary Material

Neural Information Processing Systems, Oct-8-2025, 16:30:26 GMT

Figure 1: Example images of animated trading card.

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country: Europe > Czechia > Prague (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)


Adaptive Few-shot Prompting for Machine Translation with Pre-trained Language Models

Tang, Lei, Qin, Jinghui, Ye, Wenxuan, Tan, Hao, Yang, Zhijing

arXiv.org Artificial Intelligence, Jan-3-2025

Recently, large language models (LLMs) with in-context learning have demonstrated remarkable potential in handling neural machine translation. However, existing evidence shows that LLMs are prompt-sensitive, and it is sub-optimal to apply a fixed prompt to every input for downstream machine translation tasks. To address this issue, we propose an adaptive few-shot prompting (AFSP) framework that automatically selects suitable translation demonstrations for each source input sentence to further elicit the translation capability of an LLM. First, we build a translation demonstration retrieval module based on the LLM's embedding to retrieve the top-k semantically similar translation demonstrations from an aligned parallel translation corpus. Rather than using other embedding models for semantic demonstration retrieval, we build a hybrid demonstration retrieval module based on the embedding layer of the deployed LLM to build a better input representation for retrieving more semantically related translation demonstrations. Then, to ensure better semantic consistency between source inputs and target outputs, we force the deployed LLM itself to generate multiple output candidates in the target language with the help of translation demonstrations and rerank these candidates. Besides, to better evaluate the effectiveness of our AFSP framework on the latest language and extend the research boundary of neural machine translation, we construct a high-quality diplomatic Chinese-English parallel dataset that consists of 5,528 parallel Chinese-English sentences. Finally, extensive experiments on the proposed diplomatic Chinese-English parallel dataset and the United Nations Parallel Corpus (Chinese-English part) show the effectiveness and superiority of our proposed AFSP.
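The retrieve-then-rerank flow described in this abstract can be illustrated with a minimal sketch. It assumes demonstrations are stored with precomputed embedding vectors and that a scoring function for candidates is supplied by the caller; the names `retrieve_top_k`, `rerank_candidates`, and `score_fn` are hypothetical and do not come from the paper, and plain cosine similarity stands in for whatever similarity the authors actually use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_top_k(query_vec, corpus, k):
    """Return the k (source, target) pairs whose embeddings are closest to query_vec.

    corpus: list of (source_sentence, target_sentence, embedding) triples.
    """
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[2]), reverse=True)
    return [(src, tgt) for src, tgt, _ in ranked[:k]]


def rerank_candidates(candidates, score_fn):
    """Sort candidate translations from best to worst under a caller-supplied score."""
    return sorted(candidates, key=score_fn, reverse=True)
```

In a real system the embeddings would come from the deployed LLM's embedding layer, as the abstract states, and `score_fn` would measure consistency between the source sentence and each candidate.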

large language model, machine learning, translation, (18 more...)

arXiv.org Artificial Intelligence

2501.01679

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > India (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry: Government (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)


Torque-Aware Momentum

Malviya, Pranshu, Mordido, Goncalo, Baratin, Aristide, Harikandeh, Reza Babanezhad, Dziugaite, Gintare Karolina, Pascanu, Razvan, Chandar, Sarath

arXiv.org Artificial Intelligence, Dec-25-2024

Efficiently exploring complex loss landscapes is key to the performance of deep neural networks. While momentum-based optimizers are widely used in state-of-the-art setups, classical momentum can still struggle with large, misaligned gradients, leading to oscillations. To address this, we propose Torque-Aware Momentum (TAM), which introduces a damping factor based on the angle between the new gradients and previous momentum, stabilizing the update direction during training. Empirical results show that TAM, which can be combined with both SGD and Adam, enhances exploration, handles distribution shifts more effectively, and improves generalization performance across various tasks, including image classification and large language model fine-tuning, when compared to classical momentum-based optimizers.

Despite the wide range of optimization methods available in the literature, stochastic gradient descent (SGD), typically augmented with momentum (Kingma & Ba, 2015; Nesterov, 1983; Qian, 1999), remains the go-to approach for practitioners. Momentum accelerates convergence, particularly in the presence of high curvature (Cutkosky & Mehta, 2020b), small but consistent gradients, or noisy gradients. It also helps the optimizer navigate the loss landscape and escape local minima or saddle points by maintaining consistent update directions (Jin et al., 2018). While SGD with momentum (SGDM) has shown remarkable success in various scenarios, particularly in computer vision (Sutskever et al., 2013), it remains vulnerable to [...] In this work, we propose that minimizing the influence of misaligned gradients during momentum updates can preserve valuable information and improve the exploration capabilities of momentum-based methods. To enable more consistent exploration of the loss landscape, particularly in [...]

Figure 1: Comparing momentum updates obtained using SGDM and TAM for a given SGD trajectory.
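The idea of damping the gradient by its angle to the previous momentum can be sketched as follows. The excerpt does not give the exact formula, so the mapping `(1 + cos) / 2` from cosine similarity to a damping factor in [0, 1] is an assumption made for illustration, as are the function name `tam_like_step` and the default hyperparameters.

```python
import math


def tam_like_step(params, grad, momentum, lr=0.1, beta=0.9, eps=1e-8):
    """One angle-damped momentum step on plain Python lists.

    Assumption: damping = (1 + cos(grad, momentum)) / 2, so an aligned gradient
    is applied in full and an opposed gradient is almost entirely suppressed.
    With zero momentum the cosine evaluates to 0 and the damping is 0.5.
    """
    dot = sum(g * m for g, m in zip(grad, momentum))
    grad_norm = math.sqrt(sum(g * g for g in grad))
    mom_norm = math.sqrt(sum(m * m for m in momentum))
    cos_sim = dot / (grad_norm * mom_norm + eps)
    damping = (1.0 + cos_sim) / 2.0
    new_momentum = [beta * m + damping * g for m, g in zip(momentum, grad)]
    new_params = [p - lr * m for p, m in zip(params, new_momentum)]
    return new_params, new_momentum, damping
```

Classical SGDM corresponds to fixing `damping` at 1, which is what lets a large misaligned gradient swing the update direction.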

machine learning, natural language, sgdm, (18 more...)

arXiv.org Artificial Intelligence

2412.1879

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)


Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models

Zhang, Xinyu, Lu, Jing, Tran, Vinh Q., Schuster, Tal, Metzler, Donald, Lin, Jimmy

arXiv.org Artificial Intelligence, Nov-7-2024

Human understanding of language is robust to different word choices as long as they represent similar semantic concepts. To what extent does our human intuition transfer to language models, which represent all subwords as distinct embeddings? In this work, we take an initial step toward measuring the role of shared semantics among subwords in encoder-only multilingual language models (mLMs). To this end, we form "semantic tokens" by merging semantically similar subwords and their embeddings, and evaluate the updated mLMs on 5 heterogeneous multilingual downstream tasks. Results show that the general shared semantics can get the models a long way in making predictions on mLMs with different tokenizers and model sizes. Inspections of the grouped subwords show that they exhibit a wide range of semantic similarities, including synonyms and translations across many languages and scripts. Lastly, we found that the zero-shot results with semantic tokens are on par with or even better than the original models on certain classification tasks, suggesting that the shared subword-level semantics may serve as anchors for cross-lingual transfer.
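Merging subwords and their embeddings into "semantic tokens" can be pictured with a small sketch. It assumes the groups of similar subwords are already given and that a group's embedding is the mean of its members' embeddings; the averaging rule, the function name `merge_semantic_tokens`, and the `<sem_i>` naming are all illustrative assumptions, not details from the paper.

```python
def merge_semantic_tokens(embeddings, groups):
    """Collapse each group of subwords into one token with an averaged embedding.

    embeddings: dict mapping subword -> list of floats.
    groups: list of non-empty lists of subwords judged semantically similar.
    Returns (merged_embeddings, subword_to_token).
    """
    merged = {}
    token_of = {}
    for idx, group in enumerate(groups):
        dim = len(embeddings[group[0]])
        # Assumption: the merged vector is the element-wise mean of the group.
        mean = [sum(embeddings[w][d] for w in group) / len(group) for d in range(dim)]
        name = f"<sem_{idx}>"
        merged[name] = mean
        for w in group:
            token_of[w] = name
    return merged, token_of
```

After merging, every subword in a group is looked up through the shared token, so "tomato" and "tomate" would receive the same input representation.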

computational linguistic, miracl, subword, (15 more...)

arXiv.org Artificial Intelligence

2411.0453

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(8 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)


Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training

Huang, Junqin, Hu, Zhongjie, Jing, Zihao, Gao, Mengya, Wu, Yichao

arXiv.org Artificial Intelligence, May-11-2024

Text embedding models play a pivotal role in natural language processing and machine learning. By encoding texts into structured numerical representations, known as text embeddings, these models encapsulate semantic and contextual information of words, phrases, or entire documents within a dense, low-dimensional vector space [27]. Such embeddings are indispensable for various downstream NLP tasks, including classification, clustering, retrieval, and sentence similarity. Contrastive learning stands out as the most effective technique for training text embedding models [6]. It presents text semantic representations by minimizing the distance between positive pairs and maximizing the distance between negative pairs. Beyond its application in natural language processing (NLP), contrastive learning also proves pivotal in visual [8] [5] and multi-modal [25] representation learning. Recent advanced text embedding works [36] [33] [18] primarily rely on a two-stage pretrain-finetune pipeline to acquire general text embedding models. Pre-training utilizes weakly supervised data sourced from large-scale crawling efforts, while fine-tuning involves refining the model with high-quality text pairs obtained through data mining or manual annotation.
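One common way to pull positive pairs together and push negatives apart is the InfoNCE loss with in-batch negatives, sketched below in plain Python. This is a generic formulation and not necessarily the hybrid loss Piccolo2 uses; the function name `info_nce` and the temperature value are assumptions.

```python
import math


def info_nce(queries, positives, temperature=0.05):
    """Mean InfoNCE loss where positives[i] matches queries[i].

    Every other positive in the batch acts as a negative for query i.
    Vectors are L2-normalized so the logits are scaled cosine similarities.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    q = [normalize(v) for v in queries]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [sum(a * b for a, b in zip(qi, pj)) / temperature for pj in p]
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits)
        log_denom = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_denom - logits[i]
    return total / len(q)
```

The loss is small when each query is closest to its own positive and grows when a negative is closer, which is the behavior the paragraph above describes.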

arxiv preprint arxiv, dataset, dimension, (14 more...)

arXiv.org Artificial Intelligence

2405.06932

Country: Asia > Middle East > Israel (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Exploring Human-Like Translation Strategy with Large Language Models

He, Zhiwei, Liang, Tian, Jiao, Wenxiang, Zhang, Zhuosheng, Yang, Yujiu, Wang, Rui, Tu, Zhaopeng, Shi, Shuming, Wang, Xing

arXiv.org Artificial Intelligence, Nov-29-2023

Large language models (LLMs) have demonstrated impressive capabilities in general scenarios, exhibiting a level of aptitude that approaches, in some aspects even surpasses, human-level intelligence. Among their numerous skills, the translation abilities of LLMs have received considerable attention. Compared to typical machine translation that focuses solely on source-to-target mapping, LLM-based translation can potentially mimic the human translation process which might take preparatory steps to ensure high-quality translation. This work explores this possibility by proposing the MAPS framework, which stands for Multi-Aspect Prompting and Selection. Specifically, we enable LLMs first to analyze the given source sentence and induce three aspects of translation-related knowledge: keywords, topics, and relevant demonstrations to guide the final translation process. Moreover, we employ a selection mechanism based on quality estimation to filter out noisy and unhelpful knowledge. Both automatic (3 LLMs x 11 directions x 2 automatic metrics) and human evaluation (preference study and MQM) demonstrate the effectiveness of MAPS. Further analysis shows that by mimicking the human translation process, MAPS reduces various translation errors such as hallucination, ambiguity, mistranslation, awkward style, untranslated text, and omission. Source code is available at https://github.com/zwhe99/MAPS-mt.
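The prompt-then-select pattern in this abstract can be sketched with two caller-supplied functions. Both `generate` (standing in for an LLM call guided by one aspect of knowledge) and `quality` (standing in for a quality-estimation model) are hypothetical placeholders, and this sketch is not the authors' implementation, which is available at the repository linked above.

```python
ASPECTS = ("keywords", "topics", "demonstrations")


def select_translation(source, generate, quality):
    """Generate one candidate per knowledge aspect and keep the best-scoring one.

    generate(source, aspect) -> candidate translation string (hypothetical).
    quality(source, candidate) -> float, higher is better (hypothetical).
    """
    candidates = [generate(source, aspect) for aspect in ASPECTS]
    return max(candidates, key=lambda cand: quality(source, cand))
```

Selecting by an estimated quality score is what lets the pipeline discard a candidate that was steered by noisy or unhelpful knowledge.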

arxiv preprint, knowledge, translation, (14 more...)

arXiv.org Artificial Intelligence

2305.04118

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > China > Shanghai > Shanghai (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Unsupervised Lexical Substitution with Decontextualised Embeddings

Wada, Takashi, Baldwin, Timothy, Matsumoto, Yuji, Lau, Jey Han

arXiv.org Artificial Intelligence, Sep-16-2022

We propose a new unsupervised method for lexical substitution using pre-trained language models. Compared to previous approaches that use the generative capability of language models to predict substitutes, our method retrieves substitutes based on the similarity of contextualised and decontextualised word embeddings, i.e. the average contextual representation of a word in multiple contexts. We conduct experiments in English and Italian, and show that our method substantially outperforms strong baselines and establishes a new state-of-the-art without any explicit supervision or fine-tuning. We further show that our method performs particularly well at predicting low-frequency substitutes, and also generates a diverse list of substitute candidates, reducing morphophonetic or morphosyntactic biases induced by article-noun agreement.
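Retrieving substitutes by comparing a contextualised embedding with decontextualised ones can be sketched as below. Following the abstract, a word's decontextualised embedding is the average of its contextual vectors over several contexts; the use of cosine similarity, the function names, and the input layout are assumptions for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_substitutes(target_ctx_vec, candidate_contexts):
    """Rank candidate words by similarity to the target word's contextual vector.

    candidate_contexts: dict mapping word -> list of contextual vectors,
    one per sampled context; their mean is the decontextualised embedding.
    """
    decontext = {w: mean_vector(vs) for w, vs in candidate_contexts.items()}
    return sorted(decontext, key=lambda w: cosine(target_ctx_vec, decontext[w]), reverse=True)
```

Because candidates are retrieved from precomputed averages rather than generated, no fine-tuning or supervision is needed, which matches the unsupervised setting described.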

machine learning, natural language, substitute, (19 more...)

arXiv.org Artificial Intelligence

2209.08236

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)


Reader-Guided Passage Reranking for Open-Domain Question Answering

Mao, Yuning, He, Pengcheng, Liu, Xiaodong, Shen, Yelong, Gao, Jianfeng, Han, Jiawei, Chen, Weizhu

arXiv.org Artificial Intelligence, Jan-1-2021

Current open-domain question answering (QA) systems often follow a Retriever-Reader (R2) architecture, where the retriever first retrieves relevant passages and the reader then reads the retrieved passages to form an answer. In this paper, we propose a simple and effective passage reranking method, Reader-guIDEd Reranker (Rider), which does not involve any training and reranks the retrieved passages solely based on the top predictions of the reader before reranking. We show that Rider, despite its simplicity, achieves 10 to 20 absolute gains in top-1 retrieval accuracy and 1 to 4 Exact Match (EM) score gains without refining the retriever or reader. In particular, Rider achieves 48.3 EM on the Natural Questions dataset and 66.4 on the TriviaQA dataset when only 1,024 tokens (7.8 passages on average) are used as the reader input.
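A training-free reranker driven by the reader's top predictions can be sketched as follows. This is one plausible reading of the abstract: passages that contain any of the reader's top predicted answers are moved ahead of the rest, with the retriever's order kept inside each group. The matching rule, the normalization, and the name `reader_guided_rerank` are assumptions.

```python
import string


def normalize(text):
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def reader_guided_rerank(passages, top_predictions):
    """Move passages containing a predicted answer to the front (stable order).

    passages: list of passage strings in retriever order.
    top_predictions: the reader's top answer strings from a first pass.
    """
    preds = [normalize(p) for p in top_predictions]
    preds = [p for p in preds if p]  # ignore predictions that normalize to empty
    hits, misses = [], []
    for passage in passages:
        norm = normalize(passage)
        (hits if any(p in norm for p in preds) else misses).append(passage)
    return hits + misses
```

The reranked list would then be truncated to the reader's input budget and read a second time.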

arxiv preprint arxiv, prediction, retrieval accuracy, (11 more...)

arXiv.org Artificial Intelligence

2101.00294

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > Canada (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.64)


BioNLP-OST 2019 RDoC Tasks: Multi-grain Neural Relevance Ranking Using Topics and Attention Based Query-Document-Sentence Interactions

Chaudhary, Yatin, Gupta, Pankaj, Schütze, Hinrich

arXiv.org Machine Learning, Oct-2-2019

This paper presents our system details and the results of our participation in the RDoC Tasks of BioNLP-OST 2019. The Research Domain Criteria (RDoC) construct is a broad, multi-dimensional framework to describe mental health disorders by combining knowledge from genomics to behaviour. The non-availability of an RDoC-labelled dataset and the tedious labelling process hinder the RDoC framework from reaching its full potential in the biomedical research community and healthcare industry. Therefore, Task-1 aims at retrieval and ranking of PubMed abstracts relevant to a given RDoC construct, and Task-2 aims at extraction of the most relevant sentence from a given PubMed abstract. We investigate (1) an attention-based supervised neural topic model and SVM for retrieval and ranking of PubMed abstracts, further utilizing BM25 and other relevance measures for re-ranking, and (2) supervised and unsupervised sentence ranking models utilizing multi-view representations comprising query-aware attention-based sentence representation (QAR), bag-of-words (BoW) and TF-IDF. Our best systems achieved 1st rank and scored 0.86 mean average precision (mAP) and 0.58 macro average accuracy (MAA) in Task-1 and Task-2 respectively.
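The BM25 re-ranking step mentioned above can be illustrated with a compact Okapi BM25 scorer. The tokenized inputs, the parameter values `k1=1.5` and `b=0.75`, and the function names are assumptions for this sketch; the abstract does not say how the authors configured BM25 or combined it with their other relevance measures.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for the query.

    docs: non-empty list of token lists. Uses the smoothed, non-negative IDF
    log(1 + (N - df + 0.5) / (df + 0.5)).
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def bm25_rerank(query_terms, docs):
    """Return document indices sorted from highest to lowest BM25 score."""
    scores = bm25_scores(query_terms, docs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In a two-stage pipeline, a first-stage ranker would supply `docs` and BM25 would only reorder that shortlist.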

rdoc construct, representation, rerank, (14 more...)

arXiv.org Machine Learning

1910.00314

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
North America > United States > Nevada (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(9 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

