AITopics | refinement technique

Comparing Unsupervised Word Translation Methods Step by Step

Neural Information Processing SystemsDec-26-2025, 00:17:28 GMT

Cross-lingual word vector space alignment is the task of mapping the vocabularies of two languages into a shared semantic space, which can be used for dictionary induction, unsupervised machine translation, and transfer learning. In the unsupervised regime, an initial seed dictionary is learned in the absence of any known correspondences between words, through {\bf distribution matching}, and the seed dictionary is then used to supervise the induction of the final alignment in what is typically referred to as a (possibly iterative) {\bf refinement} step. We focus on the first step and compare distribution matching techniques in the context of language pairs for which mixed training stability and evaluation scores have been reported. We show that, surprisingly, when looking at this initial step in isolation, vanilla GANs are superior to more recent methods, both in terms of precision and robustness. The improvements reported by more recent methods thus stem from the refinement techniques, and we show that we can obtain state-of-the-art performance combining vanilla GANs with such refinement techniques.

name change, proceedings, unsupervised word translation method step, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Add feedback

Comparing Unsupervised Word Translation Methods Step by Step

Mareike Hartmann, Yova Kementchedjhieva, Anders Søgaard

Neural Information Processing SystemsAug-20-2025, 03:30:47 GMT

Neural Information Processing Systems http://nips.cc/

dictionary induction, proceedings, seed dictionary, (13 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Comparing Unsupervised Word Translation Methods Step by Step

Neural Information Processing SystemsOct-11-2024, 00:06:32 GMT

Cross-lingual word vector space alignment is the task of mapping the vocabularies of two languages into a shared semantic space, which can be used for dictionary induction, unsupervised machine translation, and transfer learning. In the unsupervised regime, an initial seed dictionary is learned in the absence of any known correspondences between words, through {\bf distribution matching}, and the seed dictionary is then used to supervise the induction of the final alignment in what is typically referred to as a (possibly iterative) {\bf refinement} step. We focus on the first step and compare distribution matching techniques in the context of language pairs for which mixed training stability and evaluation scores have been reported. We show that, surprisingly, when looking at this initial step in isolation, vanilla GANs are superior to more recent methods, both in terms of precision and robustness. The improvements reported by more recent methods thus stem from the refinement techniques, and we show that we can obtain state-of-the-art performance combining vanilla GANs with such refinement techniques.

refinement technique, seed dictionary, unsupervised word translation method step, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.83)

Add feedback

Hybrid Student-Teacher Large Language Model Refinement for Cancer Toxicity Symptom Extraction

Khanmohammadi, Reza, Ghanem, Ahmed I., Verdecchia, Kyle, Hall, Ryan, Elshaikh, Mohamed, Movsas, Benjamin, Bagher-Ebadian, Hassan, Luo, Bing, Chetty, Indrin J., Alhanai, Tuka, Thind, Kundan, Ghassemi, Mohammad M.

arXiv.org Artificial IntelligenceAug-8-2024

Large Language Models (LLMs) offer significant potential for clinical symptom extraction, but their deployment in healthcare settings is constrained by privacy concerns, computational limitations, and operational costs. This study investigates the optimization of compact LLMs for cancer toxicity symptom extraction using a novel iterative refinement approach. We employ a student-teacher architecture, utilizing Zephyr-7b-beta and Phi3-mini-128 as student models and GPT-4o as the teacher, to dynamically select between prompt refinement, Retrieval-Augmented Generation (RAG), and fine-tuning strategies. Our experiments on 294 clinical notes covering 12 post-radiotherapy toxicity symptoms demonstrate the effectiveness of this approach. The RAG method proved most efficient, improving average accuracy scores from 0.32 to 0.73 for Zephyr-7b-beta and from 0.40 to 0.87 for Phi3-mini-128 during refinement. In the test set, both models showed an approximate 0.20 increase in accuracy across symptoms. Notably, this improvement was achieved at a cost 45 times lower than GPT-4o for Zephyr and 79 times lower for Phi-3.

fine-tuning, student model, symptom extraction, (14 more...)

arXiv.org Artificial Intelligence

2408.04775

Country:

North America > United States > Michigan (0.04)
North America > United States > New York (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Comparing Unsupervised Word Translation Methods Step by Step

Hartmann, Mareike, Kementchedjhieva, Yova, Søgaard, Anders

Neural Information Processing SystemsMar-18-2020, 23:01:20 GMT

Cross-lingual word vector space alignment is the task of mapping the vocabularies of two languages into a shared semantic space, which can be used for dictionary induction, unsupervised machine translation, and transfer learning. In the unsupervised regime, an initial seed dictionary is learned in the absence of any known correspondences between words, through {\bf distribution matching}, and the seed dictionary is then used to supervise the induction of the final alignment in what is typically referred to as a (possibly iterative) {\bf refinement} step. We focus on the first step and compare distribution matching techniques in the context of language pairs for which mixed training stability and evaluation scores have been reported. We show that, surprisingly, when looking at this initial step in isolation, vanilla GANs are superior to more recent methods, both in terms of precision and robustness. The improvements reported by more recent methods thus stem from the refinement techniques, and we show that we can obtain state-of-the-art performance combining vanilla GANs with such refinement techniques.

refinement technique, seed dictionary, unsupervised word translation method step, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.88)

Add feedback

Filters

Collaborating Authors

refinement technique

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Comparing Unsupervised Word Translation Methods Step by Step

Comparing Unsupervised Word Translation Methods Step by Step

Comparing Unsupervised Word Translation Methods Step by Step

Hybrid Student-Teacher Large Language Model Refinement for Cancer Toxicity Symptom Extraction

Comparing Unsupervised Word Translation Methods Step by Step