TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

Neural Information Processing Systems

There is rising research interest in directly translating speech from one language to another, known as end-to-end speech-to-speech translation. However, most end-to-end models struggle to outperform cascade models, i.e., pipeline frameworks that concatenate speech recognition, machine translation, and text-to-speech models. The primary challenges stem from the inherent complexity of the direct translation task and the scarcity of data. In this study, we introduce a novel model framework, TransVIP, that leverages diverse datasets in a cascade fashion yet facilitates end-to-end inference through joint probability. Furthermore, we propose two separate encoders to preserve the speaker's voice characteristics and the isochrony of the source speech during translation, making the model highly suitable for scenarios such as video dubbing. Our experiments on the French-English language pair demonstrate that our model outperforms the current state-of-the-art speech-to-speech translation model.
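As a rough illustration of cascaded stages combined with joint-probability inference, the sketch below scores complete hypotheses by the sum of stage log-probabilities rather than committing to each stage's single best output. The `asr`, `mt`, and `tts` callables and their beam interface are hypothetical stand-ins, not TransVIP's actual components:

```python
import math

# Minimal sketch (not the authors' code): each stage proposes candidates with
# log-probabilities, and the final hypothesis maximizes the *joint* score
# across stages instead of greedily picking each stage's top output.

def joint_decode(src_speech, asr, mt, tts, beam=4):
    """Return the target speech whose joint log-probability is maximal.

    asr/mt/tts are hypothetical callables yielding (candidate, log_prob) pairs.
    """
    best, best_score = None, -math.inf
    for text_src, lp_asr in asr(src_speech, beam):          # speech recognition
        for text_tgt, lp_mt in mt(text_src, beam):          # machine translation
            for speech_tgt, lp_tts in tts(text_tgt, beam):  # speech synthesis
                score = lp_asr + lp_mt + lp_tts             # joint log-probability
                if score > best_score:
                    best, best_score = speech_tgt, score
    return best, best_score
```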



A Appendix

Neural Information Processing Systems

Figure caption fragment. B: GPT-3's MSE averaged over trials for each task. D: GPT-3's prior expectations across tasks (blue) compared to the true task distribution (orange).

We added task similarity as a regressor on Experiment 1's and 2's respective MSE/regret regression bar plots (Figures 7A and 7B). For Experiment 1, we quantified task similarity as the average negative L2 norm of the underlying parameters (slope and intercept) relative to previous tasks. For Experiment 2, we quantified task similarity as the average difference of mean rewards relative to previous tasks.
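A minimal numpy sketch of these two regressors, under one plausible reading of the text (the aggregation and the sign convention for Experiment 2 are assumptions):

```python
import numpy as np

# Hedged sketch of the two task-similarity regressors described above; the
# exact implementation is not given in the text, so this is one plausible
# reading. `history` holds the parameters/rewards of previously seen tasks.

def similarity_exp1(current_params, history):
    """Average negative L2 norm between (slope, intercept) of the current
    task and each previous task (Experiment 1)."""
    cur = np.asarray(current_params, dtype=float)  # [slope, intercept]
    dists = [np.linalg.norm(cur - np.asarray(p, dtype=float)) for p in history]
    return -float(np.mean(dists))

def similarity_exp2(current_mean_reward, history_mean_rewards):
    """Average difference of mean rewards with previous tasks (Experiment 2)."""
    diffs = [abs(current_mean_reward - r) for r in history_mean_rewards]
    return -float(np.mean(diffs))  # sign convention assumed: larger = more similar
```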


Meta-in-context learning in large language models

Julian Coda-Forno, Marcel Binz, Matthew Botvinick

Neural Information Processing Systems

Large language models have shown tremendous performance in a variety of tasks. In-context learning - the ability to improve at a task after being provided with a number of demonstrations - is seen as one of the main contributors to their success. In the present paper, we demonstrate that the in-context learning abilities of large language models can be recursively improved via in-context learning itself.
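One plausible way to picture this setup is a prompt that concatenates several earlier tasks' demonstrations before the current task, so that in-context learning on the current task benefits from the preceding ones. The prompt template below is purely illustrative, not the authors' exact format:

```python
# Hypothetical meta-in-context prompt builder: earlier tasks (each a list of
# (x, y) demonstrations) are concatenated before the current task, and the
# model is queried for the next y. The wording is illustrative only.

def build_meta_prompt(previous_tasks, current_task_examples, query_x):
    """previous_tasks: list of lists of (x, y) pairs; returns one prompt string."""
    lines = []
    for i, task in enumerate(previous_tasks, start=1):
        lines.append(f"Task {i}:")
        lines.extend(f"x = {x}, y = {y}" for x, y in task)
    lines.append(f"Task {len(previous_tasks) + 1}:")
    lines.extend(f"x = {x}, y = {y}" for x, y in current_task_examples)
    lines.append(f"x = {query_x}, y =")  # the model completes the prediction
    return "\n".join(lines)
```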



AVeriTeC: A Dataset for Real-world Claim Verification with Evidence from the Web

Neural Information Processing Systems

Existing datasets for automated fact-checking have substantial limitations, such as relying on artificial claims, lacking annotations for evidence and intermediate reasoning, or including evidence published after the claim.


End-To-End Latent Variational Diffusion Models for Inverse Problems in High Energy Physics

Neural Information Processing Systems

High-energy collisions at the Large Hadron Collider (LHC) provide valuable insights into open questions in particle physics. However, detector effects must be corrected before measurements can be compared to certain theoretical predictions or to measurements from other detectors. Methods for solving this inverse problem, mapping detector observations to theoretical quantities of the underlying collision, are essential parts of many physics analyses at the LHC. We investigate and compare various generative deep learning methods for approximating this inverse mapping. We introduce a novel unified architecture, termed latent variational diffusion models, which combines the latent learning of cutting-edge generative art approaches with an end-to-end variational framework. We demonstrate the effectiveness of this approach for reconstructing global distributions of theoretical kinematic quantities and for ensuring that the learned posterior distributions adhere to known physics constraints. Our unified approach achieves a distribution-free distance to the truth over 20 times smaller than that of a non-latent state-of-the-art baseline, and 3 times smaller than that of traditional latent diffusion models.
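As a very rough sketch of the idea (not the paper's architecture), the snippet below pairs a VAE-style encoder/decoder with a latent-space denoiser trained under a single end-to-end objective. The network sizes, the linear noising schedule, and the unconditional setup are placeholder assumptions; in particular, conditioning on detector observations is omitted for brevity:

```python
import torch
import torch.nn as nn

# Placeholder sketch of a latent variational diffusion model: an encoder maps
# the target quantity into a latent space, a denoiser is trained on noised
# latents, and a decoder reconstructs the target, all under one loss.

class LatentVariationalDiffusion(nn.Module):
    def __init__(self, x_dim=64, z_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(x_dim, 128), nn.SiLU(), nn.Linear(128, z_dim))
        self.decoder = nn.Sequential(nn.Linear(z_dim, 128), nn.SiLU(), nn.Linear(128, x_dim))
        # denoiser predicts the noise added to a latent at diffusion time t
        self.denoiser = nn.Sequential(nn.Linear(z_dim + 1, 128), nn.SiLU(), nn.Linear(128, z_dim))

    def loss(self, x):
        z = self.encoder(x)                                   # latent of the target quantity
        t = torch.rand(x.shape[0], 1)                         # random diffusion times in (0, 1)
        eps = torch.randn_like(z)
        z_t = (1 - t) * z + t * eps                           # simple linear noising schedule (assumed)
        eps_hat = self.denoiser(torch.cat([z_t, t], dim=-1))  # noise prediction
        diffusion_loss = ((eps_hat - eps) ** 2).mean()
        recon_loss = ((self.decoder(z) - x) ** 2).mean()      # reconstruction term
        return diffusion_loss + recon_loss                    # single end-to-end objective
```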