What does it take to get state of the art in simultaneous speech-to-speech translation?

Sep-14-2024–arXiv.org Artificial Intelligence

This paper presents an in-depth analysis of the latency characteristics of a simultaneous speech-to-speech model, focusing in particular on hallucination-induced latency spikes. By systematically experimenting with various input parameters and conditions, we propose methods to minimize these spikes and improve overall performance. The findings suggest that a combination of careful input management and strategic parameter adjustments can significantly improve the model's latency behavior.

hallucination, latency, translation

Country:
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:
- Research Report > New Finding (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Natural Language > Machine Translation (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.47)
