Adaptative Bilingual Aligning Using Multilingual Sentence Embedding
arXiv.org Artificial Intelligence
In this paper, we present an adaptive bitextual alignment system called AIlign. This aligner relies on sentence embeddings to extract reliable anchor points that can guide the alignment path, even for texts whose parallelism is fragmentary and not strictly monotonic. In an experiment on several datasets, we show that AIlign achieves results equivalent to the state of the art, with quasi-linear complexity. In addition, AIlign is able to handle texts whose parallelism and monotonicity properties are only satisfied locally, unlike recent systems such as Vecalign or Bertalign.
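To make the idea of embedding-based anchor points concrete, here is a minimal sketch. The abstract does not state how AIlign selects its anchors, so the criterion used here (mutual nearest neighbours by cosine similarity, above a threshold) is an assumption for illustration, not the authors' method. The function name `candidate_anchor_points` and the parameter `min_sim` are hypothetical, and the embeddings are assumed to have been computed beforehand by some multilingual sentence encoder.

```python
import numpy as np


def candidate_anchor_points(src_emb, tgt_emb, min_sim=0.5):
    """Return (i, j, similarity) triples where source sentence i and target
    sentence j are each other's nearest neighbour by cosine similarity.

    Illustrative assumption only: this is not the selection rule from the paper.
    """
    src = np.asarray(src_emb, dtype=float)
    tgt = np.asarray(tgt_emb, dtype=float)
    # Normalise rows so that a dot product equals cosine similarity.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sim = src @ tgt.T                 # shape: (n_source, n_target)
    best_tgt = sim.argmax(axis=1)     # best target for each source sentence
    best_src = sim.argmax(axis=0)     # best source for each target sentence
    anchors = []
    for i, j in enumerate(best_tgt):
        # Keep the pair only if the match is mutual and similar enough.
        if best_src[j] == i and sim[i, j] >= min_sim:
            anchors.append((i, int(j), float(sim[i, j])))
    return anchors


# Toy embeddings in which the first two sentences appear in swapped order.
source = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [[0.0, 1.0, 0.1], [1.0, 0.1, 0.0], [0.1, 0.0, 1.0]]
pairs = [(i, j) for i, j, _ in candidate_anchor_points(source, target)]
```

No monotonicity filter is applied, so crossing pairs such as the swapped sentences in the toy data are kept, in line with the abstract's point about texts that are not strictly monotonic. The sketch builds the full similarity matrix and is therefore quadratic in the number of sentences; it makes no attempt to reproduce the quasi-linear complexity reported for AIlign.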
Mar-18-2024
- Country:
- Oceania > Australia
- North America
- United States
- Ohio > Franklin County
- Columbus (0.04)
- Colorado > Denver County
- Denver (0.04)
- Canada > Quebec
- Montreal (0.04)
- Europe
- Slovenia (0.04)
- Netherlands
- South Holland > Dordrecht (0.04)
- North Holland > Amsterdam (0.04)
- Middle East > Malta
- Port Region > Southern Harbour District > Valletta (0.04)
- Italy > Tuscany
- Florence (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- France
- Provence-Alpes-Côte d'Azur > Alpes-Maritimes
- Nice (0.04)
- Auvergne-Rhône-Alpes > Isère
- Grenoble (0.05)
- Asia > China
- Genre:
- Research Report (0.64)
- Technology: