Word Alignment in the Era of Deep Learning: A Tutorial
arXiv.org Artificial Intelligence
The word alignment task, despite its prominence in the era of statistical machine translation (SMT), is niche and under-explored today. In this two-part tutorial, we argue for the continued relevance of word alignment. The first part provides historical background on word alignment as a core component of the traditional SMT pipeline. We zero in on GIZA++, an unsupervised, statistical word aligner with surprising longevity. Jumping forward to the era of neural machine translation (NMT), we show how insights from word alignment inspired the attention mechanism fundamental to present-day NMT. The second part shifts to a survey approach. We cover neural word aligners, showing the slow but steady progress towards surpassing GIZA++ performance. Finally, we cover the present-day applications of word alignment, from cross-lingual annotation projection to improving translation.
Nov-30-2022
- Genre:
- Overview (0.93)
- Workflow (0.93)
- Instructional Material > Course Syllabus & Notes (0.84)
- Research Report (0.63)
- Technology: