A Rule-Based Approach For Aligning Japanese-Spanish Sentences From A Comparable Corpora

Nov-19-2012–arXiv.org Artificial Intelligence

The performance of a Statistical Machine Translation (SMT) system depends directly on the quality and size of the parallel corpus it uses. However, for some language pairs such corpora are severely lacking. Our long-term goal is to construct a Japanese-Spanish parallel corpus for use in SMT, since useful Japanese-Spanish parallel corpora are scarce. To address this problem, in this study we propose a method for extracting Japanese-Spanish parallel sentences from Wikipedia using POS tagging and a rule-based approach. The approach focuses mainly on the syntactic features of both languages. Human evaluation performed on a sample shows promising results in comparison with the baseline.

artificial intelligence, machine translation, natural language

Country:
- North America > United States (0.14)
- Asia
  - Taiwan (0.14)
  - Japan (0.14)
  - India (0.14)

Genre:
- Research Report > New Finding (0.35)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Rule-Based Reasoning (1.00)
  - Natural Language > Machine Translation (0.90)
