A Simple and Effective Method of Cross-Lingual Plagiarism Detection
Avetisyan, Karen, Malajyan, Arthur, Ghukasyan, Tsolak, Avetisyan, Arutyun
arXiv.org Artificial Intelligence
We present a simple cross-lingual plagiarism detection method applicable to a large number of languages. The approach leverages open multilingual thesauri for the candidate retrieval task and pre-trained multilingual BERT-based language models for detailed analysis. When in use, the method relies on neither machine translation nor word sense disambiguation, and is therefore suitable for a large number of languages, including under-resourced ones. The effectiveness of the proposed approach is demonstrated on several existing and new benchmarks, achieving state-of-the-art results for French, Russian, and Armenian.
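The two stages named in the abstract (thesaurus-based candidate retrieval, then detailed analysis) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy thesaurus, the function names, the threshold, and the ranking by cosine similarity over concept counts are all assumptions made for illustration. The detailed-analysis scorer is left as a pluggable function: the abstract says the paper uses multilingual BERT-based models there, whereas this sketch only reuses the concept-overlap score as a stand-in.

```python
from collections import Counter
from math import sqrt
from typing import Callable, Dict, List, Tuple

# Hypothetical toy thesaurus: surface word -> language-independent concept ID.
# A real system would load an open multilingual thesaurus instead.
TOY_THESAURUS: Dict[str, str] = {
    "cat": "C_CAT", "chat": "C_CAT",
    "sleeps": "C_SLEEP", "dort": "C_SLEEP",
    "sofa": "C_SOFA", "canapé": "C_SOFA",
    "dog": "C_DOG", "chien": "C_DOG",
    "barks": "C_BARK", "aboie": "C_BARK",
}


def to_concepts(sentence: str) -> Counter:
    """Map each known word to its concept ID; unknown words are dropped."""
    words = sentence.lower().split()
    return Counter(TOY_THESAURUS[w] for w in words if w in TOY_THESAURUS)


def concept_similarity(a: str, b: str) -> float:
    """Cosine similarity between the concept-count vectors of two sentences."""
    ca, cb = to_concepts(a), to_concepts(b)
    dot = sum(ca[k] * cb[k] for k in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def retrieve_candidates(query: str, sources: List[str], top_k: int = 1) -> List[str]:
    """Stage 1: rank source sentences by concept overlap, keep the top_k."""
    ranked = sorted(sources, key=lambda s: concept_similarity(query, s), reverse=True)
    return ranked[:top_k]


def detect(
    query: str,
    sources: List[str],
    score: Callable[[str, str], float],
    threshold: float = 0.8,  # assumed value, not taken from the paper
    top_k: int = 1,
) -> List[Tuple[str, float]]:
    """Stage 2: re-score retrieved candidates and keep those above threshold.

    `score` stands in for the detailed-analysis model; a multilingual
    sentence-similarity model would be plugged in here.
    """
    results = []
    for candidate in retrieve_candidates(query, sources, top_k):
        s = score(query, candidate)
        if s >= threshold:
            results.append((candidate, s))
    return results
```

For example, calling detect("the cat sleeps on the sofa", ["le chat dort sur le canapé", "le chien aboie"], score=concept_similarity) flags the first French sentence, since both sentences map to the same three concept IDs without any translation step.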
Apr-5-2023
- Country:
- Asia
- Europe > Russia
- Central Federal District > Moscow Oblast > Moscow (0.04)
- North America > United States
- New York > New York County > New York City (0.04)
- Genre:
- Research Report (0.64)