Chinese Spelling Check with Nearest Neighbors

Yin, Xunjian, Hu, Xinyu, Wan, Xiaojun

arXiv.org Artificial Intelligence 

Chinese Spelling Check (CSC) aims to detect and correct erroneous tokens in Chinese text and has a wide range of applications. In this paper, we introduce InfoKNN-CSC, which extends the standard CSC model by linearly interpolating it with a k-nearest neighbors (kNN) model. The phonetic, graphic, and contextual information (info) of tokens and contexts is incorporated into the design of the kNN query and key, according to the characteristics of the task. After retrieval, to match the candidates more accurately, we also rerank them based on the overlap between the n-gram values and the inputs. Experiments on the SIGHAN benchmarks demonstrate that the proposed model achieves state-of-the-art performance with substantial improvements over existing work.
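The linear interpolation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common kNN-augmented formulation, p(y) = λ · p_kNN(y) + (1 − λ) · p_model(y), where p_kNN is a softmax over negative neighbor distances aggregated by the token each neighbor stores. The function names, the temperature parameter, the value of λ, and the toy numbers are all hypothetical; the abstract does not specify them, and the phonetic/graphic key design and the n-gram reranking step are not modeled here.

```python
import math
from collections import defaultdict


def knn_distribution(neighbors, temperature=1.0):
    """Turn retrieved (token, distance) pairs into a distribution over tokens.

    Assumption: closer neighbors get more weight via a softmax over
    negative distances; weights of neighbors sharing a token are summed.
    """
    if not neighbors:
        return {}
    scores = [-dist / temperature for _, dist in neighbors]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    probs = defaultdict(float)
    for (token, _), weight in zip(neighbors, weights):
        probs[token] += weight / total
    return dict(probs)


def interpolate(model_probs, knn_probs, lam=0.5):
    """Linearly interpolate the CSC model's distribution with the kNN one."""
    if not knn_probs:
        # Nothing retrieved: fall back to the base model unchanged.
        return dict(model_probs)
    vocab = set(model_probs) | set(knn_probs)
    return {
        tok: lam * knn_probs.get(tok, 0.0) + (1.0 - lam) * model_probs.get(tok, 0.0)
        for tok in vocab
    }


# Toy example: the base model slightly prefers "他", but the retrieved
# neighbors mostly store "她", so the interpolated prediction shifts.
model_probs = {"他": 0.6, "她": 0.3, "它": 0.1}
neighbors = [("她", 0.0), ("她", 0.0), ("他", 0.0)]  # (stored token, distance)
combined = interpolate(model_probs, knn_distribution(neighbors), lam=0.5)
prediction = max(combined, key=combined.get)
```

In this sketch λ controls how much the retrieved evidence can override the base model: λ = 0 recovers the base model's prediction, while larger values let frequent, close neighbors dominate.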
