WYWEB: A NLP Evaluation Benchmark For Classical Chinese

Zhou, Bo; Chen, Qianglong; Wang, Tianyu; Zhong, Xiaomi; Zhang, Yin

May-23-2023–arXiv.org Artificial Intelligence

To fully evaluate the overall performance of different NLP models in a given domain, many evaluation benchmarks have been proposed, such as GLUE, SuperGLUE, and CLUE. The field of natural language understanding has traditionally focused on benchmarks for various tasks in languages such as Chinese and English, as well as in multilingual settings. However, little attention has been given to classical Chinese, also known as "wen yan wen", which has a rich history spanning thousands of years and holds significant cultural and academic value. For the prosperity of the NLP community, in this paper we introduce the WYWEB evaluation benchmark, which consists of nine NLP tasks in classical Chinese, covering sentence classification, sequence labeling, reading comprehension, and machine translation. We evaluate existing pre-trained language models, all of which struggle with this benchmark. We also introduce a number of supplementary datasets and additional tools to help facilitate further progress on classical Chinese NLU. The GitHub repository is https://github.com/baudzhou/WYWEB.


