Back to Patterns: Efficient Japanese Morphological Analysis with Feature-Sequence Trie

May-30-2023 – arXiv.org Artificial Intelligence

Accurate neural models are much less efficient than non-neural models and are useless for processing billions of social media posts or handling user queries in real time with a limited budget. This study revisits the fastest pattern-based NLP methods to make them as accurate as possible, thus yielding a strikingly simple yet surprisingly accurate morphological analyzer for Japanese. The proposed method induces reliable patterns from a morphological dictionary and annotated data. Experimental results on two standard datasets confirm that the method exhibits comparable accuracy to learning-based baselines, while boasting a remarkable throughput of over 1,000,000 sentences per second on a single modern CPU. The source code is available at https://www.tkl.iis.u-tokyo.ac.jp/~ynaga/jagger/
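To give a feel for why pattern-based analysis can be so fast, here is a minimal sketch of greedy longest-match segmentation over a character trie. It is not the paper's method: the pattern table `TOY_PATTERNS`, the tag names, and the helpers `build_trie` and `segment` are hypothetical and hand-written for illustration, whereas the abstract describes patterns induced from a morphological dictionary and annotated data. The sketch only shows the general idea that each token is resolved by a single walk down a trie, with no model inference.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of a character trie; `tag` is set where a pattern ends."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.tag: Optional[str] = None


def build_trie(patterns: Dict[str, str]) -> TrieNode:
    """Insert every surface string of `patterns` into a trie, storing its tag."""
    root = TrieNode()
    for surface, tag in patterns.items():
        node = root
        for ch in surface:
            node = node.children.setdefault(ch, TrieNode())
        node.tag = tag
    return root


def segment(text: str, root: TrieNode, unknown_tag: str = "UNK") -> List[Tuple[str, str]]:
    """Greedily cut `text` into (surface, tag) pairs using the longest matching pattern."""
    tokens: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        node = root
        # Fall back to a single unknown character if no pattern matches here.
        best_end, best_tag = i + 1, unknown_tag
        j = i
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.tag is not None:
                # Remember the longest pattern seen so far on this walk.
                best_end, best_tag = j, node.tag
        tokens.append((text[i:best_end], best_tag))
        i = best_end
    return tokens


# Hypothetical toy patterns (assumption: hand-written, not induced from data).
TOY_PATTERNS = {"東京": "NOUN", "東京都": "NOUN", "に": "PARTICLE", "住む": "VERB"}

if __name__ == "__main__":
    print(segment("東京都に住む", build_trie(TOY_PATTERNS)))
```

In this toy setup the longer pattern 東京都 wins over its prefix 東京, and a character with no matching pattern is emitted as a one-character token with the placeholder tag. A real analyzer would need patterns that also account for context, which this sketch deliberately leaves out.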

Country:
- Asia > Japan
  - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.25)
- Europe (1.00)
- North America > United States (0.68)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (0.46)
  - Natural Language
    - Grammars & Parsing (0.74)
    - Text Processing (0.68)