Back to Patterns: Efficient Japanese Morphological Analysis with Feature-Sequence Trie
arXiv.org Artificial Intelligence
Accurate neural models are much less efficient than non-neural models and are useless for processing billions of social media posts or handling user queries in real time with a limited budget. This study revisits the fastest pattern-based NLP methods to make them as accurate as possible, thus yielding a strikingly simple yet surprisingly accurate morphological analyzer for Japanese. The proposed method induces reliable patterns from a morphological dictionary and annotated data. Experimental results on two standard datasets confirm that the method exhibits comparable accuracy to learning-based baselines, while boasting a remarkable throughput of over 1,000,000 sentences per second on a single modern CPU. The source code is available at https://www.tkl.iis.u-tokyo.ac.jp/~ynaga/jagger/
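To make the idea of pattern-based analysis concrete, here is a minimal sketch of longest-match lookup over a character trie. It is not the paper's implementation: the toy pattern table, the tag names, and the helpers `build_trie` and `analyze` are all hypothetical. A real system would induce its patterns from a morphological dictionary and annotated data, as the abstract describes.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of a character trie; `tag` is set when a pattern ends here."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.tag: Optional[str] = None


def build_trie(patterns: Dict[str, str]) -> TrieNode:
    """Insert every (surface string -> tag) pattern into a trie."""
    root = TrieNode()
    for surface, tag in patterns.items():
        node = root
        for ch in surface:
            node = node.children.setdefault(ch, TrieNode())
        node.tag = tag
    return root


def analyze(text: str, root: TrieNode) -> List[Tuple[str, str]]:
    """Greedy longest-match segmentation; unmatched characters become UNK."""
    tokens: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        node = root
        # Fallback: emit a single unknown character if no pattern matches.
        best_end, best_tag = i + 1, "UNK"
        j = i
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.tag is not None:
                # Remember the longest pattern seen so far.
                best_end, best_tag = j, node.tag
        tokens.append((text[i:best_end], best_tag))
        i = best_end
    return tokens


# Hypothetical toy patterns, for illustration only.
TOY_PATTERNS = {"東京": "NOUN", "東京都": "NOUN", "に": "PARTICLE", "住む": "VERB"}
trie = build_trie(TOY_PATTERNS)
result = analyze("東京都に住む", trie)
print(result)
```

Each input position is resolved with a single walk down the trie and no model inference, which is the kind of work that lets pattern-based methods run fast. The sketch leaves out the context-sensitive part of the approach, such as conditioning a pattern on preceding tags.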
May-30-2023
- Country:
  - Asia
  - Europe
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
    - Bulgaria > Varna Province
      - Varna (0.04)
    - Greece > Attica
      - Athens (0.04)
    - Ireland (0.04)
    - Netherlands > South Holland
      - Dordrecht (0.04)
    - Portugal > Lisbon
      - Lisbon (0.04)
    - Spain > Galicia
      - Madrid (0.04)
    - Sweden
      - Uppsala County > Uppsala (0.04)
      - Vaestra Goetaland > Gothenburg (0.04)
  - North America > United States
    - Hawaii > Honolulu County
      - Honolulu (0.04)
    - Oregon > Multnomah County
      - Portland (0.04)
- Genre:
  - Research Report (0.50)
- Technology: