Predicting Punctuation in Ancient Chinese Texts: A Multi-Layered LSTM and Attention-Based Approach
Tracy Cai, Kimmy Chang, Fahad Nabi
arXiv.org Artificial Intelligence
In fact, many ancient Chinese texts contain thousands of lines with no distinct punctuation marks or delimiters in sight. The lack of punctuation in such texts makes it difficult for humans to identify when there are pauses or breaks between particular phrases and to understand the semantic meaning of the written text (Mogahed, 2012). As a result, unless one was educated in the ancient time period, many readers of ancient Chinese would have significantly different interpretations of the texts. We propose an approach to predict the location (and type) of punctuation in ancient Chinese texts that extends the work of Oh et al. (2017) by leveraging a bidirectional multi-layered LSTM with a multi-head attention mechanism, as inspired by Luong et al.'s (2015) discussion of attention-based architectures. We find that the use of multi-layered LSTMs and multi-head …

Previous approaches have experimented with Encoder-Decoder RNNs, GRUs, and LSTMs, as well as different single-headed attention structures (local and global), to successfully conduct language translation tasks. One recent work that built an efficient model for optimal performance in a task similar to ours (predicting line breaks) is that of Oh et al. In Oh et al. (2017), researchers were able to predict where line breaks ought to be in Hanmun (a punctuation-lacking Korean script) with a multi-layered LSTM model that incorporated an end-of-sentence attention mechanism. As Luong et al. (2015) found local attention models to significantly outperform non-attentional ones on translation tasks between English and German, we were inspired to improve upon Oh et al.'s approach towards line-break prediction by paying special attention to the attention model.
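To make the architecture concrete, the multi-head attention layer that the abstract places on top of the bidirectional LSTM encoder can be sketched as below. This is an illustrative NumPy implementation under our own assumptions (sequence length, model width, head count, and random projection matrices are all hypothetical), not the authors' code.

```python
# Sketch: multi-head scaled dot-product attention over a sequence of encoder
# hidden states, as one might apply to bidirectional-LSTM outputs before a
# per-character punctuation classifier. All dimensions are illustrative.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(H, Wq, Wk, Wv, Wo, n_heads):
    """H: (seq_len, d_model) hidden states; Wq/Wk/Wv/Wo: (d_model, d_model)."""
    seq_len, d_model = H.shape
    d_head = d_model // n_heads

    def split(X):
        # (seq_len, d_model) -> (n_heads, seq_len, d_head)
        return X.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(H @ Wq), split(H @ Wk), split(H @ Wv)
    # Scaled dot-product attention, computed independently per head.
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)   # (n_heads, L, L)
    weights = softmax(scores, axis=-1)                    # rows sum to 1
    ctx = weights @ V                                     # (n_heads, L, d_head)
    # Concatenate heads and apply the output projection.
    ctx = ctx.transpose(1, 0, 2).reshape(seq_len, d_model)
    return ctx @ Wo, weights

rng = np.random.default_rng(0)
seq_len, d_model, n_heads = 8, 16, 4            # hypothetical sizes
H = rng.standard_normal((seq_len, d_model))     # stand-in for LSTM outputs
Ws = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4)]
out, attn = multi_head_attention(H, *Ws, n_heads=n_heads)
print(out.shape, attn.shape)  # (8, 16) (4, 8, 8)
```

Splitting the model width across several heads lets each head attend to a different region of the sequence, which is the contrast the paper draws with the single-headed local and global attention of earlier work.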
Sep-16-2024
- Country:
- Asia > China (0.04)
- North America > United States
- California > Santa Clara County > Palo Alto (0.05)
- Genre:
- Research Report (0.65)
- Technology: