syllabification
Design and Implementation of a Tool for Extracting Uzbek Syllables
Salaev, Ulugbek, Kuriyozov, Elmurod, Matlatipov, Gayrat
Accurate syllabification of words plays a vital role in many Natural Language Processing applications. Syllabification is useful in linguistic research, language technology, education, and other fields where understanding and processing language is essential. In this paper, we present a comprehensive approach to syllabification for the Uzbek language, including rule-based techniques and machine learning algorithms. Our rule-based approach divides words into syllables, generates hyphenations for line breaks, and counts syllables. Additionally, we collected a dataset of word-syllable mappings, hyphenations, and syllable counts, which we use both to train machine learning algorithms that predict syllable counts and to evaluate the proposed model. The results of our experiments show that both approaches are effective and efficient, achieving accuracy above 99%. This study provides insights and recommendations for future research on syllabification and related areas, not only for Uzbek but also for other closely related, low-resource Turkic languages.
- Asia > Uzbekistan (0.06)
- South America > Brazil (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
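The three outputs named in the Uzbek syllabification abstract above (syllable split, hyphenation, syllable count) can be illustrated with a minimal sketch. The rule used here is a generic assumption, not the authors' rule set: every syllable has exactly one vowel, and when consonants stand between two vowels only the last one opens the next syllable. The names `VOWELS`, `syllabify`, `hyphenate`, and `count_syllables` are hypothetical. The sketch does not handle Uzbek digraphs (sh, ch, ng) or the letters oʻ and gʻ.

```python
# Assumption: a plain Latin vowel set; the real Uzbek alphabet also has "oʻ".
VOWELS = set("aeiouAEIOU")


def syllabify(word: str) -> list[str]:
    """Split a word into syllables with a generic one-vowel-per-syllable rule."""
    if not word:
        return []
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(vowel_positions) <= 1:
        # Zero or one vowel: the whole word is treated as a single unit.
        return [word]

    syllables = []
    start = 0
    for prev, nxt in zip(vowel_positions, vowel_positions[1:]):
        gap = nxt - prev - 1  # number of consonants between the two vowels
        # Adjacent vowels: cut right before the second vowel.
        # Otherwise: the last consonant of the cluster opens the next syllable.
        cut = nxt if gap == 0 else nxt - 1
        syllables.append(word[start:cut])
        start = cut
    syllables.append(word[start:])
    return syllables


def hyphenate(word: str) -> str:
    """Join syllables with hyphens to mark candidate line-break points."""
    return "-".join(syllabify(word))


def count_syllables(word: str) -> int:
    """Return the number of syllables found by syllabify()."""
    return len(syllabify(word))


if __name__ == "__main__":
    for w in ["kitob", "maktab", "oila"]:
        print(w, syllabify(w), hyphenate(w), count_syllables(w))
```

Under this assumed rule, "kitob" splits as ki-tob and "maktab" as mak-tab. A production tool would need language-specific exceptions, which the abstract does not spell out.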
Revisiting Syllables in Language Modelling and their Application on Low-Resource Machine Translation
Oncevay, Arturo, Rojas, Kervy Dante Rivas, Sanchez, Liz Karen Chavez, Zariquiey, Roberto
Language modelling and machine translation tasks mostly use subword or character inputs, but syllables are seldom used. Syllables provide shorter sequences than characters, require less specialised extraction rules than morphemes, and their segmentation is not affected by corpus size. In this study, we first explore the potential of syllables for open-vocabulary language modelling in 21 languages. We use rule-based syllabification methods for six languages and address the rest with hyphenation, which works as a syllabification proxy. With a comparable perplexity, we show that syllables outperform characters and other subwords. Moreover, we study the importance of syllables in neural machine translation for an unrelated, low-resource language pair (Spanish--Shipibo-Konibo). In pairwise and multilingual systems, syllables outperform unsupervised subwords, as well as morphological segmentation methods, when translating into a highly synthetic language with a transparent orthography (Shipibo-Konibo). Finally, we perform a human evaluation and discuss limitations and opportunities.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Denmark > Capital Region > Copenhagen (0.05)
- Asia > Myanmar (0.04)
Exploiting Syllable Structure in a Connectionist Phonology Model
Touretzky, David S., Wheeler, Deirdre W.
In a previous paper (Touretzky & Wheeler, 1990a) we showed how adding a clustering operation to a connectionist phonology model produced a parallel processing account of certain "iterative" phenomena. In this paper we show how the addition of a second structuring primitive, syllabification, greatly increases the power of the model. We present examples from a non-Indo-European language that appear to require rule ordering to at least a depth of four. By adding syllabification circuitry to structure the model's perception of the input string, we are able to handle these examples with only two derivational steps. We conclude that in phonology, derivation can be largely replaced by structuring.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)