An Analysis of Letter Dynamics in the English Alphabet
arXiv.org Artificial Intelligence
The tabulation of commonly used letters, as determined by letter frequency, was later used to improve typewriter keyboard arrangement by minimizing hand motion [5]. The statistical characteristics of the letters of the English alphabet were further studied in the context of different sentence structures [6]. The letters 'B', 'S', 'M', 'H', 'C' were found to occur most frequently as the initial letters of proper nouns, while 'E', 'A', 'R', 'N' were the most frequently used letters when the entire proper noun is considered. For entire text documents, the most commonly used letters were found to be 'E', 'T', 'A', 'O', 'N'. Interestingly, 95% of the English vocabulary was found to be represented by 13 letters of the alphabet. Our manuscript expands upon the statistical study of the English alphabet by evaluating letter frequency in the context of different categories of writing. We analyzed news articles, novels, plays, and scientific articles for letter frequency and distribution. From this analysis, we determined the information density of the letters of the alphabet. Additionally, we developed a metric called "distance, d" to act as a simple algorithm for recognizing writing category.
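The kind of analysis described above can be illustrated with a minimal Python sketch. This excerpt does not define the paper's "distance, d" or its measure of information density, so the sketch substitutes two standard stand-ins: Shannon entropy of the letter distribution as a rough proxy for information density, and Euclidean distance between two letter-frequency vectors as a hypothetical distance. The function names `letter_frequencies`, `shannon_entropy`, and `distance` are illustrative and do not come from the manuscript.

```python
from collections import Counter
import math
import string

def letter_frequencies(text):
    """Relative frequency of each letter A-Z in `text`, ignoring case and non-letters."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    counts = Counter(letters)
    total = len(letters)
    if total == 0:
        # No letters at all: return an all-zero distribution.
        return {ch: 0.0 for ch in string.ascii_uppercase}
    return {ch: counts[ch] / total for ch in string.ascii_uppercase}

def shannon_entropy(freqs):
    """Shannon entropy in bits per letter (assumed proxy for information density)."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)

def distance(freqs_a, freqs_b):
    """Euclidean distance between two frequency vectors (assumed stand-in for d)."""
    return math.sqrt(
        sum((freqs_a[ch] - freqs_b[ch]) ** 2 for ch in string.ascii_uppercase)
    )

if __name__ == "__main__":
    # Two toy samples standing in for different writing categories.
    news = letter_frequencies("The council approved the budget on Tuesday.")
    play = letter_frequencies("O Romeo, Romeo! Wherefore art thou Romeo?")
    print("entropy (news):", shannon_entropy(news))
    print("distance (news vs. play):", distance(news, play))
```

Under these assumptions, a text could be assigned to the category whose reference frequency profile lies at the smallest distance from it. Whether the authors' metric behaves this way cannot be confirmed from this excerpt.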
Jan-27-2024