Perception of Phonological Assimilation by Neural Speech Recognition Models

Pouw, Charlotte; Kloots, Marianne de Heer; Alishahi, Afra; Zuidema, Willem

Jun-21-2024–arXiv.org Artificial Intelligence

Any speech recognition system must learn to recognize the intended words regardless of the various ways in which those words may be pronounced. Much of the variability in speech is systematic, arising from phonological processes that occur in predictable environments. One such process is place assimilation, in which a phoneme adopts the place of articulation of an adjacent phoneme. For instance, the word pair clean pan is frequently pronounced as clea[m] pan, with the word-final coronal /n/ in clean assimilating to the following labial [p] in pan. This is a simple yet common phonological process across the world's languages (Hura, Lindblom, and Diehl 1992). In English, it occurs for coronal segments (e.g., /t/, /d/, /n/) that are followed by noncoronals, such as labials (e.g., [p], [b], [m]) or velars (e.g., [k], [g], [ŋ]). Human listeners are able to infer the underlying /n/ when exposed to assimilated inputs like clea[m] pan, allowing them to perceive the intended word clean. This phenomenon is referred to as compensation for assimilation, and it happens automatically; that is, humans compensate without conscious awareness of the assimilation itself. Psycholinguistic research has used controlled stimuli to investigate the mechanism behind this process.
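The assimilation rule described above can be illustrated with a minimal Python sketch. It is a simplified textbook-style illustration and not the method used in the paper: the phoneme sets, the `ASSIMILATED` mapping, and the function names `place_of` and `assimilate` are all assumptions introduced here, and real assimilation is optional and often gradient rather than categorical.

```python
# Simplified illustration of English place assimilation across a word boundary.
# The segment sets and the mapping below are assumptions made for this sketch.

LABIALS = {"p", "b", "m"}
VELARS = {"k", "g", "ŋ"}

# Word-final coronal -> labial or velar counterpart with the same
# voicing and nasality.
ASSIMILATED = {
    "n": {"labial": "m", "velar": "ŋ"},
    "t": {"labial": "p", "velar": "k"},
    "d": {"labial": "b", "velar": "g"},
}


def place_of(segment):
    """Return 'labial', 'velar', or None for any other segment."""
    if segment in LABIALS:
        return "labial"
    if segment in VELARS:
        return "velar"
    return None


def assimilate(first_word, second_word):
    """Assimilate the final coronal of first_word to the onset of second_word.

    Both words are lists of phoneme symbols. A new list is returned for the
    first word; the inputs are left unchanged.
    """
    if not first_word or not second_word:
        return list(first_word)
    final, onset = first_word[-1], second_word[0]
    place = place_of(onset)
    if final in ASSIMILATED and place is not None:
        return first_word[:-1] + [ASSIMILATED[final][place]]
    return list(first_word)


clean = ["k", "l", "i", "n"]
pan = ["p", "æ", "n"]
print(assimilate(clean, pan))  # ['k', 'l', 'i', 'm']
```

Compensation for assimilation is the inverse problem: given the surface form clea[m] followed by pan, the listener (or model) has to recover the underlying /n/, which this forward rule alone does not do.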


