Asymmetries in mutual intelligibility: An information-theoretic approach

Tarassov Colomar, Víctor (Sánchez, David)
Master's Thesis (2024)

This study explores the phonetic distance between languages using relative entropy, an intrinsically asymmetric measure of how much one probability distribution differs from another. Phoneme probability distributions are built from Bible translations that are transcribed phonetically with the Python library Phonemizer. After the transcriptions generated by Phonemizer are validated against two databases (WikiPron and PHOR-in-One), relative entropy values are calculated for languages from three main Indo-European language families: Romance, Slavic, and Germanic. The relative entropy values within each family and the resulting asymmetry are analyzed, distinguishing between vowel and consonant phonemes. In Slavic languages, the behavior of the relative entropy, and therefore its asymmetry, is mainly associated with the presence or absence of vowel phonemes. In Romance languages, the correlation is stronger with vowel phonemes, though it is also relevant for consonants. In Germanic languages, the correlation is seen predominantly in consonant phonemes. Finally, the correlation between the asymmetry of relative entropy and the asymmetry in mutual intelligibility between languages is analyzed, showing that intelligibility is a complex phenomenon that depends on many factors beyond the phonetic. The study suggests that future research should examine phoneme distributions in more depth, with a particular focus on transcriptions of cognate words.
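As a rough illustration of why relative entropy yields an asymmetric distance, the following minimal Python sketch computes it in both directions for two toy phoneme sequences. It is not the thesis's code: the function names, the toy data, and the add-one smoothing used to avoid zero probabilities are all assumptions made for this example, and the phoneme sequences here are written by hand rather than produced by Phonemizer.

```python
import math
from collections import Counter


def phoneme_distribution(phonemes, inventory, alpha=1.0):
    """Estimate a smoothed probability for every phoneme in `inventory`.

    `alpha` is an additive (Laplace) smoothing constant; this is an
    assumption of the sketch, chosen so that no probability is zero.
    """
    counts = Counter(phonemes)
    total = len(phonemes) + alpha * len(inventory)
    return {ph: (counts[ph] + alpha) / total for ph in inventory}


def relative_entropy(p, q):
    """Relative entropy D(p || q) in bits, over a shared set of phonemes."""
    return sum(p[ph] * math.log2(p[ph] / q[ph]) for ph in p if p[ph] > 0)


# Hypothetical toy "transcriptions": one symbol per phoneme.
lang_a = "a a a a a a e e e i p t k".split()
lang_b = "a e i o u y p t k b d g s z".split()

# Both distributions are defined over the union of the two inventories.
inventory = sorted(set(lang_a) | set(lang_b))
p = phoneme_distribution(lang_a, inventory)
q = phoneme_distribution(lang_b, inventory)

d_ab = relative_entropy(p, q)  # language A measured against language B
d_ba = relative_entropy(q, p)  # language B measured against language A
asymmetry = d_ab - d_ba

print(f"D(A||B) = {d_ab:.4f} bits")
print(f"D(B||A) = {d_ba:.4f} bits")
print(f"asymmetry = {asymmetry:.4f} bits")
```

The two directions generally give different values, and their difference is the kind of asymmetry the abstract compares with asymmetries in mutual intelligibility. Restricting `inventory` to vowels or to consonants would give the separate analyses mentioned above, again under the assumptions of this sketch.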

