An Exploration of Lexical Patterns in APS Journals


Please note that this sociomeeting will be broadcast live on Zoom and also in the Seminar Room.



Scientific disciplines are often characterized by their unique terminologies. This study performs a large-scale statistical linguistics analysis on the corpus of eight American Physical Society (APS) journals to quantitatively compare different fields of physics. We measure the linguistic similarity between these journals using rank-based metrics, examine the dynamics of their vocabularies through rank diversity, and identify the most characteristic terms for each subfield.
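To make the two rank-based quantities mentioned above more concrete, here is a minimal sketch in Python. It assumes the common definition of rank diversity (the number of distinct words that occupy a given rank across time slices, divided by the number of slices) and uses a plain top-n overlap as a stand-in for a rank-based similarity. The function names and the toy token lists are hypothetical, and the metrics used in the study itself may differ.

```python
from collections import Counter

def rank_words(tokens):
    """Return words ordered from most to least frequent (ties broken alphabetically)."""
    counts = Counter(tokens)
    return sorted(counts, key=lambda w: (-counts[w], w))

def rank_diversity(ranked_lists, k):
    """Distinct words seen at rank k (0-based) across slices, divided by the number of slices."""
    words_at_k = {ranked[k] for ranked in ranked_lists if len(ranked) > k}
    return len(words_at_k) / len(ranked_lists)

def top_overlap(rank_a, rank_b, n):
    """Fraction of the top-n words shared by two ranked lists (a simple similarity proxy)."""
    return len(set(rank_a[:n]) & set(rank_b[:n])) / n

# Toy example: three hypothetical time slices of one journal's text.
slices = [
    "the field the spin the field".split(),
    "the spin the spin field the".split(),
    "the quark the field quark the".split(),
]
ranks = [rank_words(s) for s in slices]

d0 = rank_diversity(ranks, 0)   # the top rank is always "the", so diversity is low
d1 = rank_diversity(ranks, 1)   # the second rank changes in every slice
sim = top_overlap(ranks[0], ranks[1], 2)
```

In this toy data the most frequent word never changes while the second rank is occupied by a different word in each slice, which is the kind of contrast rank diversity is meant to capture.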



Our results indicate a high degree of lexical similarity among all journals, suggesting a shared linguistic core for the physics discipline that is distinct from other text corpora, such as books or social media. Despite this overlap, we find that each journal possesses a unique "linguistic fingerprint" of specialized words. Building on these fingerprints, we developed a statistical classifier that identifies an article's source journal with high accuracy from the 1000 most frequent words alone. Furthermore, an analysis of physicist mentions reveals a significant difference between a scientist's fame within a specialized field and their broader cultural recognition. These findings demonstrate that statistical linguistics provides a framework for mapping the structure and evolution of scientific fields, revealing their common foundations and distinct specializations.
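As an illustration of how a frequency-based journal classifier could work, the sketch below implements a multinomial naive Bayes model with add-one smoothing, restricted to the most frequent words of the corpus. This is an assumption made for illustration only: the abstract does not say which classifier was used, and the function names, journal labels, and toy documents are all hypothetical.

```python
import math
from collections import Counter

def build_vocabulary(docs_by_journal, top_n):
    """Pick the top_n most frequent words over all journals (ties broken alphabetically)."""
    total = Counter()
    for docs in docs_by_journal.values():
        for doc in docs:
            total.update(doc)
    ordered = sorted(total.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ordered[:top_n]]

def train(docs_by_journal, vocab):
    """Estimate smoothed log-probabilities of each vocabulary word per journal."""
    vocab_set = set(vocab)
    model = {}
    for journal, docs in docs_by_journal.items():
        counts = Counter(w for doc in docs for w in doc if w in vocab_set)
        total = sum(counts.values())
        model[journal] = {
            w: math.log((counts[w] + 1) / (total + len(vocab))) for w in vocab
        }
    return model

def classify(model, tokens):
    """Return the journal with the highest log-likelihood (uniform prior assumed)."""
    def score(journal):
        logp = model[journal]
        return sum(logp[w] for w in tokens if w in logp)
    return max(sorted(model), key=score)

# Toy corpus with hypothetical journal labels; the study uses the top 1000 words.
corpus = {
    "Journal A": [["spin", "lattice", "spin", "the"], ["lattice", "the", "phonon"]],
    "Journal B": [["quark", "the", "gluon", "quark"], ["gluon", "the", "quark"]],
}
vocab = build_vocabulary(corpus, top_n=5)
model = train(corpus, vocab)
guess_b = classify(model, ["quark", "gluon", "the"])
guess_a = classify(model, ["spin", "lattice"])
```

Words outside the chosen vocabulary are ignored at prediction time, so the model relies only on the shared high-frequency words, in the spirit of the top-1000-word restriction described above.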



Contact details:

Pablo Rosillo-Rodes


