When Dialects Collide: How Socioeconomic Mixing Affects Language Use

📅 2023-07-19
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
This study examines how socioeconomic mixing weakens the association between income and nonstandard English usage. Using geotagged tweets from about seven thousand administrative areas of England and Wales, high-resolution income maps, and transferable computational methods for mapping deviations from standard English, the authors find that the more socioeconomic classes mix, the less interdependent income and the frequency of departures from standard grammar become. The pattern is consistent across eight metropolitan areas. The authors also propose an agent-based model of linguistic variety adoption that sheds light on the mechanisms behind these observations. The work addresses a question that has remained relatively unexplored from a quantitative perspective: how class mixing shapes the known correlation between socioeconomic background and use of standard language.
📝 Abstract
The socioeconomic background of people and how they use standard forms of language are not independent, as demonstrated in various sociolinguistic studies. However, the extent to which these correlations may be influenced by the mixing of people from different socioeconomic classes remains relatively unexplored from a quantitative perspective. In this work we leverage geotagged tweets and transferable computational methods to map deviations from standard English on a large scale, in seven thousand administrative areas of England and Wales. We combine these data with high-resolution income maps to assign a proxy socioeconomic indicator to home-located users. Strikingly, across eight metropolitan areas we find a consistent pattern suggesting that the more different socioeconomic classes mix, the less interdependent the frequency of their departures from standard grammar and their income become. Further, we propose an agent-based model of linguistic variety adoption that sheds light on the mechanisms that produce the observations seen in the data.
Problem

Research questions and friction points this paper is trying to address.

How socioeconomic mixing influences patterns of language use
Quantifying the impact of class mixing on deviations from standard grammar
Modeling the mechanisms behind variations in the interdependence of income and language use
Innovation

Methods, ideas, or system contributions that make the work stand out.

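Geotagged tweets used to map deviations from standard English at large scale
High-resolution income maps used to assign a proxy socioeconomic indicator to home-located users
Agent-based model of linguistic variety adoption that sheds light on the underlying mechanisms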
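
The agent-based idea listed above can be illustrated with a toy simulation. This is a minimal sketch and not the model from the paper, whose rules are not described on this page. Everything in it is an assumption made for illustration: the function name `simulate_gap`, the `mixing` and `rate` parameters, the two equal-sized classes, the starting usage rates of 0.8 and 0.2, and the update rule in which an agent shifts its usage toward that of a randomly chosen interlocutor.

```python
import random
from statistics import mean


def simulate_gap(mixing, n_agents=200, steps=20000, rate=0.1, seed=0):
    """Toy model (hypothetical, not the paper's): return the difference in
    nonstandard-form usage between two classes after `steps` interactions.

    mixing: probability that an agent's interlocutor is drawn from the whole
            population rather than from the agent's own class.
    """
    rng = random.Random(seed)
    half = n_agents // 2
    # Class 0 = lower income, class 1 = higher income (assumed labels).
    classes = [0] * half + [1] * (n_agents - half)
    # Assumed starting propensities to use nonstandard forms.
    usage = [0.8 if c == 0 else 0.2 for c in classes]
    members = {0: list(range(half)), 1: list(range(half, n_agents))}
    everyone = list(range(n_agents))

    for _ in range(steps):
        i = rng.randrange(n_agents)
        # With probability `mixing`, talk to anyone; otherwise stay in class.
        pool = everyone if rng.random() < mixing else members[classes[i]]
        j = rng.choice(pool)
        # Agent i moves a fraction `rate` of the way toward agent j's usage.
        usage[i] += rate * (usage[j] - usage[i])

    low = mean(usage[k] for k in members[0])
    high = mean(usage[k] for k in members[1])
    return low - high


if __name__ == "__main__":
    for m in (0.0, 0.5, 1.0):
        print(f"mixing={m:.1f}  usage gap between classes={simulate_gap(m):.3f}")
```

Under these assumed rules, a fully segregated population (`mixing=0.0`) keeps its initial gap of 0.6, while a fully mixed one (`mixing=1.0`) converges until class membership says almost nothing about usage. That is only a qualitative analogue of the weakening income-language dependence reported in the abstract.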
Thomas Louf
Institute for Cross-Disciplinary Physics and Complex Systems IFISC (UIB-CSIC), Palma de Mallorca, Spain
José J. Ramasco
Institute for Cross-Disciplinary Physics and Complex Systems IFISC (UIB-CSIC), Palma de Mallorca, Spain
David Sánchez
Institute for Cross-Disciplinary Physics and Complex Systems IFISC (UIB-CSIC), Palma de Mallorca, Spain
Márton Karsai
Central European University
Complex Networks, Human Dynamics