From sunblock to softblock: Analyzing the correlates of neology in published writing and on social media
Authors
Maria Ryskina, Matthew R. Gormley, Kyle Mahowald, David R. Mortensen, Taylor Berg-Kirkpatrick, Vivek Kulkarni
Abstract
Living languages are shaped by a host of conflicting internal and external evolutionary pressures. While some of these pressures are universal across languages and cultures, others differ depending on the social and conversational context: language use in newspapers is subject to very different constraints than language use on social media. Prior distributional semantic work on English word emergence (neology) identified two factors correlated with creation of new words by analyzing a corpus consisting primarily of historical published texts (Ryskina et al., 2020, arXiv:2001.07740). Extending this methodology to contextual embeddings in addition to static ones and applying it to a new corpus of Twitter posts, we show that the same findings hold for both domains, though the topic popularity growth factor may contribute less to neology on Twitter than in published writing. We hypothesize that this difference can be explained by the two domains favouring different neologism formation mechanisms.
Paper Information
2602.13123v1