Introducing Functional Diversity: A Novel Approach to Lexical Diversity in (Historical) Corpora

:speech_balloon: Speakers: Folgert Karsdorp (1), Enrique Manjavacas (2), and Lauren Fonteyn (2)

:classical_building: Affiliations: (1) KNAW Meertens Institute, Amsterdam, the Netherlands; (2) Leiden University, Leiden, the Netherlands

Abstract: The question of how to reliably estimate the lexical diversity of a particular text (or text collection) has often been asked by linguists and literary scholars alike. This short paper introduces a way of operationalizing functional diversity measurements by means of token-based embeddings, and argues that functional diversity is not only a practically advantageous but also a theoretically relevant addition to the Computational Humanities Research toolkit. By means of an experiment on the historical ARCHER corpus, we show that lexical diversity at the level of functional groups is less sensitive to orthographic variation and provides insight into an important and often disregarded dimension of vocabulary diversity in textual data.
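To make the contrast in the abstract concrete, here is a minimal toy sketch, not the authors' implementation. It assumes that each token has already been assigned to a functional group; in the paper that grouping is derived from token-based embeddings, whereas here the `functional_group` mapping, the example tokens, and the `hill_diversity` helper are all hypothetical stand-ins. The diversity index used (the exponential of Shannon entropy, i.e. the "effective number" of categories) is one common choice and not necessarily the one used in the paper.

```python
import math
from collections import Counter


def hill_diversity(labels):
    """Effective number of categories: exp of the Shannon entropy of the label counts."""
    counts = Counter(labels)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.exp(entropy)


# Hypothetical toy data: historical spelling variants inflate the number of word types.
tokens = ["vertue", "virtue", "vertu", "goodness", "king", "kynge"]

# Hypothetical group labels, standing in for clusters of token-based embeddings.
functional_group = {
    "vertue": "VIRTUE", "virtue": "VIRTUE", "vertu": "VIRTUE", "goodness": "VIRTUE",
    "king": "RULER", "kynge": "RULER",
}
groups = [functional_group[t] for t in tokens]

type_diversity = hill_diversity(tokens)   # diversity over surface word forms
group_diversity = hill_diversity(groups)  # diversity over functional groups

print(f"type-level diversity:  {type_diversity:.2f}")
print(f"group-level diversity: {group_diversity:.2f}")
```

In this toy example the type-level score counts every spelling variant as a separate word, while the group-level score does not, which is the sense in which group-level diversity is less sensitive to orthographic variation.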

:newspaper: Link to paper

Check out the video here:

Have you thought about the link between functional diversity and cognitive models of categorization? In other words, how does functional diversity evolve over time, and might that shed light on how we deal with categorization in a cultural context?

Chater et al., "Bayesian Models of Cognition", has a short section on categorization, with references that might be of use here.