Comparing ChatGPT to Human Raters and Sentiment Analysis Tools for German Children's Literature

:speech_balloon: Speaker: Simone Rebora, Marina Lehmann, Anne Heumann, Wei Ding and Gerhard Lauer

:classical_building: Affiliations: 1. Department of Foreign Languages and Literatures, University of Verona, Italy; 2. Department of Book and Reading Studies, Johannes Gutenberg University Mainz, Germany

Abstract: In this paper, we apply the ChatGPT large language model (gpt-3.5-turbo) to the 4books dataset, a German-language collection of children's and young adult novels comprising a total of 22,860 sentences annotated for valence by 80 human raters. We verify whether ChatGPT can (a) match the behaviour of human raters and/or (b) outperform state-of-the-art sentiment analysis tools. Results show that, while inter-rater agreement with human readers is low (regardless of whether context is included or excluded), efficiency scores are comparable to those of the most advanced sentiment analysis tools.
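The comparison described above hinges on measuring inter-rater agreement between model-assigned and human-assigned valence labels. As a minimal illustration (not the paper's actual evaluation pipeline, whose metric and data are not specified here), the sketch below computes Cohen's kappa for two raters over hypothetical sentence-level valence labels:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two raters
    labelling the same items."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independent labelling with each
    # rater's marginal label frequencies
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical valence labels: -1 negative, 0 neutral, 1 positive
human = [1, 0, -1, 1, 0, -1, 1, 1]
model = [1, 0, -1, 0, 0, -1, 1, -1]
print(round(cohens_kappa(human, model), 3))  # prints 0.636
```

Kappa near 0 indicates chance-level agreement, which is how a "low inter-rater agreement" finding like the one reported in the abstract would typically surface.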

:newspaper: Link to paper
