Open Mic Session
Go here for the link to the Gather space!
This year CHR is hosting an Open Mic session. Presentations at the Open Mic can be on any topic of interest to the CHR community. They don’t have to be about published research: they can also cover ongoing research, something you’ve learned, a particular technique, an interesting software package, a visualization method, or whatever else you can think of.
In short, it doesn’t have to be big. If you are hesitant to talk about X, it’s probably a very good idea to talk about it! Don’t assume that everyone already knows about it (it is very likely that they don’t), that you are not expert enough (you undoubtedly know a lot more about it than most), or that your idea is not interesting (it’s probably better than you think).
Want to join?
Presentations may be up to 4 minutes in length. Slides, videos or other multimedia are encouraged. You can still sign up! If you have an idea for a presentation, please let us know here with a brief description of your idea.
When and where?
The Open Mic Session will take place on Gather on Thursday 18 November, from 19:00 to 20:00. It will be hosted by @mike.kestemont (University of Antwerp) and @l.fonteyn (Leiden University). See the CHR2021 schedule for the link.
Confirmed Speakers
- Mai Zaki (@MaiZaki)
Visualisations of cultural data through the glossaries of translated Arabic literature
- Srishti Sharma (@srish1108) and Federico Pianzola (@fpianz)
We submitted a registered report for a study testing whether the sentiment of a story can predict the emotional reaction of readers (using various sentiment analysis techniques and Goodreads reviews). We’d like suggestions for additional exploratory data analysis we could do, as well as more feedback on our methodology.
- Oleg Sobchuk (@oleg_sobchuk)
I’d like to present a work in progress, done with Artjoms Šela. Prospective title: “Computational thematics: Algorithmic recognition of book genres”. We ask the question: which methods of unsupervised learning are most suited for capturing broad thematic similarities between texts? (Similarly to how certain algorithms may be better or worse for capturing the stylistic signal of authorship.) We assembled a “ground-truth” corpus of 200 books from 4 genres (fantasy, science fiction, crime, and romance) from 1950–2000. The algorithms are then given the task of categorizing the books, with genre labels removed, into clusters as similar as possible to the original genre categories (cluster similarity is measured with the Adjusted Rand Index). The tested algorithms fall into several types, corresponding to steps of analysis: [1] various approaches to pre-processing (removing stopwords, lemmatizing, POS tagging, etc.), [2] approaches to feature extraction (LDA topics, WGCNA modules, most frequent words, doc2vec dimensions), [3] approaches to measuring similarity between feature vectors (Delta, Euclidean distance, Jensen-Shannon divergence, etc.). We find the methods that are best and worst for the job, and speculate about the possible uses of computational thematics.
- Erin E. McCabe (@ErinEMcC)
My open mic presentation would be on a project (in progress) where participants who are processing medical trauma respond to expressive writing entries. Those entries are then used to generate poetry in hopes of encouraging participant engagement. The tests for this project also allow us to explore different language models, as well as how evaluating them from a poetry perspective is a unique sort of Turing test.
- Artjoms Šela (@ash)
I would like to quickly introduce our research project (together with Thomas Müller, Oleg Sobchuk and James Winters) on video game speedrunning and the discovery process in culture. Speedrunning is the competitive practice of completing a game as fast as possible; its popularity has surged in recent years with the advent of streaming platforms and video capture technologies. To minimize time and beat previous records, players go to great lengths, painstakingly optimizing their performance or exploring and exploiting the technical logic of a game. In other words, players constantly innovate: these innovations can be either incremental (optimization) or paradigm-shifting (discovery of glitches, arbitrary code execution). Using speedrunning histories from thousands of games (speedrun.com), we look at how the innovation process in culture is shaped. Our results show that 1) major discoveries happen randomly (they are not conditioned on the number of preceding attempts); 2) these major discoveries open the floodgates for cascades of micro-improvements (micro-innovations tend to concentrate immediately after a macro-innovation happens). These innovation patterns in speedruns broadly support the “punctuated equilibrium” view of the pace of cultural evolution.
- Marion Riggs (@mriggs)
I will discuss the use of network graphs to investigate spatiotemporal changes in language use.
(A more detailed description: I am interested in finding computational means of assessing changes in the use of the word “baroque” in the eighteenth century. I can use a series of cartographic maps to consider changes in where “baroque” was used over time and a series of bubble graphs to consider linguistic changes in the use of “baroque” over time, and I think both of these make good sense. However, I would also like to blend these two types of analyses together using network graphs, and I would like feedback from other researchers on whether this seems feasible.)