Hey, can anyone recommend readings on the basics of validation? (Esp. within data-intensive but insight-driven research workflows.)
Basics on how, when, and why to check your search/model/algorithm results, building gold standards, manual annotation tips, terminology, synthetic data, etc.?
Looking for basic papers for students in an introductory data+dh course to read, but also to improve my own skills on this. It is one of the topics that is perhaps widely ignored or done unsystematically in DH papers. On the other hand, I'm not sure whether simple NLP foundations give the most relevant tips here. (E.g. NLP usually has ground truth to rely on, whereas complex insight-driven workflows may need checking at various points in the cycle.) But maybe all the tips are the same…
Any recommendations welcome, thanks!