Posts with tag LiteratureBack to all posts
As promised, some quick thoughts broken off my post on Dunning Log-likelihood. There, I looked at _big_ corpuses–two history classes of about 20,000 books each. But I also wonder how we can use algorithmic comparison on a much smaller scale: particularly, at the level of individual authors or works. English dept. digital humanists tend to rely on small sets of well curated, TEI texts, but even the ugly wilds of machine OCR might be able to offer them some insights. (Sidenote–interesting post by Ted Underwood today on the mechanics of creating a middle group between these two poles).