corpora:historical
Differences
This shows you the differences between two versions of the page.
corpora:historical [2020/06/23 22:20] – [The Penn Corpora] kmiddeke | corpora:historical [2024/06/20 13:53] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== Using Corpora in Historical Linguistics ====== | ||
+ | ===== Available corpora ===== | ||
+ | ==== The Penn Corpora ==== | ||
+ | === Resources === | ||
+ | * {{ : | ||
+ | * {{ : | ||
+ | * {{ : | ||
+ | === About === | ||
+ | The Penn corpora are: | ||
+ | * The **PPCME2** (Kroch, Anthony & Ann Taylor. 2000. //The Penn-Helsinki Parsed Corpus of **Middle English**// | ||
+ | * The **PPCEME** (Kroch, Anthony, Beatrice Santorini, and Lauren Delfs. 2004. //The Penn-Helsinki Parsed Corpus of **Early Modern English**// | ||
+ | * The **PCEEC** (Nurmi, Arja, Ann Taylor, Anthony Warner, Susan Pintzuk and Terttu Nevalainen. 2006. //Parsed Corpus of **Early English Correspondence**// | ||
+ | * The **PPCMBE** (**Modern British English**) | ||
+ | |||
+ | === Notes === | ||
+ | A major advantage of the Penn corpora is that you can use exactly the same queries for all of them, which makes results directly comparable. There are, however, a few things to watch out for: | ||
+ | |||
+ | When working with historical corpora, it is especially useful to query by POS tag, since these corpora are **not lemmatized** and the texts follow **no standard orthography**. So, whenever possible, use POS tags, e.g. to find forms of the auxiliary //do//. If that is not possible, consult the OED to get an idea of the possible spelling variants of the words you are interested in. | ||
+ | |||
+ | If you need to restrict your query to a specific sub-corpus, remember that the command is [yourquery]**:: | ||
+ | |||
+ | === Related corpora === | ||
+ | Other corpora with broadly the same tagset include: | ||
+ | * the **[[https:// | ||
+ | * the **[[https:// | ||
+ | * the **[[https:// | ||
+ | * the **[[http:// | ||
+ | * the **[[http:// | ||
+ | |||
+ | Please contact your lecturer for information on how to access and use these corpora, should you be interested. |
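
To illustrate the note above about preferring POS tags over spellings, here is a minimal Python sketch. It does not use the corpus interface itself: the token list, the helper names `find_by_spelling` and `find_by_tag`, and the tag labels (`DOP`, `DOD`, and so on, modelled loosely on the Penn-style tags for //do//) are all assumptions made for illustration. Check the tagset documentation of the corpus you are working with before relying on any specific tag.

```python
import re

# Hypothetical mini-sample of (word, tag) pairs with non-standard spellings.
# The tag labels are illustrative assumptions, not taken from a real corpus file.
tagged = [
    ("he", "PRO"), ("doth", "DOP"), ("not", "NEG"), ("knowe", "VB"),
    ("they", "PRO"), ("dyd", "DOD"), ("ryde", "VB"),
    ("I", "PRO"), ("doe", "DOP"), ("thinke", "VB"),
    ("she", "PRO"), ("did", "DOD"), ("go", "VB"),
]

def find_by_spelling(tokens, spellings):
    """Return the words whose lower-cased form is one of the given spellings."""
    wanted = {s.lower() for s in spellings}
    return [word for word, _ in tokens if word.lower() in wanted]

def find_by_tag(tokens, tag_pattern):
    """Return the words whose tag fully matches the given regular expression."""
    rx = re.compile(tag_pattern)
    return [word for word, tag in tokens if rx.fullmatch(tag)]

# Searching for modern spellings misses most historical variants ...
modern = find_by_spelling(tagged, ["do", "does", "did"])
# ... whereas searching by tag finds every form, however it is spelled.
by_tag = find_by_tag(tagged, r"DO[DP]?")

print(modern)
print(by_tag)
```

In this toy sample the spelling-based search finds only //did//, while the tag-based search also returns //doth//, //dyd// and //doe//. The same reasoning applies to queries in the corpus interface.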