Reference Corpus Middle High German (1050 - 1350)


The project "Reference Corpus Middle High German (1050-1350)", abbreviated as ReM, will provide a reference corpus for the Middle High German period as part of a larger corpus of historical German texts (formerly Deutsch Diachron Digital, DDD). The aim of ReM is to create a database that covers the written records of Middle High German (1050-1350) sufficiently, reliably and accurately. Making all data accessible electronically will greatly facilitate research in historical linguistics and medieval studies. In cooperation with the project "Old German Reference Corpus (750-1050)", an annotation standard, DDDTS, was developed. It is based on STTS (Stuttgart-Tübingen Tagset) but has been adapted and extended to meet the special needs of the older periods of the German language.

The corpus is being built and made available in the following stages:

  1. The written records of Early Middle High German from c. 1050 to 1200 / the beginning of the 13th century have been almost completely digitised and linguistically annotated (part of speech, inflection, syntax).
  2. In the second phase of the project, a structured selection of Middle High German texts based on manuscript records will be made available in the same way. In addition, texts that had been only partly digitised will be completed and annotated. The aim is to significantly improve the basis for research on syntactic structures.
  3. Furthermore, the ReM corpus will be complemented with Middle High German texts that have already been linguistically annotated but not yet adapted to the DDDTS standard; these are mainly texts from the corpus of the new Middle High German Grammar (MiGraKo).
  4. Finally, ReM will be converted to the XML standoff format PAULA and made accessible to researchers through the linguistic database ANNIS.
All in all, the corpus of the project "Reference Corpus Middle High German (1050-1350)" will contain c. 2.4 million digitised and c. 2.1 million annotated word forms.
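
The idea behind a standoff format such as the one named in step 4 can be illustrated with a minimal Python sketch. It is not the PAULA schema and not the project's own tooling: the token IDs, the function names build_base_layer and merge_layers, the sample word forms and the lemmas are all hypothetical, and plain STTS tags stand in for DDDTS tags. It only shows the general principle that the word forms are stored once and each annotation layer refers to them by ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One word form in the base layer (hypothetical structure)."""
    token_id: str
    form: str


def build_base_layer(forms):
    """Assign a stable ID to every word form; the text itself is stored only here."""
    return [Token(f"tok_{i}", form) for i, form in enumerate(forms, start=1)]


def merge_layers(tokens, **layers):
    """Join separate annotation layers onto the base layer by token ID.

    Each layer is a dict mapping token IDs to values. A token that a layer
    does not annotate gets None for that layer.
    """
    merged = []
    for tok in tokens:
        row = {"id": tok.token_id, "form": tok.form}
        for name, layer in layers.items():
            row[name] = layer.get(tok.token_id)
        merged.append(row)
    return merged


# Illustrative word forms; tags are plain STTS, used here as placeholders.
tokens = build_base_layer(["ich", "saz", "ûf", "eime", "steine"])
pos_layer = {
    "tok_1": "PPER",
    "tok_2": "VVFIN",
    "tok_3": "APPR",
    "tok_4": "ART",
    "tok_5": "NN",
}
# A second layer kept apart from the first; it is deliberately incomplete.
lemma_layer = {"tok_2": "sitzen", "tok_5": "stein"}

rows = merge_layers(tokens, pos=pos_layer, lemma=lemma_layer)
for row in rows:
    print(row)
```

Because each layer is kept separate, a new layer (for example inflection) can be added or corrected without touching the base text or the other layers.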

Comprehensive information about the project can be found on the new homepage of the "ReM" corpus at https://www.linguistics.rub.de/rem/.

Following the successful completion of the second project phase, the Reference Corpus Middle High German (ReM) is now available online to all users. The corpus search is available at www.linguistics.rub.de/rem/access/index.html.