Robert K. Paulsen
I would like to present two closely related attempts at making Old Norse texts digitally available and annotating them with different kinds of linguistic meta-information: the Medieval Nordic Text Archive (Menota) and the Menotec project.
The specially developed Menota XML standard allows for three-level transcription, lemmatization and morphological analysis of each word, and the database now contains about 1.1 million tokens from different time periods, dialects and genres of the Old Norse tradition.
Within the Menotec infrastructure project, on the other hand, five major Old Norwegian manuscripts were transcribed, lemmatized and annotated both morphologically and syntactically, using a PROIEL-like dependency analysis.
I will discuss the special issues arising from dealing with non-regulated historical languages, as well as the guidelines, standards and norms necessary for dealing with these issues and how they are developed. Finally, I want to give some perspectives: what further work is being done, and what is possible or necessary?