
Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank

Permanent link
https://hdl.handle.net/10037/22366
File
article.pdf (4.741 MB), published version (PDF)
Date
2015
Type
Journal article
Peer reviewed

Author
Eckhoff, Hanne Martine; Berdicevskis, Aleksandrs
Abstract

The Tromsø Old Russian and OCS Treebank (TOROT, nestor.uit.no) is, along with its parent treebank, the PROIEL corpus (foni.uio.no), the only existing treebank of Old Church Slavonic (OCS), Old East Slavic and Middle Russian texts. There are other tagged resources, such as the Old Russian subcorpus of the Russian National Corpus and the Manuskript corpus, but none of them, to our knowledge, currently provides syntactic annotation.

The TOROT presently contains approximately 160,000 word tokens of fully annotated OCS (Codex Marianus and Codex Suprasliensis), 85,000 word tokens of fully annotated Kiev-era Old East Slavic, and 60,000 word tokens of fully annotated 15th–17th-century Middle Russian. In addition, it contains the Codex Zographensis with automatic and partially hand-corrected morphological annotation and lemmatisation (sections of the Gospels missing in the Codex Marianus also have full syntactic annotation), and the PROIEL version of the Greek Gospels, with which the Codex Marianus and the Codex Zographensis are both aligned at token level (automatically, then hand-corrected).

Description
Source at http://e-scripta.ilit.bas.bg/archives/year-2015/issue-14-15. Journal home page at http://e-scripta.ilit.bas.bg/.
Publisher
Institute for Literature, Bulgarian Academy of Sciences
Citation
Eckhoff HM, Berdicevskis A. Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank. Scripta & e-Scripta. 2015;14-15:9-25
Collections
  • Artikler, rapporter og annet (språk og kultur) [1477]
Copyright 2015 The Author(s)
