dc.contributor.author: Berdicevskis, Aleksandrs
dc.contributor.author: Eckhoff, Hanne Martine
dc.date.accessioned: 2015-06-11T10:35:03Z
dc.date.available: 2015-06-11T10:35:03Z
dc.date.issued: 2015
dc.description.abstract: We describe automatic conversion of the SynTagRus dependency treebank of Russian to the PROIEL format (with the ultimate purpose of obtaining a single-format diachronic treebank spanning more than a thousand years), focusing on analysis of shared arguments in verbal coordinations. Whether arguments are shared or private is not marked in the SynTagRus native format, but the PROIEL format indicates sharing by means of secondary dependencies. In order to recover missing information and insert secondary dependencies into the converted SynTagRus, we create a simple guessing algorithm based on four probabilistic features: how likely a given argument type is to be shared; how likely an argument in a given position is to be shared; how likely a given verb is to have a given argument; how likely a given verb is to have a given argument frame. Boosted with a few deterministic rules and trained on a small manually annotated sample (346 sentences), the guesser very successfully inserts shared subjects (F-score 0.97), which results in excellent overall performance (F-score 0.92). Non-subject arguments are shared much more rarely, and for them the results are poorer (0.31 for objects; 0.22 for obliques). We show, however, that there are strong reasons to believe that performance can be increased if a larger training sample is used and the guesser gets to see enough positive examples. Apart from describing a useful practical solution, the paper also provides quantitative data about and offers non-trivial insights into Russian verbal coordination. [en_US]
dc.identifier.citation: Kompiuternaia lingvistika i intellektual'nye tekhnologii (2015) nr. 14 (21) s. 33-43 [en_US]
dc.identifier.cristinID: FRIDAID 1245637
dc.identifier.issn: 2221-7932
dc.identifier.uri: https://hdl.handle.net/10037/7733
dc.identifier.urn: URN:NBN:no-uit_munin_7323
dc.language.iso: eng [en_US]
dc.rights.accessRights: openAccess
dc.subject: treebank [en_US]
dc.subject: coordination [en_US]
dc.subject: dependency syntax [en_US]
dc.subject: shared dependents [en_US]
dc.subject: shared modifiers [en_US]
dc.subject: shared arguments [en_US]
dc.subject: Russian [en_US]
dc.subject: VDP::Humaniora: 000::Språkvitenskapelige fag: 010 [en_US]
dc.subject: VDP::Humanities: 000::Linguistics: 010 [en_US]
dc.title: Automatic Identification of Shared Arguments in Verbal Coordinations [en_US]
dc.type: Journal article [en_US]
dc.type: Tidsskriftartikkel [en_US]
dc.type: Peer reviewed [en_US]
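
The abstract names four probabilistic features but does not say how the guesser combines them. The following is only an illustrative sketch of what such a guesser could look like, not the authors' implementation: the class and function names, the label strings, the averaging rule, the 0.5 threshold, and all probability values are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A dependent of one conjunct that might also belong to another verb."""
    arg_type: str   # hypothetical labels: "sub", "obj", "obl"
    position: str   # hypothetical labels, e.g. "before_first_verb"
    verb: str       # lemma of the verb that would receive the secondary dependency
    frame: tuple    # argument frame that verb would have if the argument is shared


@dataclass
class Model:
    """Probability tables, assumed to be estimated from an annotated sample."""
    p_type_shared: dict      # arg_type -> P(argument of this type is shared)
    p_position_shared: dict  # position -> P(argument in this position is shared)
    p_verb_has_arg: dict     # (verb, arg_type) -> P(verb has this argument)
    p_verb_has_frame: dict   # (verb, frame) -> P(verb has this argument frame)
    default: float = 0.5     # fallback for unseen events (an assumption)


def score(model: Model, c: Candidate) -> float:
    """Average the four feature probabilities (the averaging is an assumption)."""
    feats = [
        model.p_type_shared.get(c.arg_type, model.default),
        model.p_position_shared.get(c.position, model.default),
        model.p_verb_has_arg.get((c.verb, c.arg_type), model.default),
        model.p_verb_has_frame.get((c.verb, c.frame), model.default),
    ]
    return sum(feats) / len(feats)


def is_shared(model: Model, c: Candidate, threshold: float = 0.5) -> bool:
    """Guess 'shared' when the combined score reaches the (assumed) threshold."""
    return score(model, c) >= threshold


# Toy probability tables; every number here is invented for illustration.
model = Model(
    p_type_shared={"sub": 0.9, "obj": 0.1},
    p_position_shared={"before_first_verb": 0.8, "after_first_verb": 0.2},
    p_verb_has_arg={("chitat", "sub"): 1.0, ("chitat", "obj"): 0.7},
    p_verb_has_frame={("chitat", ("sub",)): 0.3, ("chitat", ("sub", "obj")): 0.6},
)

subject = Candidate("sub", "before_first_verb", "chitat", ("sub",))
direct_object = Candidate("obj", "after_first_verb", "chitat", ("sub", "obj"))
```

With these toy tables, `is_shared(model, subject)` comes out true and `is_shared(model, direct_object)` comes out false, which loosely mirrors the abstract's observation that subjects are shared far more often than other arguments. The deterministic rules mentioned in the abstract are not modelled here.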

