Russian natural language processing for computer-assisted language learning: Capturing the benefits of deep morphological analysis in real-life applications
In this dissertation, I investigate practical and theoretical issues surrounding the use of natural language processing technology in the context of Russian computer-assisted language learning, with particular emphasis on morphological analysis. In Part I, I present linguistic and practical issues surrounding the development and evaluation of two foundational technologies: a two-level morphological analyzer, and a constraint grammar to contextually disambiguate homonymy in the analyzer's output. The analyzer was specially designed for L2 learner applications, with stress annotation and rule-based morphosyntactic disambiguation, and it is competitive with state-of-the-art Russian analyzers. The constraint grammar is designed to have high recall, allowing an L2-learner application to base decisions on all possible readings, and not just the single most likely reading. The constraint grammar resolves 44% of the ambiguity output by the morphological analyzer. A voting setup combining the constraint grammar with a trigram hidden Markov model tagger demonstrates how a high-recall grammar can boost the performance of probabilistic taggers, which are better suited to capturing highly idiosyncratic facts about collocational tendencies. In Part II, I present linguistic, theoretical, and practical issues surrounding the application of the morphological analyzer and constraint grammar to three real-life computer-assisted language-learning tasks: automatic stress annotation, automatic grammar exercise generation from authentic texts, and automatic evaluation of text readability. The automatic stress placement task is vital for Russian language-learning applications. The morphological analyzer and constraint grammar yield state-of-the-art results, resolving 42% of stress ambiguity in a corpus of running text.
To demonstrate the value of a high-recall constraint grammar, I developed Russian grammar activities for the VIEW platform, a system for providing automatic Visual Input Enhancement of Web documents. This system allows teachers and learners to automatically generate grammatical highlighting, identification activities, multiple-choice activities, and fill-in-the-blank activities, enabling them to study grammar using texts that are interesting or relevant to them. I show that the morphological analysis described above is instrumental not only for generating exercises, but also for providing adaptive feedback, a feature which typically requires encoding specific learner language features. A final test case for morphological analysis in Russian language learning is automatic readability assessment, which can help learners and teachers find texts at appropriate reading levels. I show that features based on morphology are among the most informative for L2 readability assessment.
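As an illustration of the voting idea mentioned in the abstract, the following minimal Python sketch shows one way a high-recall grammar and a probabilistic tagger could be combined. It is an assumption made for illustration, not the dissertation's actual implementation: the function name `vote`, the tag strings, and the fallback rule (keep all grammar readings when the tagger's guess has been ruled out) are all hypothetical.

```python
from typing import List, Set


def vote(cg_readings: List[Set[str]], hmm_tags: List[str]) -> List[Set[str]]:
    """Combine grammar output with tagger guesses, token by token.

    cg_readings: for each token, the readings the constraint grammar left in place.
    hmm_tags:    for each token, the single tag proposed by a probabilistic tagger.

    Hypothetical rule: accept the tagger's guess only if the grammar still
    allows it; otherwise keep every reading the grammar allows.
    """
    if len(cg_readings) != len(hmm_tags):
        raise ValueError("both inputs must cover the same tokens")

    result = []
    for allowed, guess in zip(cg_readings, hmm_tags):
        if guess in allowed:
            result.append({guess})       # tagger and grammar agree
        else:
            result.append(set(allowed))  # grammar overrides the tagger
    return result


# Hypothetical tags for a three-token sentence.
cg = [{"Pron+Nom"}, {"V+Pst+Pl", "N+Gen+Sg"}, {"N+Nom+Sg"}]
hmm = ["Pron+Nom", "V+Pst+Pl", "V+Inf"]
combined = vote(cg, hmm)
```

In this sketch the tagger resolves the second token's remaining ambiguity, while its guess for the third token is discarded because the grammar had already excluded that reading.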
Publisher: UiT The Arctic University of Norway