Russian natural language processing for computer-assisted language learning: Capturing the benefits of deep morphological analysis in real-life applications

Reynolds, Robert

dc.contributor.advisor	Janda, Laura
dc.contributor.author	Reynolds, Robert
dc.date.accessioned	2016-09-14T11:07:34Z
dc.date.available	2016-09-14T11:07:34Z
dc.date.issued	2016-08-15
dc.description.abstract	In this dissertation, I investigate practical and theoretical issues surrounding the use of natural language processing technology in the context of Russian Computer-Assisted Language-Learning, with particular emphasis on morphological analysis. In Part I, I present linguistic and practical issues surrounding the development and evaluation of two foundational technologies: a two-level morphological analyzer, and a constraint grammar to contextually disambiguate homonymy in the analyzer’s output. The analyzer was specially designed for L2 learner applications, with stress annotation and rule-based morphosyntactic disambiguation, and it is competitive with state-of-the-art Russian analyzers. The constraint grammar is designed to have high recall, allowing an L2-learner application to base decisions on all possible readings, and not just the single most likely reading. The constraint grammar resolves 44% of the ambiguity output by the morphological analyzer. A voting setup combining the constraint grammar with a trigram hidden Markov model tagger demonstrates how a high-recall grammar can boost performance of probabilistic taggers, which are better suited to capturing highly idiosyncratic facts about collocational tendencies. In Part II, I present linguistic, theoretical, and practical issues surrounding the application of the morphological analyzer and constraint grammar to three real-life computer-assisted language-learning tasks: automatic stress annotation, automatic grammar exercise generation from authentic texts, and automatic evaluation of text readability. The automatic stress placement task is vital for Russian language-learning applications. The morphological analyzer and constraint grammar yield state-of-the-art results, resolving 42% of stress ambiguity in a corpus of running text.
In order to demonstrate the value of a high-recall constraint grammar, I developed Russian grammar activities for the VIEW platform, a system for providing automatic Visual Input Enhancement of Web documents. This system allows teachers and learners to automatically generate grammatical highlighting, identification activities, multiple-choice activities, and fill-in-the-blank activities, enabling them to study grammar using texts that are interesting or relevant to them. I show that the morphological analysis described above is instrumental not only for generating exercises, but also for providing adaptive feedback, a feature which typically requires encoding specific learner language features. A final test-case for morphological analysis in Russian language-learning is automatic readability assessment, which can help learners and teachers find texts at appropriate reading levels. I show that features based on morphology are among the most informative for L2 readability assessment.	en_US
dc.description.doctoraltype	ph.d.	en_US
dc.description.popularabstract	Traditional methods for developing software for computer-assisted language learning are slow and expensive, and the resulting programs quickly become outdated. In this dissertation, I demonstrate new computer technology that automatically generates exercises for Russian language learning from texts on the internet. This technology can also help users find texts at appropriate levels, which lets them discover and read texts that interest them without the help of teachers. It further allows teachers to automate the generation of grammar exercises. The dissertation covers the following components: automatic grammatical analysis of Russian words, grammatical disambiguation, automatic generation of grammar exercises, and automatic assessment of readability. In particular, it emphasizes the importance of grammatical analysis in automating Russian computer-assisted language learning.	en_US
dc.description.sponsorship	The Faculty of Humanities, Social Sciences and Education funded my stipendiat position and awarded a grant to pay for programmers and to purchase readability corpora.	en_US
dc.identifier.uri	https://hdl.handle.net/10037/9685
dc.language.iso	eng	en_US
dc.publisher	UiT Norges arktiske universitet	en_US
dc.publisher	UiT The Arctic University of Norway	en_US
dc.rights.accessRights	openAccess
dc.rights.holder	Copyright 2016 The Author(s)
dc.rights.uri	https://creativecommons.org/licenses/by-nc-sa/3.0	en_US
dc.rights	Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)	en_US
dc.subject	VDP::Humanities: 000::Linguistics: 010::Russian language: 028	en_US
dc.subject	VDP::Humaniora: 000::Språkvitenskapelige fag: 010::Russisk språk: 028	en_US
dc.subject	VDP::Humanities: 000::Linguistics: 010::Applied linguistics: 012	en_US
dc.subject	VDP::Humaniora: 000::Språkvitenskapelige fag: 010::Anvendt språkvitenskap: 012	en_US
dc.subject	VDP::Technology: 500::Information and communication technology: 550::Computer technology: 551	en_US
dc.subject	VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550::Datateknologi: 551	en_US
dc.subject	Natural Language Processing	en_US
dc.subject	Språkteknologi	en_US
dc.subject	Computer-Assisted Language Learning	en_US
dc.subject	Datamaskin-Assistert Språklæring	en_US
dc.subject	Visual Input Enhancement	en_US
dc.subject	Readability	en_US
dc.subject	Lesbarhet	en_US
dc.title	Russian natural language processing for computer-assisted language learning: Capturing the benefits of deep morphological analysis in real-life applications	en_US
dc.type	Doctoral thesis	en_US
dc.type	Doktorgradsavhandling	en_US

Associated file(s)

Name: license.txt
Size: 1.402Kb
Format: Text file

Name: thesis.pdf
Size: 6.089Mb
Format: PDF
Description: Thesis

This item appears in the following collection(s)

Doktorgradsavhandlinger (HSL-fak) [376]

Unless otherwise stated, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
