dc.contributor.advisor | Janda, Laura | |
dc.contributor.author | Reynolds, Robert | |
dc.date.accessioned | 2016-09-14T11:07:34Z | |
dc.date.available | 2016-09-14T11:07:34Z | |
dc.date.issued | 2016-08-15 | |
dc.description.abstract | In this dissertation, I investigate practical and theoretical issues surrounding the use of natural language processing technology in the context of Russian Computer-Assisted Language-Learning, with particular emphasis on morphological analysis.
In Part I, I present linguistic and practical issues surrounding the development and evaluation of two foundational technologies: a two-level morphological analyzer, and a constraint grammar to contextually disambiguate homonymy in the analyzer’s output. The analyzer was specially designed for L2 learner applications, with stress annotation and rule-based morphosyntactic disambiguation, and it is competitive with state-of-the-art Russian analyzers. The constraint grammar is designed to have high recall, allowing an L2-learner application to base decisions on all possible readings, and not just the single most likely reading. The constraint grammar resolves 44% of the ambiguity output by the morphological analyzer. A voting setup combining the constraint grammar with a trigram hidden Markov model tagger demonstrates how a high-recall grammar can boost performance of probabilistic taggers, which are better suited to capturing highly idiosyncratic facts about collocational tendencies.
In Part II, I present linguistic, theoretical, and practical issues surrounding the application of the morphological analyzer and constraint grammar to three real-life computer-assisted language-learning tasks: automatic stress annotation, automatic grammar exercise generation from authentic texts, and automatic evaluation of text readability. The automatic stress placement task is vital for Russian language-learning applications. The morphological analyzer and constraint grammar yield state-of-the-art results, resolving 42% of stress ambiguity in a corpus of running text.
In order to demonstrate the value of a high-recall constraint grammar, I developed Russian grammar activities for the VIEW platform, a system for providing automatic Visual Input Enhancement of Web documents. This system allows teachers and learners to automatically generate grammatical highlighting, identification activities, multiple-choice activities, and fill-in-the-blank activities, enabling them to study grammar using texts that are interesting or relevant to them. I show that the morphological analysis described above is instrumental not only for generating exercises, but also for providing adaptive feedback, a feature which typically requires encoding specific learner language features.
A final test case for morphological analysis in Russian language-learning is automatic readability assessment, which can help learners and teachers find texts at appropriate reading levels. I show that features based on morphology are among the most informative for L2 readability assessment. | en_US |
dc.description.doctoraltype | ph.d. | en_US |
dc.description.popularabstract | Traditional methods for developing software for computer-assisted
language learning are slow and expensive, and the resulting programs quickly become obsolete.
In this dissertation, I demonstrate new computer technology that, starting from
texts on the internet, automatically generates exercises for Russian language learning.
This technology can also help users find texts at appropriate
levels, which gives users the opportunity to discover and read texts that are
interesting to them without the help of teachers. This technology also gives teachers
the ability to automate the generation of grammar exercises. The dissertation
covers the following components: automatic grammatical analysis of Russian words,
removal of grammatical ambiguity, automatic generation of grammar
exercises, and automatic assessment of readability. In particular, it emphasizes the importance
of grammatical analysis in automating Russian computer-assisted
language learning. | en_US |
dc.description.sponsorship | The Faculty of Humanities, Social Sciences and Education funded my stipendiat position and awarded a grant to pay for programmers and to purchase readability corpora. | en_US |
dc.identifier.uri | https://hdl.handle.net/10037/9685 | |
dc.language.iso | eng | en_US |
dc.publisher | UiT Norges arktiske universitet | en_US |
dc.publisher | UiT The Arctic University of Norway | en_US |
dc.rights.accessRights | openAccess | |
dc.rights.holder | Copyright 2016 The Author(s) | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-sa/3.0 | en_US |
dc.rights | Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) | en_US |
dc.subject | VDP::Humanities: 000::Linguistics: 010::Russian language: 028 | en_US |
dc.subject | VDP::Humaniora: 000::Språkvitenskapelige fag: 010::Russisk språk: 028 | en_US |
dc.subject | VDP::Humanities: 000::Linguistics: 010::Applied linguistics: 012 | en_US |
dc.subject | VDP::Humaniora: 000::Språkvitenskapelige fag: 010::Anvendt språkvitenskap: 012 | en_US |
dc.subject | VDP::Technology: 500::Information and communication technology: 550::Computer technology: 551 | en_US |
dc.subject | VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550::Datateknologi: 551 | en_US |
dc.subject | Natural Language Processing | en_US |
dc.subject | Språkteknologi | en_US |
dc.subject | Computer-Assisted Language Learning | en_US |
dc.subject | Datamaskin-Assistert Språklæring | en_US |
dc.subject | Visual Input Enhancement | en_US |
dc.subject | Readability | en_US |
dc.subject | Lesbarhet | en_US |
dc.title | Russian natural language processing for computer-assisted language learning: Capturing the benefits of deep morphological analysis in real-life applications | en_US |
dc.type | Doctoral thesis | en_US |
dc.type | Doktorgradsavhandling | en_US |