Adjectivization in Russian: Analyzing participles by means of lexical frequency and constraint grammar
This dissertation explores the factors that restrict and facilitate adjectivization in Russian, an affixless part-of-speech change leading to ambiguity between participles and adjectives. I develop a theoretical framework based on major approaches to adjectivization, and assess the effect of the factors on ambiguity in the empirical data. I build a linguistic model using the Constraint Grammar formalism. The model utilizes the factors of adjectivization and corpus frequencies as formal constraints for differentiating between participles and adjectives in a disambiguation task. The main question that is explored in this dissertation is which linguistic factors allow for the differentiation between adjectivized and unambiguous participles. Another question concerns which factors, syntactic or morphological, predict ambiguity in the corpus data and resolve it in the disambiguation model. In the theoretical framework, the syntactic context signals whether a participle is adjectivized, whereas internal morphosemantic properties (that is, tense, voice, and lexical meaning) cause or prevent adjectivization. The exploratory analysis of these factors in the corpus data reveals diverse results. The syntactic factor, the adverb of measure and degree očenʹ ‘very’, which is normally used with adjectives, also combines with participles, and is strongly associated with semantic classes of their base verbs. Nonetheless, the use of očenʹ with a participle only indicates ambiguity when other syntactic factors of adjectivization are in place. The lexical frequency (including the ranks of base verbs and the ratios of participles to other verbal forms) and several morphological types of participles strongly predict ambiguity. Furthermore, past passive and transitive perfective participles not only have the highest mean ratios among the other morphological types of participles, but are also strong predictors of ambiguity. The linguistic model using weighted syntactic rules shows the highest accuracy in disambiguation compared to the models with weighted morphological rules or the rule based on weights only. All of the syntactic, morphological, and weighted rules combined show the best performance results. Weights are the most effective for removing residual ambiguity (similar to the statistical baseline model), but are outperformed by the models that use factors of adjectivization as constraints.
PublisherUiT Norges arktiske universitet
UiT The Arctic University of Norway
The following license file are associated with this item: