
dc.contributor.advisor: Johansen, Håvard
dc.contributor.advisor: Jenssen, Robert
dc.contributor.author: Kampffmeyer, Michael Christian
dc.date.accessioned: 2015-08-26T13:11:34Z
dc.date.available: 2015-08-26T13:11:34Z
dc.date.issued: 2015-06-15
dc.description.abstract: Collaborative filtering recommender systems have become essential to many Internet services, powering, for instance, book recommendations at Amazon's e-commerce service, music recommendations at Spotify, and movie recommendations at Netflix. Matrix factorization and Restricted Boltzmann Machines (RBMs) are two popular methods for implementing recommender systems, both offering superior accuracy to common neighborhood models. Both methods also shift much of the computation from the prediction phase to the model training phase, enabling fast predictions once the model has been trained. This thesis proposes a novel approach for performing matrix factorization with the Alternating-Least-Squares with Weighted-Lambda-Regularization (ALS-WR) algorithm on CUDA (ALS-CUDA). The algorithm is implemented and evaluated in the context of recommender systems by comparing it to other commonly used approaches, namely an RBM and a stochastic gradient descent (SGD) approach. Our evaluation shows that significant speedups can be achieved by using CUDA and GPUs for training recommender systems. The ALS-CUDA algorithm implemented in this thesis achieved speedup factors of up to 175.4 over the sequential CPU ALS implementation and scales linearly with the number of CUDA threads assigned to it until the GPU's shared memory is saturated. Comparing the ALS-CUDA algorithm to CUDA implementations of the SGD and RBM algorithms shows that ALS-CUDA outperformed the RBM. For a sparse dataset, results indicate that ALS-CUDA performs slightly worse than the SGD implementation, while for a dense dataset it outperforms the SGD. In general, however, the advantage of the ALS-CUDA algorithm lies not only in its speed but also in the fact that it requires fewer parameters than the SGD. It therefore represents a viable option when some speed can be traded for algorithmic stability, or when the dataset is dense. (A minimal sketch of the ALS-WR update step is shown after the record fields below.)
dc.identifier.uri: https://hdl.handle.net/10037/7987
dc.identifier.urn: URN:NBN:no-uit_munin_7569
dc.language.iso: eng
dc.publisher: UiT Norges arktiske universitet
dc.publisher: UiT The Arctic University of Norway
dc.rights.accessRights: openAccess
dc.rights.holder: Copyright 2015 The Author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-sa/3.0
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
dc.subject.courseID: INF-3981
dc.subject: VDP::Mathematics and natural science: 400::Mathematics: 410::Applied mathematics: 413
dc.subject: VDP::Mathematics and natural science: 400::Information and communication science: 420::Algorithms and computability theory: 422
dc.title: Parallelization of the Alternating-Least-Squares Algorithm With Weighted Regularization for Efficient GPU Execution in Recommender Systems
dc.type: Master thesis
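
For orientation, the following is a minimal NumPy sketch of the ALS-WR update step the abstract refers to, assuming the standard formulation where each user's (or item's) ridge penalty is weighted by the number of ratings in that row (or column). This is not the thesis's CUDA implementation; the function name als_wr, the parameter names, and the toy ratings matrix are illustrative assumptions. The thesis maps the independent per-user and per-item solves onto CUDA threads.

```python
import numpy as np

def als_wr(R, k=10, lam=0.05, n_iters=10, seed=0):
    """Sketch of ALS with weighted-lambda regularization:
    factor R ~ U @ V.T on the observed entries only."""
    rng = np.random.default_rng(seed)
    mask = R > 0                      # treat zeros as missing ratings
    m, n = R.shape
    U = rng.random((m, k))
    V = rng.random((n, k))
    for _ in range(n_iters):
        # Fix V; solve one independent k-by-k ridge system per user.
        # These per-user solves are the parallelizable work.
        for u in range(m):
            idx = mask[u]
            if not idx.any():
                continue
            Vu = V[idx]                                  # factors of items user u rated
            A = Vu.T @ Vu + lam * idx.sum() * np.eye(k)  # weighted-lambda term
            b = Vu.T @ R[u, idx]
            U[u] = np.linalg.solve(A, b)
        # Fix U; the per-item update is symmetric.
        for i in range(n):
            idx = mask[:, i]
            if not idx.any():
                continue
            Ui = U[idx]
            A = Ui.T @ Ui + lam * idx.sum() * np.eye(k)
            b = Ui.T @ R[idx, i]
            V[i] = np.linalg.solve(A, b)
    return U, V

# Toy usage example (0 = unrated); predictions fill in the missing cells.
R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [0., 1., 5., 4.]])
U, V = als_wr(R, k=2, lam=0.1, n_iters=20)
print(np.round(U @ V.T, 2))
```

Note how training does the expensive work (repeated linear solves) while prediction is a single matrix product, which matches the abstract's point about shifting computation from the prediction phase to the training phase.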

