dc.contributor.advisor | Anshus, Otto | |
dc.contributor.advisor | Bjørndalen, John Markus | |
dc.contributor.author | Hansen, Roberth | |
dc.date.accessioned | 2018-06-20T08:10:41Z | |
dc.date.available | 2018-06-20T08:10:41Z | |
dc.date.issued | 2018-05-14 | |
dc.description.abstract | The Distributed Arctic Observatory (DAO) aims to automate, streamline and improve the collection, storage and analysis of images, video and weather measurements taken on the Arctic tundra. Automating the process means that no human user needs to be involved in it, which in turn reduces the ability to monitor the process. Existing tools are insufficient for letting a human user monitor the process and analyze the volume of collected data.
This dissertation presents a prototype of a system to aid researchers in monitoring and analyzing metadata about a dataset. The approach is a system that collects metadata over time, stores it in memory and visualizes it for a human user.
The architecture comprises three abstractions: Dataset, Instrument and Visualization. The Dataset contains metadata. The Instrument extracts the metadata and supplies it to the Visualization abstraction.
The design comprises a Dataset, Metadata extractor, Dataset server, Web server and Visualization. The Dataset is a file system. The Metadata extractor collects metadata from the dataset. The Dataset server stores the collected metadata. The Web server requests metadata from the dataset server and supplies it to a web browser. The Visualization uses the metadata to create visualizations.
The Metadata extractor is a prototype written in Python and is executed manually as a process. The Dataset server uses Redis as an in-memory database, and Redis is also executed manually as a process. Redis supports a selection of data structures, which enables a logical mapping of metadata. The Web server is implemented using the Django web framework and is served by Gunicorn and Nginx. The Visualization is implemented in JavaScript, mainly using Google Charts to create the visualizations.
A set of experiments was conducted to document performance metrics for the prototype. The results show that we can serve about 2500 web pages to 10 concurrent connections with a latency below 5 ms. The results also show that we can store 100 million key-value pairs in 9 GB of memory. Our calculations indicate that it will take over 690 years to reach a 9 GB memory footprint with the current structure of metadata.
This dissertation designs, implements and evaluates an artifact prototype that allows researchers to monitor and analyze metadata about a dataset over time. We contribute an architecture and design that enables and supports the creation of visualizations of organized and processed metadata. The artifact validates the use of in-memory storage for the historical metadata. | en_US |
dc.identifier.uri | https://hdl.handle.net/10037/12889 | |
dc.language.iso | eng | en_US |
dc.publisher | UiT Norges arktiske universitet | en_US |
dc.publisher | UiT The Arctic University of Norway | en_US |
dc.rights.accessRights | openAccess | en_US |
dc.rights.holder | Copyright 2018 The Author(s) | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-sa/3.0 | en_US |
dc.rights | Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) | en_US |
dc.subject.courseID | INF-3990 | |
dc.subject | VDP::Technology: 500::Information and communication technology: 550::Computer technology: 551 | en_US |
dc.subject | VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550::Datateknologi: 551 | en_US |
dc.title | Metadata state and history service for datasets.
Enable extracting, storing and access to metadata about a dataset over time. | en_US |
dc.type | Master thesis | en_US |
dc.type | Mastergradsoppgave | en_US |
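
The storage figures quoted in the abstract (100 million key-value pairs in 9 GB, and more than 690 years to reach that footprint) can be related to each other with some simple arithmetic. The sketch below is illustrative only and is not taken from the thesis. It assumes that 1 GB means 10**9 bytes and that real metadata entries have the same average size as the key-value pairs in the experiment; neither assumption is stated in the record, and the variable names are hypothetical.

```python
# Back-of-envelope check of the storage figures quoted in the abstract.
# Assumption (not stated in the source): 1 GB = 10**9 bytes.
BYTES_PER_GB = 10**9

pairs_stored = 100_000_000  # key-value pairs stored in the experiment
memory_gb = 9               # memory used by those pairs
years_to_fill = 690         # lower bound quoted in the abstract

# Average memory cost of one key-value pair, including Redis overhead.
bytes_per_pair = memory_gb * BYTES_PER_GB / pairs_stored

# Assumption: production metadata has the same average pair size, so
# 9 GB corresponds to 100 million pairs. Because the abstract says
# "over 690 years", these rates are upper bounds.
pairs_per_year = pairs_stored / years_to_fill
pairs_per_day = pairs_per_year / 365

print(f"average bytes per pair: {bytes_per_pair:.1f}")
print(f"implied pairs per year (upper bound): {pairs_per_year:.0f}")
print(f"implied pairs per day (upper bound): {pairs_per_day:.0f}")
```

Under these assumptions each pair costs roughly 90 bytes, and the implied growth is at most a few hundred pairs per day, which is consistent with the abstract's conclusion that in-memory storage is sufficient for the historical metadata.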