Improving freshness of web-extracted metadata
Permanent link
https://hdl.handle.net/10037/2400
Date
2009-12-18
Type
Master thesis
Author
Heimdal, Tord-Arne
Abstract
Live video search is emerging as a platform for multimedia production and entertainment services.
Such systems rely on a stream of live video and metadata describing the video content.
A high-quality source for such metadata can be found on the web.
Identifying and extracting metadata from web pages can be done by crawling and scraping.
However, general crawler politeness rules limit per-site polling frequency, and therefore the freshness of the retrieved data is also limited.
In this thesis we present a metadata extraction system that combines high metadata freshness with adherence to polling politeness rules.
To achieve this, the proposed solution uses a pool of web sources containing overlapping information scheduled in a round-robin fashion.
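The round-robin idea can be illustrated with a minimal sketch. This is not the thesis implementation: the function name, the source names, and the 60-second politeness interval are hypothetical, and the sketch assumes that every source in the pool carries the same information and is subject to the same minimum polling interval.

```python
from itertools import cycle


def round_robin_schedule(sources, min_interval, horizon):
    """Return (time, source) polls in [0, horizon) for a pool of sources.

    Assumption: each source may be polled at most once every
    `min_interval` seconds (the politeness rule). Sources take turns
    in a fixed order, evenly spaced in time.
    """
    if not sources:
        return []
    # Spacing polls by min_interval / N means each individual source
    # is revisited exactly every min_interval seconds.
    step = min_interval / len(sources)
    schedule = []
    for i, name in enumerate(cycle(sources)):
        now = i * step
        if now >= horizon:
            break
        schedule.append((now, name))
    return schedule


# Hypothetical pool of three overlapping sources, 60 s politeness interval.
polls = round_robin_schedule(["site-a", "site-b", "site-c"],
                             min_interval=60, horizon=120)
for when, name in polls:
    print(f"t={when:5.1f}s  poll {name}")
```

Under these assumptions each site is still contacted only once per 60 seconds, but the pool as a whole yields a fresh observation every 20 seconds, which is the effect the abstract describes.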
Our experiments and analysis show that our system keeps the average metadata freshness higher than any single-source solution while adhering to polling politeness rules.
Publisher
University of Tromsø (Universitetet i Tromsø)
Copyright 2009 The Author(s)