Dilated temporal relational adversarial network for generic video summarization

Permanent link
https://hdl.handle.net/10037/17624
DOI
https://doi.org/10.1007/s11042-019-08175-y
File
article.pdf (12.57 MB)
Accepted manuscript version (PDF)
Date
2019-10-12
Type
Journal article
Peer reviewed

Authors
Zhang, Yujia; Kampffmeyer, Michael C.; Liang, Xiaodan; Zhang, Dingwen; Tan, Min; Xing, Eric P.
Abstract
The large number of videos appearing every day makes it increasingly critical that key information within videos can be extracted and understood in a very short time. Video summarization, the task of finding the smallest subset of frames that still conveys the whole story of a given video, is thus of great significance for improving the efficiency of video understanding. We propose a novel Dilated Temporal Relational Generative Adversarial Network (DTR-GAN) to achieve frame-level video summarization. Given a video, it selects the set of key frames that contain the most meaningful and compact information. Specifically, DTR-GAN learns a dilated temporal relational generator and a discriminator with a three-player loss in an adversarial manner. A new dilated temporal relation (DTR) unit is introduced to enhance the capture of temporal representations. The generator uses this unit to effectively exploit global multi-scale temporal context to select key frames and to complement the commonly used Bi-LSTM. To ensure that summaries capture enough key video representation from a global perspective, rather than being a trivial, randomly shortened sequence, we present a discriminator that learns to enforce both the information completeness and compactness of summaries via a three-player loss. The loss includes the generated summary loss, the random summary loss, and the real summary (ground-truth) loss, which play important roles in better regularizing the learned model to obtain useful summaries. Comprehensive experiments on three public datasets show the effectiveness of the proposed approach.
Description
This is a post-peer-review, pre-copyedit version of an article published in Multimedia Tools and Applications. The final authenticated version is available online at: https://doi.org/10.1007/s11042-019-08175-y.
Publisher
Springer Nature
Citation
Zhang Y, Kampffmeyer MC, Liang X, Zhang D, Tan M, Xing EP. Dilated temporal relational adversarial network for generic video summarization. Multimedia Tools and Applications. 2019;78(24):35237-35261
Collections
  • Articles, reports and other (physics and technology) [1057]
Copyright © 2019, Springer Nature
