
PTUS: Photo-Realistic Talking Upper-Body Synthesis via 3D-Aware Motion Decomposition Warping

Permanent link
https://hdl.handle.net/10037/36705
DOI
https://doi.org/10.1609/aaai.v38i4.28131
View/Open
article.pdf (7.084 MB)
Accepted manuscript version (PDF)
Date
2024-03-24
Type
Journal article
Peer reviewed

Author
Lin, Luoyang; Jiang, Zutao; Liang, Xiaodan; Ma, Liqian; Kampffmeyer, Michael Christian; Cao, Xiaochun
Abstract
Talking upper-body synthesis is a promising task due to its versatile potential for video creation: it consists of animating the body and face from a source image with the motion from a given driving video. However, prior synthesis approaches fall short on this task, being either limited to animating only the head of a target person, or animating the upper body while neglecting the synthesis of precise facial details. To tackle this task, we propose a Photo-realistic Talking Upper-body Synthesis method via 3D-aware motion decomposition warping, named PTUS, which both precisely synthesizes the upper body and recovers facial details such as blinking and lip synchronization. In particular, the motion decomposition mechanism consists of a face-body motion decomposition, which decouples the 3D motion estimation of the face and body, and a local-global motion decomposition, which decomposes the 3D face motion into global and local motions, enabling the transfer of facial expressions. The 3D-aware warping module transfers the large-scale and subtle 3D motions to the extracted 3D depth-aware features in a coarse-to-fine manner. Moreover, we present a new dataset, Talking-UB, which includes upper-body images with high-resolution faces, addressing the limitations of prior datasets that consist of either only facial images or upper-body images with blurry faces. Experimental results demonstrate that our proposed method can synthesize high-quality videos that preserve facial details, and achieves superior results compared to state-of-the-art cross-person motion transfer approaches. Code and the collected dataset are released at https://github.com/cooluoluo/PTUS.
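The local-global motion decomposition mentioned in the abstract admits a compact illustration. The sketch below is not the authors' implementation (PTUS estimates these components with learned networks; see the linked repository): it only mirrors the decomposition's intent by splitting the motion between source and driving 3D face keypoints into a global rigid part (head pose, solved with the standard Kabsch algorithm) and a local residual capturing expression-like deformation such as blinking and lip motion. The function name and the 68-landmark count are illustrative assumptions.

```python
# Minimal sketch of the local-global face-motion decomposition idea.
# NOT the PTUS implementation: a closed-form Kabsch solve stands in for
# the learned motion estimators described in the paper.
import numpy as np

def decompose_face_motion(src_pts: np.ndarray, drv_pts: np.ndarray):
    """Split the motion src_pts -> drv_pts (both N x 3) into a global
    rigid transform (R, t) and per-point local offsets."""
    # Global motion: best-fit rotation via the Kabsch algorithm.
    src_c = src_pts - src_pts.mean(axis=0)
    drv_c = drv_pts - drv_pts.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ drv_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = drv_pts.mean(axis=0) - R @ src_pts.mean(axis=0)
    # Local motion: the deformation the rigid transform cannot explain,
    # e.g. blinking or lip movement.
    local = drv_pts - (src_pts @ R.T + t)
    return R, t, local

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(size=(68, 3))                # 68 face landmarks (assumed)
    theta = np.deg2rad(10.0)                      # synthetic 10-degree head turn
    Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
    lip = 0.02 * rng.normal(size=src.shape)       # small "expression" offsets
    drv = src @ Rz.T + np.array([0.1, 0.0, 0.0]) + lip
    R, t, local = decompose_face_motion(src, drv)
    print("recovered global rotation ok:", np.allclose(R, Rz, atol=1e-1))
    print("max local (expression) magnitude:", float(np.abs(local).max()))
```

In the paper itself these motion components feed the 3D-aware warping module, which applies them to depth-aware features coarse-to-fine; the closed-form solve above only demonstrates how a global rigid motion and a local residual can be separated.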
Publisher
Association for the Advancement of Artificial Intelligence
Citation
Lin L, Jiang Z, Liang X, Ma L, Kampffmeyer MC, Cao X. PTUS: Photo-Realistic Talking Upper-Body Synthesis via 3D-Aware Motion Decomposition Warping. Proceedings of the AAAI Conference on Artificial Intelligence. 2024;38(4)
Collections
  • Artikler, rapporter og annet (fysikk og teknologi) [1062]
Copyright 2024 The Author(s)
