
A Two-Stage Deep Modeling Approach to Articulatory Inversion

Permanent link
https://hdl.handle.net/10037/31359
DOI
https://doi.org/10.1109/ICASSP39728.2021.9413742
Accepted manuscript version (PDF): article.pdf (328.1Kb)
Date
2021-05-13
Type
Book chapter

Authors
Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Imran, Ali Shariq; Johnsen, Magne Hallstein; Siniscalchi, Sabato Marco; Svendsen, Torbjørn Karl
Abstract
This paper proposes a two-stage deep feed-forward neural network (DNN) to tackle the acoustic-to-articulatory inversion (AAI) problem. DNNs are a viable solution for the AAI task, but the temporal continuity of the estimated articulatory values has not been exploited properly when a DNN is employed. In this work, we propose to address the lack of temporal constraints while keeping the solution parameter-parsimonious by deploying a two-stage approach based only on DNNs: (i) articulatory trajectories are estimated in a first stage using a DNN, and (ii) a temporal window of the estimated trajectories is used in a follow-up DNN stage as a refinement. The first-stage estimate can be thought of as auxiliary information that imposes some constraints on the inversion process. Experimental evidence demonstrates an average error reduction of 7.51% in terms of RMSE compared to the baseline, and an improvement of 2.39% in Pearson correlation is also attained. Finally, we should point out that AAI is still a highly challenging problem, mainly due to the non-linear and one-to-many nature of the acoustic-to-articulatory mapping. It is thus promising that a significant improvement was attained with our simple yet elegant solution.
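The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, feature dimensions, window half-width, edge padding, and the helper names (`stack_context`, `init_mlp`, `mlp_forward`, `two_stage_forward`) are all hypothetical, the weights are random rather than trained, and the second stage here receives only the window of first-stage estimates, which may differ from the paper's exact input. The `rmse` and `pearson` helpers show how the two reported metrics are commonly computed per articulatory channel.

```python
import numpy as np

def stack_context(frames, half_width):
    """Stack each frame with its neighbours: (T, D) -> (T, (2*half_width+1)*D).
    Edges are padded by repeating the first/last frame (an assumption)."""
    padded = np.pad(frames, ((half_width, half_width), (0, 0)), mode="edge")
    n_frames = frames.shape[0]
    return np.hstack([padded[i:i + n_frames] for i in range(2 * half_width + 1)])

def init_mlp(sizes, rng):
    """Random (untrained) weights for a feed-forward net with the given layer sizes."""
    return [(rng.standard_normal((m, n)) * np.sqrt(1.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """ReLU hidden layers followed by a linear output layer (regression)."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

def two_stage_forward(stage1, stage2, acoustic, half_width):
    """Stage 1: acoustic frame -> articulatory estimate.
    Stage 2: temporal window of stage-1 estimates -> refined estimate."""
    first = mlp_forward(stage1, acoustic)
    windowed = stack_context(first, half_width)
    refined = mlp_forward(stage2, windowed)
    return first, refined

def rmse(y_true, y_pred):
    """Root-mean-square error per articulatory channel."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))

def pearson(y_true, y_pred):
    """Pearson correlation per articulatory channel."""
    a = y_true - y_true.mean(axis=0)
    b = y_pred - y_pred.mean(axis=0)
    return (a * b).sum(axis=0) / np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))

# Hypothetical dimensions: 50 frames, 13 acoustic features, 12 articulatory channels.
rng = np.random.default_rng(0)
n_frames, n_acoustic, n_art, half_width = 50, 13, 12, 5
acoustic = rng.standard_normal((n_frames, n_acoustic))
stage1 = init_mlp([n_acoustic, 64, n_art], rng)
stage2 = init_mlp([(2 * half_width + 1) * n_art, 64, n_art], rng)
first, refined = two_stage_forward(stage1, stage2, acoustic, half_width)
print(first.shape, refined.shape)
```

In a real experiment both stages would be trained against measured articulatory trajectories, and `rmse` and `pearson` would be evaluated on held-out data for the first-stage and refined outputs.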
Publisher
IEEE
Citation
Sabzi Shahrebabaki, Olfati, Imran, Johnsen, Siniscalchi, Svendsen: A Two-Stage Deep Modeling Approach to Articulatory Inversion. In: Androutsos, Plataniotis K, Zhang X. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021. IEEE
Collections
  • Articles, reports and other (electrical engineering) [127]
Copyright 2021 The Author(s)
