R&D Result Detail

Original Title

Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery

English Title

Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery

Type

WoS Article

Original Abstract

This work investigates subspace non-parametric models for the task of learning a set of acoustic units from unlabeled speech recordings. We constrain the base-measure of a Dirichlet-Process mixture with a phonetic subspace estimated from other source languages to build an educated prior, thereby forcing the learned acoustic units to resemble phones of known source languages. Two types of models are proposed: (i) the Subspace HMM (SHMM) which assumes that the phonetic subspace is the same for every language, (ii) the Hierarchical-Subspace HMM (H-SHMM) which relaxes this assumption and allows to have a language-specific subspace estimated on the unlabeled target data. These models are applied on 3 languages: English, Yoruba and Mboshi and they are compared with various competitive acoustic units discovery baselines. Experimental results show that both subspace models outperform other systems in terms of clustering quality and segmentation accuracy. Moreover, we observe that the H-SHMM provides results superior to the SHMM supporting the idea that language-specific priors are preferable to language-agnostic priors for acoustic unit discovery.

English abstract

This work investigates subspace non-parametric models for the task of learning a set of acoustic units from unlabeled speech recordings. We constrain the base-measure of a Dirichlet-Process mixture with a phonetic subspace estimated from other source languages to build an educated prior, thereby forcing the learned acoustic units to resemble phones of known source languages. Two types of models are proposed: (i) the Subspace HMM (SHMM) which assumes that the phonetic subspace is the same for every language, (ii) the Hierarchical-Subspace HMM (H-SHMM) which relaxes this assumption and allows to have a language-specific subspace estimated on the unlabeled target data. These models are applied on 3 languages: English, Yoruba and Mboshi and they are compared with various competitive acoustic units discovery baselines. Experimental results show that both subspace models outperform other systems in terms of clustering quality and segmentation accuracy. Moreover, we observe that the H-SHMM provides results superior to the SHMM supporting the idea that language-specific priors are preferable to language-agnostic priors for acoustic unit discovery.

Keywords

Unsupervised learning, non-parametric Bayesian models, acoustic unit discovery

Key words in English

Unsupervised learning, non-parametric Bayesian models, acoustic unit discovery

Authors

ONDEL YANG, L.; YUSUF, B.; BURGET, L.; SARAÇLAR, M.

RIV year

2023

Released

03.05.2022

ISSN

2329-9290

Periodical

IEEE-ACM Transactions on Audio Speech and Language Processing

Volume

30

Number

5

State

United States of America

Pages from

1902

Pages to

1917

Pages count

16

URL

https://ieeexplore.ieee.org/document/9767690

BibTex

@article{BUT178412,
  author="ONDEL YANG, L. and YUSUF, B. and BURGET, L. and SARAÇLAR, M.",
  title="Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery",
  journal="IEEE-ACM Transactions on Audio Speech and Language Processing",
  year="2022",
  volume="30",
  number="5",
  pages="1902--1917",
  doi="10.1109/TASLP.2022.3171975",
  issn="2329-9290",
  url="https://ieeexplore.ieee.org/document/9767690"
}

Documents

ondel_ieee_acm-tslp2022_Non-Parametric_Bayesian_Subspace_Models_for_Acoustic_Unit_Discovery
