Publication result detail

Unsupervised Word Segmentation from Speech with Attention

GODARD, P.; BOITO, M.; ONDEL YANG, L.; BERARD, A.; YVON, F.; VILLAVICENCIO, A.; BESACIER, L.

Original title

Unsupervised Word Segmentation from Speech with Attention

Type

Paper in proceedings indexed in WoS or Scopus

Original abstract

We present a first attempt to perform attentional word segmentation directly from the speech signal, with the final goal to automatically identify lexical units in a low-resource, unwritten language (UL). Our methodology assumes a pairing between recordings in the UL with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones that is segmented using neural soft-alignments produced by a neural machine translation model. Evaluation uses an actual Bantu UL, Mboshi; comparisons to monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.

Keywords

computational language documentation, encoder-decoder models, attentional models, unsupervised word segmentation.

Authors

GODARD, P.; BOITO, M.; ONDEL YANG, L.; BERARD, A.; YVON, F.; VILLAVICENCIO, A.; BESACIER, L.

RIV year

2020

Published

2 September 2018

Publisher

International Speech Communication Association

Place

Hyderabad

Book

Proceedings of Interspeech 2018

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Volume

2018

Number

9

Country

France

Pages from

2678

Pages to

2682

Number of pages

5

URL

https://www.isca-speech.org/archive/Interspeech_2018/pdfs/1308.pdf

BibTeX

@inproceedings{BUT163406,
  author="GODARD, P. and BOITO, M. and ONDEL YANG, L. and BERARD, A. and YVON, F. and VILLAVICENCIO, A. and BESACIER, L.",
  title="Unsupervised Word Segmentation from Speech with Attention",
  booktitle="Proceedings of Interspeech 2018",
  year="2018",
  journal="Proceedings of Interspeech",
  volume="2018",
  number="9",
  pages="2678--2682",
  publisher="International Speech Communication Association",
  address="Hyderabad",
  doi="10.21437/Interspeech.2018-1308",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/Interspeech_2018/pdfs/1308.pdf"
}

Documents