Publication result detail

Learning document representations using subspace multinomial model

KESIRAJU, S.; BURGET, L.; SZŐKE, I.; ČERNOCKÝ, J.

Original title

Learning document representations using subspace multinomial model

Type

Paper in proceedings indexed in the WoS or Scopus database

Original abstract

Subspace multinomial model (SMM) is a log-linear model and can be used for learning low dimensional continuous representation for discrete data. SMM and its variants have been used for speaker verification based on prosodic features and phonotactic language recognition. In this paper, we propose a new variant of SMM that introduces sparsity and call the resulting model ℓ1 SMM. We show that ℓ1 SMM can be used for learning document representations that are helpful in topic identification or classification and clustering tasks. Our experiments in document classification show that SMM achieves comparable results to models such as latent Dirichlet allocation and sparse topical coding, while having a useful property that the resulting document vectors are Gaussian distributed.
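
To make the phrase "log-linear model" concrete, the following is a minimal sketch of how such a model could map a low-dimensional document vector to a distribution over words, together with a sparsity-inducing ℓ1 penalty. It is an illustration under stated assumptions, not the authors' implementation: the names m (per-word bias), T (word-by-dimension basis matrix), w (document vector) and lam (penalty weight) are hypothetical, and placing the ℓ1 penalty on T is an assumption, since the abstract only says that the variant introduces sparsity.

```python
import math


def smm_word_probs(m, T, w):
    """Word distribution of one document under a log-linear (softmax) model.

    m: list of V bias terms, T: V x K matrix given as a list of rows,
    w: K-dimensional document vector. All names are illustrative.
    """
    # Logit of each word: bias plus the inner product of its row with w.
    logits = [m_i + sum(t * w_k for t, w_k in zip(row, w))
              for m_i, row in zip(m, T)]
    # Subtract the maximum before exponentiating for numerical stability.
    mx = max(logits)
    exps = [math.exp(z - mx) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l1_penalised_objective(counts, m, T, w, lam):
    """Negative multinomial log-likelihood of the word counts plus an
    l1 penalty on the entries of T (the penalty placement is an assumption)."""
    probs = smm_word_probs(m, T, w)
    nll = -sum(c * math.log(p) for c, p in zip(counts, probs))
    l1 = lam * sum(abs(t) for row in T for t in row)
    return nll + l1


# Toy example: a vocabulary of 3 words and a 2-dimensional document vector.
m = [0.0, 0.0, 0.0]
T = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
print(smm_word_probs(m, T, [0.0, 0.0]))
print(l1_penalised_objective([1, 1, 1], m, T, [0.0, 0.0], lam=0.5))
```

In this sketch the document vector w plays the role of the learned document representation; training would minimise the objective over m, T and the per-document vectors, which is not shown here.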

Keywords

Document representation, subspace modelling, topic identification, latent topic discovery

RIV year

2017

Published

8 September 2016

Publisher

International Speech Communication Association

Place

San Francisco

ISBN

978-1-5108-3313-5

Book

Proceedings of Interspeech 2016

Pages from

700

Pages to

704

Number of pages

5

BibTeX

@inproceedings{BUT132598,
  author="Santosh {Kesiraju} and Lukáš {Burget} and Igor {Szőke} and Jan {Černocký}",
  title="Learning document representations using subspace multinomial model",
  booktitle="Proceedings of Interspeech 2016",
  year="2016",
  pages="700--704",
  publisher="International Speech Communication Association",
  address="San Francisco",
  doi="10.21437/Interspeech.2016-1634",
  isbn="978-1-5108-3313-5",
  url="https://www.researchgate.net/publication/307889473_Learning_Document_Representations_Using_Subspace_Multinomial_Model"
}
