Publication result details

STOPA: A Database of Systematic VariaTion Of DeePfake Audio for Source Tracing and Attribution

FIRC, A.; CHHIBBER, M.; MISHRA, J.; SINGH, V.; KINNUNEN, T.; MALINKA, K.

Original title

STOPA: A Database of Systematic VariaTion Of DeePfake Audio for Source Tracing and Attribution

Type

Paper in proceedings indexed in WoS or Scopus

Original abstract

A key research area in deepfake speech detection is source tracing - determining the origin of synthesised utterances. The approaches may involve identifying the acoustic model (AM), vocoder model (VM), or other generation-specific parameters. However, progress is limited by the lack of a dedicated, systematically curated dataset. To address this, we introduce STOPA, a systematically varied and metadata-rich dataset for deepfake speech source tracing, covering 8 AMs, 6 VMs, and diverse parameter settings across 700k samples from 13 distinct synthesisers. Unlike existing datasets, which often feature limited variation or sparse metadata, STOPA provides a systematically controlled framework covering a broader range of generative factors, such as the choice of the vocoder model, acoustic model, or pretrained weights, ensuring higher attribution reliability. This control improves attribution accuracy, aiding forensic analysis, deepfake detection, and generative model transparency.

Keywords

source tracing, dataset, anti-spoofing, synthetic speech, deepfake

Authors

FIRC, A.; CHHIBBER, M.; MISHRA, J.; SINGH, V.; KINNUNEN, T.; MALINKA, K.

Published

17 August 2025

Publisher

International Speech Communication Association

Place

Rotterdam, The Netherlands

Book

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2025

Periodical

Interspeech

Country

Netherlands

Pages from

1553

Pages to

1557

Number of pages

5

URL

https://www.isca-archive.org/interspeech_2025/firc25_interspeech.html

BibTeX

@inproceedings{BUT196844,
  author="Anton {Firc} and M. {Chhibber} and J. {Mishra} and V. {Singh} and Tomi {Kinnunen} and Kamil {Malinka}",
  title="STOPA: A Database of Systematic VariaTion Of DeePfake Audio for Source Tracing and Attribution",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2025",
  year="2025",
  journal="Interspeech",
  pages="1553--1557",
  publisher="International Speech Communication Association",
  address="Rotterdam, The Netherlands",
  doi="10.21437/Interspeech.2025-2065",
  url="https://www.isca-archive.org/interspeech_2025/firc25_interspeech.html"
}