Course detail
Speech Signal Analysis and Synthesis
FEKT-LASS
Acad. year: 2010/2011
Phonetic description of the Czech language, signal windowing, preemphasis, pitch estimation. Representations of speech in the time and frequency domains, short-time analysis of the speech signal, selection of suitable features, word endpoint detection, linear and nonlinear time warping, isolated word recognition systems, connected word recognition, suitable speech units and features for speaker recognition, hidden Markov models, speaker identification, speaker verification, speech synthesis, vocoders, and some typical applications of speech and speaker recognition.
Language of instruction
Number of ECTS credits
Mode of study
Guarantor
Department
Learning outcomes of the course unit
Prerequisites
Co-requisites
Planned learning activities and teaching methods
Assessment methods and criteria linked to learning outcomes
Course curriculum
Vocal tract model, phonetic description of the Czech language.
Preprocessing of the speech signal: windowing, preemphasis.
Energy, zero-crossing rate and autocorrelation function.
Linear prediction coding and derived coefficients.
Cepstral analysis of the speech signal.
Estimation of the fundamental frequency of speech.
Linear and nonlinear time alignment.
Deterministic and statistical classifiers, hidden Markov models.
Classifier training, error rate estimation.
Voice recognition, speaker verification and identification.
Speech synthesis methods.
Speech coding and transmission, basic types of vocoders.
Work placements
Aims
Specification of controlled education, way of implementation and compensation for absences
Recommended optional programme components
Prerequisites and corequisites
Basic literature
SIGMUND,M. Analýza řečových signálů. Skriptum FEKT VUT, Brno 2000.
Recommended reading
Classification of course in study plans
Type of course unit
Lecture
Teacher / Lecturer
Syllabus
02 Vocal tract model, phonetic description of the Czech language.
03 Preprocessing of the speech signal: windowing, preemphasis.
04 Energy, zero-crossing rate and autocorrelation function.
05 Linear prediction coding and derived coefficients.
06 Cepstral analysis of the speech signal.
07 Estimation of the fundamental frequency of speech.
08 Linear and nonlinear time alignment.
09 Deterministic and statistical classifiers, hidden Markov models.
10 Classifier training, error rate estimation.
11 Voice recognition, speaker verification and identification.
12 Speech synthesis methods.
13 Speech coding and transmission, basic types of vocoders.
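As an illustration of the preprocessing and short-time analysis topics listed above (preemphasis, windowing, energy, zero-crossing rate), the following is a minimal sketch in plain Python. It is not taken from the course materials: the function names, the preemphasis coefficient of 0.97, and the framing scheme are assumptions chosen for the example.

```python
import math

def preemphasize(signal, alpha=0.97):
    """First-order preemphasis: y[n] = x[n] - alpha * x[n-1].
    alpha = 0.97 is a commonly used value, assumed here."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def hamming(length):
    """Symmetric Hamming window of the given length."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def frames(signal, frame_len, hop):
    """Split the signal into overlapping frames; an incomplete tail is dropped."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame, window):
    """Sum of squared windowed samples in one frame."""
    return sum((x * w) ** 2 for x, w in zip(frame, window))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)
```

In a typical use, voiced frames show high energy and a low zero-crossing rate, while unvoiced frames show the opposite; the thresholds for such a decision depend on the recording and are not fixed by this sketch.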
Exercise in computer lab
Teacher / Lecturer
Syllabus
Spectrum of typical vowel sounds, formant frequencies.
Spectrum analysis using Hamming and rectangular window.
Short-time energy and zero-crossings for (un)voiced speech.
Detection of speech/pause and word boundaries.
Linear prediction of speech waveform and derived spectra.
Transformations between speech features.
Correlations between various speech signal parameters.
Calculation of several distances between speech frames.
Automatic recognition of an unknown word.
Segmentation of a word string into phonetic units.
Measuring the fundamental frequency by center clipping.
Cepstral analysis for voiced speech.
Identification of different speakers.
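The fundamental-frequency exercise listed above can be sketched roughly as follows: center-clip a frame, then pick the lag with the largest autocorrelation. This is an illustrative sketch only, not the lab's own procedure. The clipping ratio of 0.3, the 60 to 400 Hz search range, the voicing threshold, and all function names are assumptions, and the frame is assumed to be non-empty.

```python
import math

def center_clip(frame, ratio=0.3):
    """Zero out samples below the threshold and shift the rest toward zero.
    The threshold is a fraction (ratio, assumed 0.3) of the frame's peak."""
    threshold = ratio * max(abs(x) for x in frame)
    clipped = []
    for x in frame:
        if x > threshold:
            clipped.append(x - threshold)
        elif x < -threshold:
            clipped.append(x + threshold)
        else:
            clipped.append(0.0)
    return clipped

def autocorrelation(frame, lag):
    """Unnormalized short-time autocorrelation at one lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

def estimate_f0(frame, sample_rate, f_min=60.0, f_max=400.0, ratio=0.3):
    """Return an F0 estimate in Hz, or None if the frame looks unvoiced."""
    clipped = center_clip(frame, ratio)
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    r0 = autocorrelation(clipped, 0)
    if r0 == 0.0 or lag_min > lag_max:
        return None
    best_lag = max(range(lag_min, lag_max + 1),
                   key=lambda k: autocorrelation(clipped, k))
    # Assumed voicing check: the peak must reach 30 % of the zero-lag value.
    if autocorrelation(clipped, best_lag) < 0.3 * r0:
        return None
    return sample_rate / best_lag

# Usage: a synthetic 100 Hz tone sampled at 8 kHz, 60 ms frame.
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(480)]
f0 = estimate_f0(tone, 8000)
```

Center clipping removes low-amplitude detail related to formants, which tends to make the autocorrelation peak at the pitch period more distinct; the frame should span at least two pitch periods for the peak to be found.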