Course detail
Speech Signal Processing (in English)
FIT-ZREe
Acad. year: 2023/2024
Applications of computer speech processing, digital processing of speech signals, speech production and perception, introduction to phonetics, pre-processing and basic parameters, linear-predictive model, cepstrum, determination of fundamental frequency, coding (time domain and vocoders), recognition (DTW and HMM), synthesis. Software and libraries for speech processing.
Language of instruction
Number of ECTS credits
Mode of study
Guarantor
Offered to foreign students
Entry knowledge
Rules for evaluation and completion of the course
- mid-term test
- presentation of projects
- presentation of results in computer labs
Aims
Students will become familiar with the basic characteristics of the speech signal in relation to human speech production and hearing. They will understand the basic speech analysis algorithms common to many applications. They will be given an overview of applications (recognition, synthesis, coding) and learn about practical aspects of implementing speech algorithms. Students will be able to design a simple speech processing system (a speech activity detector, a recognizer of a limited number of isolated words), including its implementation in application programs.
Study aids
Prerequisites and corequisites
Basic literature
Recommended reading
Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0
Elearning
Classification of course in study plans
Type of course unit
Lecture
Teacher / Lecturer
Syllabus
- Introduction, applications of speech processing, disciplines relevant to speech processing, information content of speech.
- Digital processing of speech signals.
- Speech production and perception, basic notions from psycho-acoustics, applications in speech processing.
- Introduction to phonetics, international standards for phoneme notation.
- Pre-processing and basic parameters of speech.
- Linear-predictive model, spectrum using LP, applications of LP.
- Cepstral analysis, Mel-frequency cepstrum.
- Determination of fundamental frequency.
- Speech coding.
- Speech recognition: dynamic time warping (DTW) based on dynamic programming, hidden Markov models (HMM).
- Speech synthesis.
- Software and libraries for speech processing.
Exercise in computer lab
Teacher / Lecturer
Syllabus
- Matlab is used in all labs except the last one.
- Frames, windows, spectrum, pre-processing.
- Linear prediction (LPC).
- Fundamental frequency estimation.
- Coding.
- Recognition - dynamic time warping (DTW).
- Recognition - hidden Markov models (Hidden Markov Model Toolkit - HTK).