Course detail

Analysis of Biological Sequences

FEKT-MPC-ABS, Acad. year: 2021/2022

The course provides the statistical foundations and an overview of the core algorithms of sequence analysis. Topics include background on probability, hidden Markov models, and multiple hypothesis testing. The sequence analysis algorithms covered include optimal pairwise local and global alignment, multiple alignment, gene finding, and the construction of phylogenetic trees.

Language of instruction

Czech

Number of ECTS credits

6

Mode of study

Not applicable.

Learning outcomes of the course unit

The student will be able to:
- describe the basic methods of computer processing of symbolic sequences,
- explain the characteristics of DNA and protein evolution,
- describe the principles of methods for constructing and analysing phylogenetic trees,
- discuss the advantages and disadvantages of these methods,
- explain the principle of numerical conversion of symbolic biological sequences.

Prerequisites

The student should be able to explain the fundamental principles of genetics, should know the basic terms and laws of molecular biology, and should have a basic knowledge of digital signal processing. In general, knowledge at the Bachelor's degree level is required.

Co-requisites

Not applicable.

Planned learning activities and teaching methods

Teaching methods include lectures and computer laboratories. The course makes use of the e-learning system (Moodle). Students write projects/assignments during the course.

Assessment methods and criteria linked to learning outcomes

Up to 40 points from computer exercises (3 tests and 1 homework assignment).
Up to 60 points from the final written exam.
The exam verifies the student's orientation in the terminology of advanced processing of biological sequences and the ability to design methods for sequence analysis and to apply operations on sequences.

Course curriculum

1. Genetic variability.
2. Models of sequence evolution.
3. Models of protein evolution.
4. Phylogenetic trees.
5. Construction of phylogenetic trees.
6. Evaluation of phylogenetic analysis.
7. Numerical representation of genomic data.
8. Numerical conversion.
9. Description of protein structure.
10. NGS data processing.
11. Metagenomics.

Work placements

Not applicable.

Aims

The aim of the course is to provide knowledge of advanced methods for the analysis of biological sequences based on deterministic as well as stochastic approaches. Applications cover pairwise alignment, gene finding and phylogenetic trees.

Specification of controlled education, way of implementation and compensation for absences

Computer exercises are compulsory. Excused absences can be made up.

Recommended optional programme components

Not applicable.

Prerequisites and corequisites

Not applicable.

Basic literature

Durbin, R. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 2002. ISBN: 978-0521629713 (EN)
Rosypal, S. Nový přehled biologie. Scientia, Praha 2003. ISBN 80-7183-268-5 (CS)
Srinivasa, K. G. Statistical Modelling and Machine Learning Principles for Bioinformatics Techniques, Tools, and Applications. Springer, 2020. ISBN 978-9811524448 (EN)
Amjesh, R. Bioinformatics for beginners. LAP LAMBERT Academic Publishing, 2019. ISBN 978-6200262851 (EN)

Recommended reading

Pevzner, P. A. An Introduction to Bioinformatics Algorithms (Computational Molecular Biology). The MIT Press, 2004. ISBN: 978-0262101066 (EN)
Kejnovský, E., Hobza, R. Evoluční genomika, Elportál, Brno: Masarykova univerzita, 2006. ISSN 1802-128X (CS)

eLearning

Classification of course in study plans

  • Programme MPC-BTB Master's, 1. year of study, summer semester, compulsory

Type of course unit

 

Lecture

26 hours, optional

Teacher / Lecturer

Syllabus

1. Probability concepts in basic molecular biology.
2. Classic and modern pairwise alignment algorithms.
3. Statistical significance of alignment scores and interpretation of the output of alignment algorithms.
4. Mechanism and use of dynamic programming.
5. Implementation of the Needleman-Wunsch and Smith-Waterman algorithms.
6. Multiple alignment and phylogenetic reconstruction.
7. Evolution assumed by different models and algorithms.
8. Likelihood approach to phylogenetic reconstruction.
9. Markov models and hidden Markov models (HMM) in the genomic context.
10. Essential algorithms for making inference on HMM.
11. HMMs for gene finding.
12. Other gene-finding algorithms.
13. Important algorithmic and statistical advances in bioinformatics that address biologically important questions.
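
The dynamic-programming approach to global alignment named in items 4 and 5 can be illustrated with a minimal Python sketch. It assumes a linear gap penalty; the function name needleman_wunsch and the scoring values (match=1, mismatch=-1, gap=-2) are illustrative assumptions, not taken from the course materials.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning the prefixes a[:i] and b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the table row by row.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The Smith-Waterman local alignment differs mainly in that table cells are not allowed to drop below zero and the traceback starts from the highest-scoring cell rather than the corner.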

Exercise in computer lab

26 hours, compulsory

Teacher / Lecturer

Syllabus

1. Classical and Bayesian probability.
2. Pairwise alignment algorithms.
3. Computing alignment scores and interpreting the output of alignment algorithms.
4. Algorithms for dynamic programming.
5. Implementation of the Needleman-Wunsch and Smith-Waterman algorithms.
6. Multiple alignment.
7. Tracking sequence evolution.
8. Phylogenetic reconstruction.
9. Markov models in the genomic context.
10. Hidden Markov models in the genomic context.
11. HMMs for gene finding I.
12. HMMs for gene finding II.
13. Other gene-finding algorithms.
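
Inference on a hidden Markov model in the genomic context (items 9 to 12) can be illustrated with a minimal Python sketch of Viterbi decoding. The two-state model below (H for a GC-rich region, L for an AT-rich region), all its probabilities and the function name viterbi are hypothetical values chosen for illustration, not taken from the course materials.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for a non-empty observation string.

    Probabilities are combined in log space to avoid numerical underflow.
    """
    # v[t][s] = log-probability of the best path that ends in state s at time t.
    v = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]  # back[t][s] = predecessor of s on that best path

    for t in range(1, len(obs)):
        v.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: v[t - 1][p] + math.log(trans_p[p][s]))
            v[t][s] = (v[t - 1][prev] + math.log(trans_p[prev][s])
                       + math.log(emit_p[s][obs[t]]))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: v[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return "".join(reversed(path))


# Hypothetical toy model: H emits mostly G/C, L emits mostly A/T.
STATES = ("H", "L")
START = {"H": 0.5, "L": 0.5}
TRANS = {"H": {"H": 0.9, "L": 0.1}, "L": {"H": 0.1, "L": 0.9}}
EMIT = {
    "H": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "L": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}

if __name__ == "__main__":
    print(viterbi("GCGCGCATATAT", STATES, START, TRANS, EMIT))
```

Gene-finding HMMs use the same decoding step with a richer state set (for example exon, intron and intergenic states); the two-state model here only shows the mechanics.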

Project

13 hours, compulsory

Teacher / Lecturer

Syllabus

Individual projects from the area of analysis of biological sequences.
