Course detail

Multidimensional Analysis of Biomedical Data

FEKT-MPC-VMM

Acad. year: 2024/2025

The course focuses on commonly used methods of multivariate data analysis: cluster analysis, factor analysis, principal component analysis, t-SNE, UMAP, etc. Both theoretical aspects (the basic principles of each method) and practical aspects (applications in the visualization and analysis of multivariate data) are covered. The theory is presented in direct connection with practical examples. All computational techniques are practiced in Python. The course prepares students to use these methods independently for data analysis in their own scientific or routine work.

 

Language of instruction

Czech

Number of ECTS credits

5

Mode of study

Not applicable.

Entry knowledge

The student should have knowledge of basic statistical data analysis and linear algebra. Knowledge of Python is required for the computer lab exercises.

 

 

Rules for evaluation and completion of the course

1) Team project (max. 20 points):

- Developing an original solution to the team project and defending it at the end of the semester (according to the guidelines)

Notes:

- The completion of the assignment and the quality of the presentation of the results by all team members will be evaluated

- Plagiarism will result in no credit being awarded

- At least one team consultation with the advisor is mandatory!

2) Final exam (max. 80 points):

- oral form

- two parts in total, each for a maximum of 40 points

 

Conditions for credit and admission to the final exam:

- obtaining a non-zero number of points for the team project

- a maximum of two excused absences (exceptional cases are decided by the course supervisor)

 

Conditions for successful completion of the course:

- obtaining credit

- obtaining at least 20 points in each of the two parts of the exam

- obtaining a total (i.e. from the project and the exam) of at least 50 points
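The conditions above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `course_result` and its parameters are hypothetical, and exceptional absence cases, which the course supervisor decides individually, are not modeled.

```python
def course_result(project_points, exam_part1, exam_part2, excused_absences=0):
    """Return (passed, total_points) under the rules listed above.

    Hypothetical helper for illustration only; exceptional absence cases
    decided by the course supervisor are not modeled.
    """
    # Point limits stated in the rules: project max. 20, each exam part max. 40.
    if not 0 <= project_points <= 20:
        raise ValueError("project_points must be between 0 and 20")
    for part in (exam_part1, exam_part2):
        if not 0 <= part <= 40:
            raise ValueError("each exam part must be between 0 and 40")

    # Credit: non-zero project points and at most two excused absences.
    credit = project_points > 0 and excused_absences <= 2

    total = project_points + exam_part1 + exam_part2

    # Completion: credit, at least 20 points in each exam part, at least 50 in total.
    passed = credit and exam_part1 >= 20 and exam_part2 >= 20 and total >= 50
    return passed, total
```

For example, 5 project points with 20 points in each exam part meets the per-part minimum but gives a total of only 45 points, so the course is not completed.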

 

Aims

The aim of the course is to provide students with knowledge in the field of multivariate data analysis and to present the possibilities of using selected procedures in the processing and analysis of biomedical data.

The student will acquire basic knowledge and skills in the use of multivariate analysis methods. The student will be able to apply the most commonly used methods in practice in order to process and analyse data.

The examination verifies that the graduate of the course is able to:

- explain the basic concepts of multivariate analysis,

- describe the basic methods in this field, discuss the advantages and disadvantages of each method,

- select and use appropriate tools for a given problem in this area,

- evaluate the quality of obtained results and present them in an appropriate form,

- interpret the obtained results.

 

Study aids

Not applicable.

Prerequisites and corequisites

Not applicable.

Basic literature

M. Meloun, J. Militký: Kompendium statistického zpracování dat, Academia 2006 (CS)
J. Holčík: Analýza a klasifikace dat, CERM 2012 (CS)
D. Haruštiaková, J. Jarkovský, S. Littnerová, L. Dušek: Vícerozměrné statistické metody v biologii, CERM 2012 (CS)
M. Meloun et al.: Statistická analýza vícerozměrných dat v příkladech, Karolinum 2017, ISBN 978-80-246-3618-4 (CS)

Recommended reading

M. Kovár: Maticový a tenzorový počet, VUT v Brně (CS)
A. Hyvärinen, J. Karhunen, E. Oja: Independent Component Analysis, Wiley 2001 (EN)

Classification of course in study plans

  • Programme MPC-BIO, Master's, 1st year of study, winter semester, compulsory
  • Programme MPC-BTB, Master's, 1st year of study, winter semester, compulsory-optional

Type of course unit

 

Lecture

26 hours, optional

Syllabus

1. Introduction to multivariate analysis of biological data. Objectives of multivariate analysis, advantages and disadvantages. Classification of methods.

2. Basics of linear algebra: review.

3. Multivariate statistical distributions and tests.

4. Methods of data preprocessing. Types of transformation and standardization. The problem of missing data.

5. Relationship between variables in multidimensional space. Similarity and distance metrics. Correlation, covariance.

6. Cluster analysis of biological data. Hierarchical and non-hierarchical methods. Determination of the optimal number of clusters. Validation of clustering results.

7. Ordination analyses. Overview of methods used in biomedicine.

8. Principal component analysis. Principle of singular value decomposition.

9. Factor analysis. The principle of factor analysis. Factor rotation.

10. Nonlinear methods of data dimensionality reduction. The t-SNE method.

11. Nonlinear methods of data dimensionality reduction. UMAP method.

12. Examples of the use of multivariate analysis of biological data.
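As an illustration of topic 8, the link between principal component analysis and singular value decomposition can be sketched in Python. This is a minimal example assuming NumPy is available; the function name `pca_svd` and the random test data are hypothetical and not part of the course materials.

```python
import numpy as np


def pca_svd(X, n_components=2):
    """PCA of a samples-by-variables matrix via SVD (illustrative sketch)."""
    X = np.asarray(X, dtype=float)

    # Center each variable (column) so the SVD describes variance around the mean.
    Xc = X - X.mean(axis=0)

    # Thin SVD: Xc = U @ diag(s) @ Vt; the rows of Vt are the principal axes.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

    components = Vt[:n_components]

    # Coordinates of the samples in the space of the leading components.
    scores = Xc @ components.T

    # Squared singular values divided by (n - 1) are the eigenvalues
    # of the sample covariance matrix.
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance


# Hypothetical data: 100 samples, 5 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
scores, components, explained_variance = pca_svd(X, n_components=2)
```

The explained variances obtained this way agree with the largest eigenvalues of the sample covariance matrix, so the SVD route avoids forming that matrix explicitly.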

 

Exercise in computer lab

26 hours, compulsory

Syllabus

1. Introduction to Python

2. Exploratory data analysis I: visualization, statistical descriptive analysis

3. Exploratory data analysis II: data processing, correlation analysis

4. Relationships in multidimensional space I

5. Relationships in multidimensional space II

6. Ordination analysis I: PCA

7. Ordination analysis II: Kernel PCA

8. Cluster analysis I: k-means, UPGMA

9. Cluster analysis II: cluster quality assessment

10. Multivariate data visualization I: t-SNE

11. Multivariate data visualization II: UMAP

 
