Bioinformatician helps archaeogeneticists search for primordial organisms
Miloš Musil from the Department of Information Systems at FIT created a new web application that helps scientists with ancestral reconstruction. He developed the unique tool in collaboration with scientists from the International Clinical Research Centre of St. Anne's University Hospital and researchers from the Loschmidt Laboratories at Masaryk University. There, they use the application to study molecular evolution and search for ancient proteins that no longer exist today. These can help, for example, in pharmacology, medicine or biotechnology.
Archaeogenetics. That is what ancestral reconstruction is sometimes called: a technique through which scientists investigate traces of the past, much like archaeologists. However, biologists do not look for them at excavation sites, but in computers. They examine gene sequences and look for organisms that no longer exist today. The new FireProt-ASR tool, developed by Miloš Musil from the Faculty of Information Technology of BUT, will substantially help researchers find proteins millions of years old, from which the current ones have evolved.
"Finding out what evolution looked like is important not only from a scientific point of view. It is also of great importance in industry today. The further back we follow the evolutionary tree, the closer we get to primordial organisms. And these, it seems, are often much more stable than the current ones," explains Miloš Musil.
There are several theories as to why this is so. One of them is that in order to survive the inhospitable conditions that prevailed on Earth millions of years ago, organisms simply had to be more resilient. Another theory suggests that proteins had to be more stable in the past in order to survive the large number of mutations to which they were exposed during evolution. "Either way, it really seems that most organisms that currently exist have evolved to function in relatively mild conditions. This is enough for them to survive in nature, but if we want to use them in industry, for example, they need to withstand higher temperatures or perhaps an unfavourable pH level. And in that regard, ancestral proteins are truly more resistant," he adds.
Such findings are then used, for example, in pharmacology, medicine or biotechnology. "Washing powder is a typical example of this. It uses active enzymes to help remove dirt. But the proteins that occur in nature today cannot withstand the temperatures normally used to wash clothes. It is therefore necessary to increase their resilience, and this is precisely what ancestral reconstruction is meant to do. It will lead us to a common ancestor, which is likely to be more stable," describes Miloš Musil.
The web application he has developed will make this much easier for scientists. "In the past, a great deal of expertise was needed in order to put it all together. First, it was necessary to know the biological system, i.e. to study the protein in depth and know the family from which the protein comes and how it works. And then it was also necessary to be able to use bioinformatics tools that would help align sequences or build a phylogenetic tree, which is crucial for ancestral reconstruction, because it shows which organisms, or proteins, evolved from which," says the Ph.D. student from FIT.
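To make the idea of reconstructing an ancestor from a phylogenetic tree more concrete, here is a minimal toy sketch in Python. It is not the method used by FireProt-ASR, which the article does not describe. It assumes a simple parsimony rule (the bottom-up pass of Fitch's algorithm), a hand-written tree, and four made-up, already aligned sequences. All names in it (`fitch_states`, `reconstruct_root`, the leaf labels A to D) are hypothetical and chosen only for illustration.

```python
# Toy illustration only: Fitch-style parsimony on a fixed, hypothetical tree.
# This is NOT how FireProt-ASR works internally; it only shows the principle
# of inferring an ancestral sequence from aligned descendants and a tree.

def fitch_states(node, column):
    """Bottom-up pass for one alignment column.

    `node` is either a leaf name (str) or a pair (left, right) of subtrees.
    `column` maps each leaf name to the residue it has at this position.
    Returns the set of residues that are most parsimonious at `node`.
    """
    if isinstance(node, str):
        return {column[node]}
    left = fitch_states(node[0], column)
    right = fitch_states(node[1], column)
    common = left & right
    # Keep the shared residues if any exist, otherwise keep all candidates.
    return common if common else left | right


def reconstruct_root(tree, aligned):
    """Return one most-parsimonious sequence for the root of `tree`.

    `aligned` maps leaf names to aligned sequences of equal length.
    Ties are broken alphabetically so the result is deterministic.
    """
    length = len(next(iter(aligned.values())))
    root = []
    for i in range(length):
        column = {name: seq[i] for name, seq in aligned.items()}
        root.append(min(fitch_states(tree, column)))
    return "".join(root)


# Made-up example data: four present-day sequences and their assumed tree.
tree = (("A", "B"), ("C", "D"))
aligned = {"A": "MKVL", "B": "MKIL", "C": "MRVL", "D": "MKVL"}
print(reconstruct_root(tree, aligned))
```

In this sketch the alignment and the tree are given by hand. In practice those are exactly the steps that, according to the quote above, used to require expert knowledge and separate bioinformatics tools.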
His program is the only one in the world that needs just a single protein sequence as a starting point for the calculation. The rest can be handled by FireProt-ASR, as the tool made by the Brno scientists is called. It can save hours or days of work for experienced researchers, or even months of work for those just starting out with the system. "The program is fully automated, so it is also suitable for beginners. Users can also use their own data and start the calculation from different parts of the computing environment," Miloš Musil says about the advantages of the tool. The tool is currently used by institutions all over the world. "It has already analysed nearly 1,300 proteins. It is freely available at the Loschmidt Laboratories website and is accessible for any use. About two-thirds of its user base are academic users, the rest are businesses," he adds.
By default, the program works with approximately 150 sequences, so the calculation of a phylogenetic tree often takes several hours to complete. Although the theoretical foundations of ancestral reconstruction are over fifty years old, the true potential of this method was not realised until the last decade, which saw the advent of powerful computers. The further development of this technique will probably also depend on computing power.
"We also tried a 620-sequence reconstruction, but the calculation took about two weeks on a fairly powerful computer. And we are still only talking about proteins, which are the product of just a small piece of genetic information. In terms of how the method itself works, however, there is theoretically nothing to prevent us from reconstructing the whole DNA, even the DNA of the entire animal kingdom. This would require a huge amount of computing power, which we do not have at our disposal yet," explains Miloš Musil.
Many people may wonder how far this method can go. "It is similar to all other areas of science - it can be used both for good and evil. You can split an atom in a reactor as well as in a nuclear warhead," he says. So could Jurassic Park become real? "It could. Although I imagine it would be easier and, above all, more accurate to sequence something from a piece of amber, like in the film. It is also possible that this could awaken prehistoric bacteria or viruses which our immune system would not be prepared to handle. But if I wanted to bring about the end of the world, I can imagine an easier way than analysing the teeth of a dinosaur," he says with a smile.
Miloš Musil has been working on ancestral reconstruction and protein stability in the Loschmidt Laboratories for six years. He was particularly drawn there by his interest in the natural sciences. "Bioinformatics is at the intersection of two fields, and I find that fascinating. I am glad that I can use IT as a research tool in this area as well," he says. FireProt-ASR, which he has been developing for the last three years, is the central topic of his doctoral thesis, which he will soon defend at the Faculty of Information Technology.
He plans to stay in the laboratories in the future. He and his colleagues want to add further tools to the program to give experienced users more options for data processing. They also plan to improve other members of the FireProt family: a website used to design stable multiple-point mutants and a database collecting protein stability data. In the future, the scientists would like to use this data for machine learning and create a new, more sophisticated system that would help them select mutations for even more stable proteins.