Project detail

Contemporary Methods for Processing, Analysis, and Visualization of Multimedia and 3D Data

Duration: 1.3.2023 — 28.2.2026

Funding sources

Brno University of Technology - Internal BUT projects

On the project

Multimedia and 3D data are important and necessary for a growing number of applications of modern computer systems, in which their use is irreplaceable. At the same time, processing such data is known to be difficult and computationally demanding, and the same holds for their visualization and analysis. Research in this area is therefore both among the more difficult and among the more important. The project follows up on the earlier project "Modern Methods for Processing, Analysis, and Visualization of Multimedia and 3D Data".

Project code

FIT-S-23-8278

Default language

Czech

People responsible

Units

Department of Computer Graphics and Multimedia
- internal (1.1.2023 - 31.12.2025)
Faculty of Information Technology
- beneficiary (1.1.2023 - 31.12.2025)

Results

ČADÍK, M.: HiVisComp 2025, High Visual Computing 2025. Krkonoše, Tetřeví Boudy, https://www.tetreviboudy.com/ (29.01.2025)

BURDISSO, S.; SÁNCHEZ-CORTÉS, D.; VILLATORO-TELLO, E.; MOTLÍČEK, P. Reliability Estimation of News Media Sources: Birds of a Feather Flock Together. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2024). Mexico City, Mexico: Association for Computational Linguistics (ACL), 2024. p. 6893-6911. ISBN: 979-8-89176-114-8.

ZEINALI, H.; LEE, K.; ALAM, J.; BURGET, L. Text-dependent Speaker Verification Challenge 2024: Exploring Shared and User-defined Passphrases. In 2025 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP). Proceedings of the ... IEEE International Conference on Acoustics, Speech, and Signal Processing. Hyderabad, India: Institute of Electrical and Electronics Engineers Inc., 2025. p. 1-5.

CHLUBNA, T.: Light Field Video Streaming on GPU. URL: https://github.com/ichlubna/lfStreaming. (Software)

BHATTACHARJEE, M.; MOTLÍČEK, P.; MADIKERI, S.; HELMKE, H.; OHNEISER, O.; KLEINERT, M.; EHR, H. Minimum effort adaptation of automatic speech recognition system in air traffic management. European Journal of Transport and Infrastructure Research, 2024, vol. 24, iss. 4, p. 133-153.

MA, X.; ZHANG, R.; WEI, J.; LU, X.; XU, J.; ZHANG, L.; LU, W. Self-distillation-based domain exploration for source speaker verification under spoofed speech from unknown voice conversion. Speech communication, 2025, vol. 167, iss. 103153, p. 1-12.

CHEN, X.; LU, W.; ZHANG, R.; XU, J.; LU, X.; ZHANG, L.; WEI, J. Continual Unsupervised Domain Adaptation for Audio Deepfake Detection. In Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing. Hyderabad, India: Institute of Electrical and Electronics Engineers Inc., 2025. p. 1-5. ISBN: 979-8-3503-6874-1.

CHLUBNA, T.: Looking Glass calibration fetching tool. URL: https://github.com/ichlubna/getLKGCalibration. (Software)

CHLUBNA, T.: OpenGL Injector. URL: https://github.com/ichlubna/OpenGLInjector. (Software)

CAROFILIS, A.; RANGAPPA, P.; MADIKERI, S.; KUMAR, S.; BURDISSO, S.; PRAKASH, J.; VILLATORO-TELLO, E.; MOTLÍČEK, P.; SHARMA, B.; HACIOGLU, K.; VENKATESAN, S.; VYAS, S.; STOLCKE, A. Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering. In Interspeech. Interspeech. Rotterdam, The Netherlands: Isca-Int Speech Communication Assoc, 2025. p. 3618-3622.

YAN, B.; HAMED, I.; SHIMIZU, S.; LODAGALA, V.; CHEN, W.; IAKOVENKO, O.; TALAFHA, B.; HUSSEIN, A.; POLOK, A.; CHANG, K.; KLEMENT, D.; ALTHUBAITI, S.; PENG, P.; WIESNER, M.; SOLORIO, T.; ALI, A.; KHUDANPUR, S.; WATANABE, S. CS-FLEURS: A Massively Multilingual and Code-Switched Speech Dataset. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Interspeech. Rotterdam, The Netherlands: ISCA, 2025. p. 743-747.

GRAUMAN, K.; WESTBURY, A.; BYRNE, E.; CARTILLIER, V.; CHAVIS, Z.; FURNARI, A.; GIRDHAR, R.; HAMBURGER, J.; JIANG, H.; KUKREJA, D.; LIU, M.; LIU, X.; MARTIN, M.; NAGARAJAN, T.; RADOSAVOVIC, I.; RAMAKRISHNAN, S.; RYAN, F.; SHARMA, J.; WRAY, M.; XU, M.; XU, E.; ZHAO, C.; BANSAL, S.; BATRA, D.; CRANE, S.; DO, T.; DOULATY, M.; ERAPALLI, A.; FEICHTENHOFER, C.; FRAGOMENI, A.; FU, Q.; GEBRESELASIE, A.; GONZALEZ, C.; HILLIS, J.; HUANG, X.; HUANG, Y.; JIA, W.; KHOO, W.; KOLAR, J.; KOTTUR, S.; KUMAR, A.; LANDINI, F.; LI, C.; LI, Y.; LI, Z.; MANGALAM, K.; MODHUGU, R.; MUNRO, J.; MURRELL, T.; NISHIYASU, T.; PRICE, W.; RUIZ PUENTES, P.; RAMAZANOVA, M.; SARI, L.; SOMASUNDARAM, K.; SOUTHERLAND, A.; SUGANO, Y.; TAO, R.; VO, M.; WANG, Y.; WU, X.; YAGI, T.; ZHAO, Z.; ZHU, Y.; ARBELAEZ, P.; CRANDALL, D.; DAMEN, D.; FARINELLA, G.; FUEGEN, C.; GHANEM, B.; KRISHNA, V.; JAWAHAR, C.; JOO, H.; KITANI, K.; LI, H.; NEWCOMBE, R.; OLIVA, A.; PARK, H.; REHG, J.; SATO, Y.; SHI, J.; ZHENG SHOU, M.; TORRALBA, A.; TORRESANI, L.; YAN, M.; MALIK, J. Ego4D: Around the World in 3,600 Hours of Egocentric Video. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, vol. 47, iss. 11, p. 9468-9509.

SANCHEZ-CORTES, D.; BURDISSO, S.; VILLATORO-TELLO, E.; MOTLÍČEK, P. Mapping the Media Landscape: Predicting Factual Reporting and Political Bias Through Web Interactions. In Lecture Notes in Computer Science. Lecture Notes in Computer Science. Cham: Springer Nature, 2024. p. 127-138. ISBN: 978-3-031-71735-2.

TIAN, J.; SHI, J.; CHEN, W.; ARORA, S.; MASUYAMA, Y.; MAEKAKU, T.; WU, Y.; PENG, J.; BHARADWAJ, S.; ZHAO, Y.; CORNELL, S.; PENG, Y.; YUE, X.; YANG, C.; NEUBIG, G.; WATANABE, S. ESPnet-SpeechLM: An Open Speech Language Model Toolkit. In Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies: Long Papers, NAACL-HLT 2025. Hybrid, Albuquerque, New Mexico, USA: Association for Computational Linguistics (ACL), 2025. p. 116-124. ISBN: 9798891761919.

CHEN, X.; LIN, I.; ZHANG, L.; DU, J.; WU, H.; LEE, H.; JANG, J. Codec-Based Deepfake Source Tracing via Neural Audio Codec Taxonomy. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. Interspeech. Rotterdam, The Netherlands: International Speech Communication Association, 2025. p. 1538-1542.

ZHANG, R.; WEI, J.; LU, X.; ZHANG, L.; JIN, D.; XU, J.; LU, W. SHDA: Sinkhorn Domain Attention for Cross-Domain Audio Anti-Spoofing. IEEE Transactions on Information Forensics and Security, 2025, iss. 20, p. 6474-6489.

CHLUBNA, T.: Generator of focus maps from holographic data. URL: https://github.com/ichlubna/quiltFocus. (Software)

CHLUBNA, T.: vkCompViz: Universal C++ Library for GPU-Based Experiments. URL: https://github.com/ichlubna/vkCompViz. (Software)

CHLUBNA, T.: lfFocusMaps: Lightweight All-Focused Light Field Rendering. URL: https://github.com/ichlubna/lfFocusMaps. (Software)

CHLUBNA, T.: Converter of the quilt image into the native Looking Glass format. URL: https://github.com/ichlubna/quiltToNative. (Software)

KUBÍK, T.; KODYM, O.; ŠILLING, P.; TRÁVNÍČKOVÁ, K.; MOJŽIŠ, T.; MATULA, J. Leveraging Point Transformers for Detecting Anatomical Landmarks in Digital Dentistry. In Lecture Notes in Computer Science. Lecture Notes in Computer Science. Springer Science and Business Media Deutschland GmbH, 2025. iss. 15571 LNCS, p. 216-228. ISBN: 9783031889769.

CHLUBNA, T.; MILET, T.; ZEMČÍK, P. Light Field Video Streaming on GPU. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2025, vol. 2025, iss. 138, p. 0-0.

ŠPAŇHEL, J.; ZEMČÍK, P.; BERAN, V.; HEROUT, A.: CloudTrafficAnalyser - Software for processing traffic data from multiple sources. URL: https://github.com/BUT-GRAPH-at-FIT/SASTRASAT-CloudTrafficAnalyser. (Software)

CHLUBNA, T. vkCompViz: Universal C++ Library for GPU-Based Experiments. Journal of open source software, 2026, vol. 11, iss. 117, p. 1-5.

CHLUBNA, T.; VLNAS, M.; MILET, T.; ZEMČÍK, P. Survey of FOSS 3D/2D Graphics Software Blender Usage in Science, Academia, and Industry. The visual computer, 2025, vol. 42, iss. 1, p. 1-32.

SEBUYOYA, R.; SEVCIKOVA, S.; YUSUF, B.; BARTOSIK, M. Integrating isothermal amplification techniques and LNA-based AI-assisted electrochemical bioassay for analysis of KRAS G12V point mutation. TALANTA, 2025, vol. 127709, iss. 288, p. 1-10.

ČADÍK, M.: HiVisComp 2024, High Visual Computing 2024. Resort Dlouhé Stráně, Rejhotice 72, 788 11 Loučná nad Desnou, Czech Republic (24.01.2024)

ČADÍK, M.: HiVisComp 2023, High Visual Computing 2023. Hotel Boboty, Vrátna 515, 013 06 Terchová, Slovakia (01.02.2023)

BAŘINA, D.: Minimalist JPEG decoder & encoder. URL: http://www.fit.vutbr.cz/research/prod/?id=814. (Software)

BAŘINA, D.: FMM: Fast Matrix Multiplication. URL: http://www.fit.vutbr.cz/research/prod/?id=871. (Software)

ŠPAŇHEL, J.; BERAN, V.; HEROUT, A.; ZEMČÍK, P.: Data processing methods for traffic purposes. URL: https://git.fit.vutbr.cz/SaSTraSAT/TrafficAnalyser. (Software)

BAŘINA, D.: x3: Experimental Data Compressor. URL: http://www.fit.vutbr.cz/research/prod/?id=827. (Software)

BAŘINA, D.: Convergence verification of the Collatz problem. URL: http://www.fit.vutbr.cz/research/prod/?id=828. (Software)

KLÍMA, O.; NEUBAUER, J.; POLCEROVÁ, L.; KRÁLÍK, M.; ZEMAN, T.: KSPredict: Software for predicting the development of crisis situations and emergencies. URL: https://github.com/ondrej-klima/shinyfireweather. (Software)

KIŠŠ, M.; HRADIŠ, M.; BENEŠ, K.; BUCHAL, P.; KULA, M. SoftCTC-semi-supervised learning for text recognition using soft pseudo-labels. International Journal on Document Analysis and Recognition, 2023, vol. 2024, iss. 27, p. 177-193. ISSN: 1433-2825.

NOVÁK, J.; CHUDÝ, P.; HANÁK, J. Model Predictive Control Driven Aerial Grasping with Soft Operational Constraints. In ICAS Proceedings. ICAS proceedings. Florence: International Council of the Aeronautical Sciences, 2024. iss. 10, p. 1-15. ISSN: 2958-4647.

PRASAD, A.; MADIKERI, S.; KHALIL, D.; MOTLÍČEK, P.; SCHUEPBACH, C. Speech and Language Recognition with Low-rank Adaptation of Pretrained Models. In Proceedings of Interspeech. Proceedings of Interspeech. Kos Island: International Speech Communication Association, 2024. iss. 9, p. 2825-2829. ISSN: 1990-9772.

PRASAD, A.; CAROFILIS, A.; VANDERREYDT, G.; KHALIL, D.; MADIKERI, S.; MOTLÍČEK, P.; SCHUEPBACH, C. Fine-Tuning Self-Supervised Models for Language Identification Using Orthonormal Constraint. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 11921-11925. ISBN: 979-8-3503-4485-1.

BHATTACHARJEE, M.; NIGMATULINA, I.; PRASAD, A.; RANGAPPA, P.; MADIKERI, S.; MOTLÍČEK, P.; HELMKE, H.; KLEINERT, M. Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 12652-12656. ISBN: 979-8-3503-4485-1.

CHLUBNA, T.; MILET, T.; ZEMČÍK, P.; KULA, M. Real-Time Light Field Video Focusing and GPU Accelerated Streaming. Journal of Signal Processing Systems for Signal Image and Video Technology, 2023, vol. 95, iss. 6, p. 703-719. ISSN: 1939-8115.

ZULUAGA-GOMEZ, J.; VESELÝ, K.; SZŐKE, I.; BLATT, A.; MOTLÍČEK, P.; KOCOUR, M.; RIGAULT, M.; CHOUKRI, K.; PRASAD, A.; SARFJOO, S.; NIGMATULINA, I.; CEVENINI, C.; KOLČÁREK, P.; TART, A.; ČERNOCKÝ, J.; KLAKOW, D. ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications. Journal of Machine Learning Research, vol. 2, iss. 1, p. 1-45. ISSN: 1533-7928.

HANÁK, J.; NOVÁK, J.; CHUDÝ, P.; BEN-ASHER, J. Cross-Entropy Method for Laser Defense Applications. Journal of Aerospace Information Systems, 2025, vol. 22, iss. 1, p. 53-58. ISSN: 2327-3097.

VILLATORO-TELLO, E.; MADIKERI, S.; ZULUAGA-GOMEZ, J.; SHARMA, B.; SARFJOO, S.; NIGMATULINA, I.; MOTLÍČEK, P.; IVANOV, V.; GANAPATHIRAJU, A. Effectiveness of Text, Acoustic, and Lattice-Based Representations in Spoken Language Understanding Tasks. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7.

OMACHTOVÁ, A.; HEROUT, A.; BAMBUŠEK, D.; JUŘÍK, V. How to shoot yourself right with a smartphone? Virtual reality, 2023, vol. 2023, iss. 1, p. 1-13. ISSN: 1434-9957.

REPKA, S.; REICH, B.; ZOLOTAREV, F.; EEROLA, T.; ZEMČÍK, P. Mineral segmentation using electron microscope images and spectral sampling through multimodal graph neural networks. Pattern Recognition Letters, 2025, vol. 193, iss. 193, p. 79-85.

ZULUAGA-GOMEZ, J.; PRASAD, A.; NIGMATULINA, I.; MOTLÍČEK, P.; KLEINERT, M. A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers. Aerospace, 2023, vol. 10, iss. 5, p. 1-25. ISSN: 2226-4310.

HANÁK, J.; NOVÁK, J.; CHUDÝ, P. Tactical Scenario Adaptation for Pilot Training. In AIAA/IEEE Digital Avionics Systems Conference - Proceedings. IEEE/AIAA ... Digital Avionics Systems Conference. San Diego: Institute of Electrical and Electronics Engineers, 2024. iss. 9, p. 1-7. ISBN: 979-8-3503-4961-0. ISSN: 2155-7195.

CHLUBNA, T.; MILET, T.; ZEMČÍK, P. Lightweight All-Focused Light Field Rendering. COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, vol. 244, iss. 7, p. 7-8. ISSN: 1077-3142.

APAROVICH, M.; KESIRAJU, S.; DUFKOVÁ, A.; SMRŽ, P. FIT BUT at SemEval-2023 Task 12: Sentiment Without Borders - Multilingual Domain Adaptation for Low-Resource Sentiment Classification. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023). Toronto (online): Association for Computational Linguistics, 2023. p. 1518-1524. ISBN: 978-1-959429-99-9.

NOVÁK, J.; HANÁK, J.; CHUDÝ, P. Predictive Control Driven Tactical Maneuvering. In ICAS Proceedings. ICAS proceedings. Florence: International Council of the Aeronautical Sciences, 2024. iss. 9, p. 1-12. ISSN: 2958-4647.

BURDISSO, S.; VILLATORO-TELLO, E.; MADIKERI, S.; MOTLÍČEK, P. Node-weighted Graph Convolutional Network for Depression Detection in Transcribed Clinical Interviews. In Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. iss. 8, p. 3617-3621. ISSN: 1990-9772.

KHALIL, D.; PRASAD, A.; MOTLÍČEK, P.; ZULUAGA-GOMEZ, J.; NIGMATULINA, I.; MADIKERI, S.; SCHUEPBACH, C. An Automatic Speaker Clustering Pipeline for the Air Traffic Communication Domain. Aerospace, 2023, vol. 10, iss. 10, p. 1-14. ISSN: 2226-4310.

SKOWRON, M.; BACKFRIED, G.; NAVAS, E.; BERZINŠ, A.; VAN, J.; DE, F.; DEMARCO, A.; POLÁK, P.; KOVÁČ, M.; POLÁK, P.; ROHDIN, J.; ROSNER, M.; SANCHEZ, J.; SARATXAGA, I.; SCHWARZ, P. Deep Dive Speech Technology. In European Language Equality. Cham: Springer Nature Switzerland AG, 2023. p. 289. ISBN: 978-3-031-28819-7.

VAŠKO, M.; HEROUT, A. LossFIQA: A Shortcut Solution to Image Quality Assessment Using Loss for Faces and Beyond. IEEE Access, 2025, vol. 13, iss. 7, p. 126915-126924.

MAI, F.; ZULUAGA-GOMEZ, J.; PARCOLLET, T.; MOTLÍČEK, P. HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition. In Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. iss. 08, p. 2213-2217. ISSN: 1990-9772.

VANDERREYDT, G.; PRASAD, A.; KHALIL, D.; MADIKERI, S.; DEMUYNCK, K.; MOTLÍČEK, P. Parameter-Efficient Tuning With Adaptive Bottlenecks For Automatic Speech Recognition. Proceedings of IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Taipei: IEEE Signal Processing Society, 2023. p. 1. ISBN: 979-8-3503-0689-7.

HANÁK, J.; CHUDÝ, P.; VLK, J. Collaborative Agents for Synthetic Tactical Training. In AIAA/IEEE Digital Avionics Systems Conference - Proceedings. IEEE/AIAA ... Digital Avionics Systems Conference. Barcelona: Institute of Electrical and Electronics Engineers, 2023. iss. 10, p. 1-9. ISBN: 979-8-3503-3357-2. ISSN: 2155-7195.

NOVÁK, J.; CHUDÝ, P. Dynamic Soaring in Uncertain Wind Conditions: Polynomial Chaos Expansion Approach. In Machine Learning, Optimization, and Data Science. Lecture Notes in Computer Science. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Grasmere: Springer Nature Switzerland AG, 2024. iss. 14505, p. 104-115. ISBN: 978-3-031-53968-8. ISSN: 0302-9743.

BOITO, M.; YUSUF, B.; ONDEL YANG, L.; VILLAVICENCIO, A.; BESACIER, L. Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings. In Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages. Marseille: European Language Resources Association, 2022. p. 1-9. ISBN: 979-10-95546-91-7.

ESPUNA, A.; PRASAD, A.; MOTLÍČEK, P.; MADIKERI, S.; SCHUEPBACH, C. Normalising Flows for Speaker and Language Recognition Backend. Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop. Quebec: International Speech Communication Association, 2024. p. 74.

NIGMATULINA, I.; MADIKERI, S.; VILLATORO-TELLO, E.; MOTLÍČEK, P.; ZULUAGA-GOMEZ, J.; PANDIA, K.; GANAPATHIRAJU, A. Implementing contextual biasing in GPU decoder for online ASR. In Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. iss. 8, p. 4494-4498. ISSN: 1990-9772.

POLÁŠEK, T.; ČADÍK, M. Predicting Photovoltaic Power Production using High-Uncertainty Weather Forecasts. Applied Energy, 2023, vol. 2023, iss. 339, p. 120989-121004. ISSN: 0306-2619.

BENEŠ, K.; KOCOUR, M.; BURGET, L. Hystoc: Obtaining Word Confidences for Fusion of End-To-End ASR Systems. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 11276-11280. ISBN: 979-8-3503-4485-1.

BOBÁK, P.; ČMOLÍK, L.; ČADÍK, M. Reinforced Labels: Multi-Agent Deep Reinforcement Learning for Point-Feature Label Placement. IEEE transactions on visualization and computer graphics, 2024, vol. 30, iss. 9, p. 5908-5922. ISSN: 1077-2626.

KUBÍK, T.; ŠPANĚL, M. LMVSegRNN and Poseidon3D: Addressing Challenging Teeth Segmentation Cases in 3D Dental Surface Orthodontic Scans. Bioengineering-Basel, 2024, vol. 11, iss. 10, p. 1-18. ISSN: 2306-5354.

ŠILLING, P.; ŠPANĚL, M. DEMIS: Electron Microscopy Image Stitching using Deep Learning Features and Global Optimisation. Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies - BIOIMAGING. Porto: Institute for Systems and Technologies of Information, Control and Communication, 2025. p. 255. ISBN: 978-989-758-731-3.

KUMAR, S.; MADIKERI, S.; NIGMATULINA, I.; VILLATORO-TELLO, E.; MOTLÍČEK, P.; PANDIA, K.; DUBAGUNTA, P.; GANAPATHIRAJU, A. Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul: IEEE Signal Processing Society, 2024. p. 12592. ISBN: 979-8-3503-4485-1.

BURDISSO, S.; REYES-RAMÍREZ, E.; VILLATORO-TELLO, E.; SÁNCHEZ-VEGA, F.; LOPEZ MONROY, A.; MOTLÍČEK, P. DAIC-WOZ: On the Validity of Using the Therapist's prompts in Automatic Depression Detection from Clinical Interviews. In Proceedings of the 6th Clinical Natural Language Processing Workshop, ClinicalNLP@NAACL. Mexico City: Association for Computational Linguistics, 2024. p. 82-90. ISBN: 979-8-89176-109-4.

MOTLÍČEK, P.; PRASAD, A.; NIGMATULINA, I.; HELMKE, H.; OHNEISER, O.; KLEINERT, M. Automatic Speech Analysis Framework for ATC Communication in HAAWAII. In SESAR Innovation Days. SESAR Innovation Days. Seville: SESAR Joint Undertaking, 2023. iss. 11, p. 1-9. ISSN: 0770-1268.

RANGAPPA, P.; MUSCAT, A.; SANCHEZ LARA, A.; MOTLÍČEK, P.; ANTONOPOULOU, M.; FOURFOURIS, I.; SKARLATOS, A.; AVGERINOS, N.; TSANGARIS, M.; KOSTKA, K. Detecting Criminal Networks via Non-Content Communication Data Analysis Techniques from the TRACY Project. In Proceedings of the 15th EAI International Conference on Digital Forensics & Cyber Crime (EAI-ICDF2C24). Lecture notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. Dubrovnik: Springer International Publishing AG, Gewerbestrasse 11, Cham, CH-6330, Switzerland, 2024. iss. 11, p. 340-353.

MACIEJEWSKI, M.; KLEMENT, D.; HUANG, R.; WIESNER, M.; KHUDANPUR, S. Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. iss. 9, p. 2155-2160. ISSN: 1990-9772.

POLÁŠEK, T.; ČADÍK, M.; KELLER, Y.; BENEŠ, B. Vision UFormer: Long-Range Monocular Absolute Depth Estimation. COMPUTERS & GRAPHICS-UK, 2023, vol. 111, iss. 4, p. 180-189. ISSN: 0097-8493.

BAMBUŠEK, D.; MATERNA, Z.; KAPINUS, M.; BERAN, V.; SMRŽ, P. How Do I Get There? Overcoming Reachability Limitations of Constrained Industrial Environments in Augmented Reality Applications. In 2023 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). Shanghai: Institute of Electrical and Electronics Engineers, 2023. p. 115-122. ISBN: 979-8-3503-4815-6.

BAŘINA, D. Experimental lossless data compressor. Microprocessors and Microsystems, 2023, vol. 98, iss. 4, p. 104803-104803. ISSN: 0141-9331.

KUBÍK, T.; GUIBAULT, F.; ŠPANĚL, M.; LOMBAERT, H. ToothForge: Automatic Dental Shape Generation using Synchronized Spectral Embeddings. In Proceedings of Information Processing in Medical Imaging 2025. Lecture Notes in Computer Science. Kos: Springer Science and Business Media Deutschland GmbH, 2025. p. 313-326. ISBN: 9783031966248.

BHATTACHARJEE, M.; MOTLÍČEK, P.; NIGMATULINA, I.; HELMKE, H.; OHNEISER, O.; KLEINERT, M.; EHR, H. Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training. 13th SESAR Innovation Days 2023, SIDS 2023. SESAR Innovation Days. Seville: SESAR Joint Undertaking, 2023. iss. 11, p. 1. ISSN: 0770-1268.

ASHIHARA, T.; MORIYA, T.; HORIGUCHI, S.; PENG, J.; OCHIAI, T.; DELCROIX, M.; MATSUURA, K.; SATO, H. Investigation of Speaker Representation for Target-Speaker Speech Processing. Proc. 2024 IEEE Spoken Language Technology Workshop (SLT). Macao: IEEE Signal Processing Society, p. 423. ISBN: 979-8-3503-9225-8.

BAŘINA, D. Improved verification limit for the convergence of the Collatz conjecture. Journal of supercomputing, 2025, vol. 81, iss. 1, p. 1-14. ISSN: 1573-0484.

GAVRIELIDES, A.; SOPHOCLEOUS, M.; AGAPIOU, G.; LESSI, C.; ŠPAŇHEL, J.; LENDINEZ, A.; QIU, R.; LI, D. Implementing Network Applications for 5G-Enabled Robots Through the 5G-ERA Platform. In IFIP Advances in Information and Communication Technology. IFIP Advances in Information and Communication Technology. Artificial Intelligence Applications and Innovations. Cham: Springer Nature Switzerland AG, 2023. iss. 2023, p. 55-65. ISBN: 978-3-031-34170-0. ISSN: 1868-422X.

KIŠŠ, M.; HRADIŠ, M. Self-supervised Pre-training of Text Recognizers. In Barney Smith, E.H., Liwicki, M., Peng, L. (eds) Document Analysis and Recognition - ICDAR 2024. Lecture Notes in Computer Science. Athens: Springer Nature Switzerland AG, 2024. p. 218-235. ISBN: 978-3-031-70545-8.

CHLUBNA, T.; MILET, T.; ZEMČÍK, P. How Capturing Camera Trajectory Distortion Affects User Experience on Looking Glass 3D Display. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, vol. 2024, iss. 83, p. 20265-20287. ISSN: 1573-7721.

YUSUF, B.; BASKAR, M.; ROSENBERG, A.; RAMABHADRAN, B. Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. iss. 9, p. 792-796. ISSN: 1990-9772.

NOVÁK, J.; HANÁK, J.; CHUDÝ, P. Hybrid Modeling Approach for Optimization Based Control of Multirotor Unmanned Aerial Vehicles. In ICAS Proceedings. ICAS proceedings. Florence: International Council of the Aeronautical Sciences, 2024. iss. 10, p. 1-10. ISSN: 2958-4647.

NOVÁK, J.; HANÁK, J.; CHUDÝ, P. Reliability-Based Control System Optimization in Uncertain Conditions. In AIAA Aviation Forum and ASCEND, 2024. Las Vegas: American Institute of Aeronautics and Astronautics, 2024. p. 1-15. ISBN: 978-1-62410-716-0.

NOVÁK, J.; CHUDÝ, P. Surrogate Modeling of Optimal Control Based Collision Avoidance System for Multirotor Unmanned Aerial Vehicles. In AIAA/IEEE Digital Avionics Systems Conference - Proceedings. IEEE/AIAA ... Digital Avionics Systems Conference. Barcelona: Institute of Electrical and Electronics Engineers, 2023. iss. 10, p. 1-7. ISBN: 979-8-3503-3357-2. ISSN: 2155-7195.

REPKA, S.; EEROLA, T.; MOTL, D.; VÝRAVSKÝ, J.; ZEMČÍK, P. Unsupervised Mineral Segmentation with Graph Neural Networks and Multi-modal SEM Data. In Lecture Notes in Computer Science. Lecture Notes in Computer Science. Cham: Springer Nature, 2026. p. 25-36. ISBN: 978-3-032-05059-5.

LI, S.; WANG, S.; HAN, J.; ZHANG, K.; WANG, W.; LI, H. REAL-T: Real Conversational Mixtures for Target Speaker Extraction. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. Interspeech. Rotterdam, The Netherlands: International Speech Communication Association, 2025. p. 1923-1927.

CHLUBNA, T.: Depth-of-field from depth map. URL: https://github.com/ichlubna/DoFFromDepthMap. (Software)

ZHANG, R.; WEI, J.; LU, X.; ZHANG, L.; JIN, D.; LU, W.; XU, J. Multi-Sinkhorn Teacher Knowledge Aggregation Framework for Adaptive Audio Anti-Spoofing. IEEE Transactions on Audio, Speech, and Language Processing, 2025, iss. 33, p. 3850-3865.

LUONG, H.; LI, H.; ZHANG, L.; LEE, K.; CHNG, E. LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation. In Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing. Hyderabad, India: Institute of Electrical and Electronics Engineers Inc., 2025. p. 1-5. ISBN: 979-8-3503-6874-1.

BAŘINA, D.: linalg: linear algebra. URL: https://github.com/xbarin02/linalg. (Software)

VISHWANATH, U.; BHATTACHARJEE, T.; DEEKSHITHA, G.; UDUPA, S.; THIRUMALA, K.; KEERTHIPRIYA, M.; CHIKKTIMMEGOWDA, D.; BASKAR, D.; BELUR, Y.; VENGALIL, S.; NALINI, A.; GHOSH, P. Comparison of Acoustic and Textual Features for Dysarthria Severity Classification in Amyotrophic Lateral Sclerosis. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Interspeech. Rotterdam, The Netherlands: Isca-Int Speech Communication Assoc, 2025. p. 803-807.

CHLUBNA, T.: Light field rendering with tensor cores. URL: https://github.com/ichlubna/lfInterpolator. (Software)

CHLUBNA, T.; MILET, T.; ZEMČÍK, P. Out-of-Focus Artifacts Mitigation and Autofocus Methods for 3D Displays. Visual Informatics, 2024, vol. 9, iss. 1, p. 31-42. ISSN: 2468-502X.

HELMKE, H.; KLEINERT, M.; AHRENHOLD, N.; EHR, H.; MÜHLHAUSEN, T.; PINSKA, E.; OHNEISER, O.; KLAMERT, L.; MOTLÍČEK, P.; PRASAD, A.; ZULUAGA-GOMEZ, J.; DOKIC, J. Automatic Speech Recognition and Understanding for Radar Label Maintenance Support Increases Safety and Reduces Air Traffic Controllers' Workload. Proceedings of ATM Seminar. Savannah, Georgia: EUROPEAN ORGANISATION FOR THE SAFETY OF AIR NAVIGATION, 2023. p. 1.

PEŠÁN, J.; JUŘÍK, V.; KARAFIÁT, M.; ČERNOCKÝ, J. BESST Dataset: A Multimodal Resource for Speech-based Stress Detection and Analysis. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. iss. 9, p. 1355-1359. ISSN: 1990-9772.

CHLUBNA, T.; ZEMČÍK, P.; MILET, T. Efficient Random-Access GPU Video Decoding for Light-Field Rendering. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, vol. 2024, iss. 102, p. 1-14. ISSN: 1047-3203.

YUSUF, B.; GOURAV, A.; GANDHE, A.; BULYKO, I. On-the-Fly Text Retrieval for end-to-end ASR Adaptation. In Proceedings of ICASSP 2023. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7.

CHLUBNA, T.; MILET, T.; ZEMČÍK, P. How Color Profile Affects the Visual Quality in Light Field Rendering and Novel View Synthesis. MULTIMEDIA TOOLS AND APPLICATIONS, 2025, vol. 84, iss. 14, p. 11079-11095.

Responsibility: Zemčík Pavel, prof. Dr. Ing., dr. h. c.
