Publication result detail

Direct Gene Detection in Raw Nanopore Signals Using Transformer Neural Networks

VOROCHTA, J.; VÍTKOVÁ, H.; JAKUBÍČEK, R.

Original title

Direct Gene Detection in Raw Nanopore Signals Using Transformer Neural Networks

Type

Paper in proceedings indexed in WoS or Scopus

Original abstract

Nanopore sequencing has transformed genomics by enabling real-time analysis of DNA and RNA in a compact, cost-effective device. However, conventional workflows require a separate basecalling step to convert raw electrical signals into nucleotide sequences, which can introduce errors and delay downstream analyses such as gene detection. Here, we present a novel approach that bypasses basecalling by directly analyzing raw nanopore signals using a transformer-based neural network. By adapting a model originally designed for ECG classification, we developed a system capable of detecting specific antibiotic resistance genes in Klebsiella pneumoniae samples. Raw signals were preprocessed through downsampling, z-normalization, and segmentation into 5,000-sample windows, yielding a dataset of 13,080 labeled segments. Experimental results demonstrate that our model effectively distinguishes gene-containing segments from non-target signals, achieving up to 80% accuracy in the “no target gene” category. In contrast, accuracy for other gene categories was lower, indicating that further optimization of the model is required. This direct-signal approach not only reduces the computational burden associated with basecalling but also streamlines the workflow, promising faster diagnostic turnaround times. These findings provide a significant step toward integrating advanced deep learning methods with nanopore sequencing for rapid, on-site genomic analysis and have potential applications in clinical diagnostics and epidemiological surveillance.
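The preprocessing chain named in the abstract (downsampling, z-normalization, segmentation into 5,000-sample windows) could be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function name `preprocess_signal`, the downsampling factor of 2, downsampling by plain decimation, normalization over the whole read rather than per window, and non-overlapping windows are all assumptions that the abstract does not specify.

```python
import numpy as np


def preprocess_signal(raw, downsample_factor=2, window=5000):
    """Downsample, z-normalize, and segment one raw signal (hypothetical helper).

    Assumptions not taken from the source: the decimation factor, whole-read
    normalization, and non-overlapping windows with the remainder dropped.
    """
    signal = np.asarray(raw, dtype=np.float64)

    # Downsample by keeping every k-th sample (no anti-aliasing filter here).
    signal = signal[::downsample_factor]

    # z-normalize: zero mean and unit standard deviation over the read.
    std = signal.std()
    if std == 0:
        raise ValueError("constant signal cannot be z-normalized")
    signal = (signal - signal.mean()) / std

    # Split into fixed-length windows; trailing samples that do not fill
    # a complete window are discarded.
    n_windows = signal.size // window
    return signal[: n_windows * window].reshape(n_windows, window)


if __name__ == "__main__":
    # Synthetic stand-in for a raw nanopore read (not real data).
    rng = np.random.default_rng(0)
    raw = rng.normal(100.0, 15.0, size=20000)
    segments = preprocess_signal(raw)
    print(segments.shape)  # (2, 5000)
```

Each row of the returned array is one fixed-length segment that could be passed to a classifier; how segments are labeled with gene categories is not described in the abstract and is left out here.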

Keywords

antibiotic resistance | bioinformatics | deep learning | gene detection | Nanopore sequencing | transformer neural networks

Authors

VOROCHTA, J.; VÍTKOVÁ, H.; JAKUBÍČEK, R.

Published

01.01.2025

Publisher

Brno University of Technology

ISBN

9788021463202

Book

Proceedings II of the Conference STUDENT EEICT

Periodical

Proceedings II of the Conference STUDENT EEICT

Country

Czech Republic

Pages from

13

Pages to

16

Page count

4

BibTeX

@inproceedings{BUT201495,
  author="Jevhenij {Vorochta} and Helena {Vítková} and Roman {Jakubíček}",
  title="Direct Gene Detection in Raw Nanopore Signals Using Transformer Neural Networks",
  booktitle="Proceedings II of the Conference STUDENT EEICT",
  year="2025",
  journal="Proceedings II of the Conference STUDENT EEICT",
  pages="13--16",
  publisher="Brno University of Technology",
  doi="10.13164/eeict.2025.13",
  isbn="9788021463202"
}