Identification of Capture Phases in Nanopore Protein Sequencing Data Using a Deep Learning Model
Martin, Annabelle, Kontogiorgos-Heintz, Daphne, Nivala, Jeff
–arXiv.org Artificial Intelligence
Nanopore protein sequencing produces long, noisy ionic current traces in which key molecular phases, such as protein capture and translocation, are embedded. Capture phases mark the successful entry of a protein into the pore and serve as both a checkpoint and a signal that a channel merits further analysis. However, manual identification of capture phases is time-intensive, often requiring several days for expert reviewers to annotate the data due to the need for domain-specific interpretation of complex signal patterns. To address this, a lightweight one-dimensional convolutional neural network (1D CNN) was developed and trained to detect capture phases in down-sampled signal windows. Evaluated against CNN-LSTM (Long Short-Term Memory) hybrids, histogram-based classifiers, and other CNN variants using run-level data splits, our best model, CaptureNet-Deep, achieved an F1 score of 0.94 and precision of 93.39% on held-out test data. The model supports low-latency inference and is integrated into a dashboard for Oxford Nanopore experiments, reducing the total analysis time from several days to under thirty minutes. These results show that efficient, real-time capture detection is possible using simple, interpretable architectures and suggest a broader role for lightweight ML models in sequencing workflows.
arXiv.org Artificial Intelligence
Nov-4-2025