Goto

Collaborating Authors

 Intes, Xavier


Enhancing Fluorescence Lifetime Parameter Estimation Accuracy with Differential Transformer Based Deep Learning Model Incorporating Pixelwise Instrument Response Function

arXiv.org Artificial Intelligence

Fluorescence Lifetime Imaging (FLI) is a critical molecular imaging modality that provides unique information about the tissue microenvironment, which is invaluable for biomedical applications. FLI operates by acquiring and analyzing photon time-of-arrival histograms to extract quantitative parameters associated with temporal fluorescence decay. These histograms are influenced by the intrinsic properties of the fluorophore, instrument parameters, time-of-flight distributions associated with pixel-wise variations in the topographic and optical characteristics of the sample. Recent advancements in Deep Learning (DL) have enabled improved fluorescence lifetime parameter estimation. However, existing models are primarily designed for planar surface samples, limiting their applicability in translational scenarios involving complex surface profiles, such as \textit{in-vivo} whole-animal or imaged guided surgical applications. To address this limitation, we present MFliNet (Macroscopic FLI Network), a novel DL architecture that integrates the Instrument Response Function (IRF) as an additional input alongside experimental photon time-of-arrival histograms. Leveraging the capabilities of a Differential Transformer encoder-decoder architecture, MFliNet effectively focuses on critical input features, such as variations in photon time-of-arrival distributions. We evaluate MFliNet using rigorously designed tissue-mimicking phantoms and preclinical in-vivo cancer xenograft models. Our results demonstrate the model's robustness and suitability for complex macroscopic FLI applications, offering new opportunities for advanced biomedical imaging in diverse and challenging settings.


Unlocking Real-Time Fluorescence Lifetime Imaging: Multi-Pixel Parallelism for FPGA-Accelerated Processing

arXiv.org Artificial Intelligence

Fluorescence lifetime imaging (FLI) is a widely used technique in the biomedical field for measuring the decay times of fluorescent molecules, providing insights into metabolic states, protein interactions, and ligand-receptor bindings. However, its broader application in fast biological processes, such as dynamic activity monitoring, and clinical use, such as in guided surgery, is limited by long data acquisition times and computationally demanding data processing. While deep learning has reduced post-processing times, time-resolved data acquisition remains a bottleneck for real-time applications. To address this, we propose a method to achieve real-time FLI using an FPGA-based hardware accelerator. Specifically, we implemented a GRU-based sequence-to-sequence (Seq2Seq) model on an FPGA board compatible with time-resolved cameras. The GRU model balances accurate processing with the resource constraints of FPGAs, which have limited DSP units and BRAM. The limited memory and computational resources on the FPGA require efficient scheduling of operations and memory allocation to deploy deep learning models for low-latency applications. We address these challenges by using STOMP, a queue-based discrete-event simulator that automates and optimizes task scheduling and memory management on hardware. By integrating a GRU-based Seq2Seq model and its compressed version, called Seq2SeqLite, generated through knowledge distillation, we were able to process multiple pixels in parallel, reducing latency compared to sequential processing. We explore various levels of parallelism to achieve an optimal balance between performance and resource utilization. Our results indicate that the proposed techniques achieved a 17.7x and 52.0x speedup over manual scheduling for the Seq2Seq model and the Seq2SeqLite model, respectively.


Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging

arXiv.org Artificial Intelligence

Fluorescence lifetime imaging (FLI) is an important technique for studying cellular environments and molecular interactions, but its real-time application is limited by slow data acquisition, which requires capturing large time-resolved images and complex post-processing using iterative fitting algorithms. Deep learning (DL) models enable real-time inference, but can be computationally demanding due to complex architectures and large matrix operations. This makes DL models ill-suited for direct implementation on field-programmable gate array (FPGA)-based camera hardware. Model compression is thus crucial for practical deployment for real-time inference generation. In this work, we focus on compressing recurrent neural networks (RNNs), which are well-suited for FLI time-series data processing, to enable deployment on resource-constrained FPGA boards. We perform an empirical evaluation of various compression techniques, including weight reduction, knowledge distillation (KD), post-training quantization (PTQ), and quantization-aware training (QAT), to reduce model size and computational load while preserving inference accuracy. Our compressed RNN model, Seq2SeqLite, achieves a balance between computational efficiency and prediction accuracy, particularly at 8-bit precision. By applying KD, the model parameter size was reduced by 98\% while retaining performance, making it suitable for concurrent real-time FLI analysis on FPGA during data capture. This work represents a big step towards integrating hardware-accelerated real-time FLI analysis for fast biological processes.


Cognitive-Motor Integration in Assessing Bimanual Motor Skills

arXiv.org Artificial Intelligence

Biomedical Engineering Department, Rensselaer Polytechnic Institute, NY, USA Accurate assessment of bimanual motor skills is essential across various professions, yet, traditional methods often rely on subjective assessments or focus solely on motor actions, overlooking the integral role of cognitive processes. This study introduces a novel approach by leveraging deep neural networks (DNNs) to analyze and integrate both cognitive decision-making and motor execution. We tested this methodology by assessing laparoscopic surgery skills within the Fundamentals of Laparoscopic Surgery program, which is a prerequisite for general surgery certification. Utilizing video capture of motor actions and non-invasive functional near-infrared spectroscopy (fNIRS) for measuring neural activations, our approach precisely classifies subjects by expertise level and predicts FLS behavioral performance scores, significantly surpassing traditional single-modality assessments. In this study, we introduce a novel approach by conducting a direct statistical comparative analysis between neural activations and motor actions for assessing bimanual motor skills using DNNs. We explore the synergy of these modalities in multimodal analysis, applied to precision and cognitive-demanding tasks, particularly within the Fundamentals of Laparoscopic Surgery (FLS) program (Figure 1).


One-shot domain adaptation in video-based assessment of surgical skills

arXiv.org Artificial Intelligence

Deep Learning (DL) has achieved automatic and objective assessment of surgical skills. However, the applicability of DL models is often hampered by their substantial data requirements and confinement to specific training domains. This prevents them from transitioning to new tasks with scarce data. Therefore, domain adaptation emerges as a critical element for the practical implementation of DL in real-world scenarios. Herein, we introduce A-VBANet, a novel meta-learning model capable of delivering domain-agnostic surgical skill classification via one-shot learning. A-VBANet has been rigorously developed and tested on five diverse laparoscopic and robotic surgical simulators. Furthermore, we extend its validation to operating room (OR) videos of laparoscopic cholecystectomy. Our model successfully adapts with accuracies up to 99.5% in one-shot and 99.9% in few-shot settings for simulated tasks and 89.7% for laparoscopic cholecystectomy. This research marks the first instance of a domain-agnostic methodology for surgical skill assessment, paving the way for more precise and accessible training evaluation across diverse high-stakes environments such as real-life surgery where data is scarce.


Deep Neural Networks for the Assessment of Surgical Skills: A Systematic Review

arXiv.org Artificial Intelligence

Surgical training in medical school residency programs has followed the apprenticeship model. The learning and assessment process is inherently subjective and time-consuming. Thus, there is a need for objective methods to assess surgical skills. Here, we use the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to systematically survey the literature on the use of Deep Neural Networks for automated and objective surgical skill assessment, with a focus on kinematic data as putative markers of surgical competency. There is considerable recent interest in deep neural networks (DNN) due to the availability of powerful algorithms, multiple datasets, some of which are publicly available, as well as efficient computational hardware to train and host them. We have reviewed 530 papers, of which we selected 25 for this systematic review. Based on this review, we concluded that DNNs are powerful tools for automated, objective surgical skill assessment using both kinematic and video data. The field would benefit from large, publicly available, annotated datasets that are representative of the surgical trainee and expert demographics and multimodal data beyond kinematics and videos.