Patel, Jay
Diagnosis of diabetic retinopathy using machine learning & deep learning technique
Shah, Eric, Patel, Jay, Katheriya, Mr. Vishal, Pataliya, Parth
Fundus images are widely used to diagnose eye diseases such as diabetic retinopathy (DR), glaucoma, and age-related macular degeneration. However, manual analysis of fundus images is time-consuming and prone to error. In this report, we propose a method for fundus analysis that combines object detection with machine learning classification. We use a YOLOv8 model to perform object detection on fundus images and locate regions of interest (ROIs) such as the optic disc, optic cup, and lesions. We then apply a support vector machine (SVM) classifier to assign each ROI to a DR stage based on the presence or absence of pathological signs such as exudates, microaneurysms, and haemorrhages. Our method achieves 84% accuracy for fundus detection and can be applied to retinal fundus disease triage, especially in remote areas around the world.
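A minimal sketch of the two-stage pipeline the abstract describes, assuming the ultralytics YOLOv8 API and scikit-learn; the weight file, ROI features, and stage labels are illustrative placeholders, not the authors' released code.

```python
# Two-stage sketch: YOLOv8 ROI detection followed by an SVM stage classifier.
# Assumes `ultralytics` and `scikit-learn`; weights/paths are hypothetical.
import numpy as np
from ultralytics import YOLO
from sklearn.svm import SVC

detector = YOLO("fundus_yolov8.pt")   # hypothetical fine-tuned detection weights
stage_classifier = SVC(kernel="rbf")  # assumed to be trained separately on ROI features

def roi_features(image, box):
    """Crop the ROI and compute simple colour statistics as placeholder features."""
    x1, y1, x2, y2 = map(int, box)
    crop = image[y1:y2, x1:x2]
    return np.concatenate([crop.mean(axis=(0, 1)), crop.std(axis=(0, 1))])

def predict_dr_stage(image):
    # Stage 1: locate optic disc, optic cup, and candidate lesions.
    result = detector(image)[0]
    feats = [roi_features(image, b) for b in result.boxes.xyxy.cpu().numpy()]
    if not feats:
        return "no findings"
    # Stage 2: classify each ROI, then report the most severe predicted stage.
    stages = stage_classifier.predict(np.stack(feats))
    return max(stages)
```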
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
Singh, Shivalika, Vargus, Freddie, Dsouza, Daniel, Karlsson, Börje F., Mahendiran, Abinaya, Ko, Wei-Yin, Shandilya, Herumb, Patel, Jay, Mataciunas, Deividas, OMahony, Laura, Zhang, Mike, Hettiarachchi, Ramith, Wilson, Joseph, Machado, Marina, Moura, Luisa Souza, Krzemiński, Dominik, Fadaei, Hakimeh, Ergün, Irem, Okoh, Ifeoma, Alaagib, Aisha, Mudannayake, Oshan, Alyafeai, Zaid, Chien, Vu Minh, Ruder, Sebastian, Guthikonda, Surya, Alghamdi, Emad A., Gehrmann, Sebastian, Muennighoff, Niklas, Bartolo, Max, Kreutzer, Julia, Üstün, Ahmet, Fadaee, Marzieh, Hooker, Sara
Datasets are foundational to many breakthroughs in modern artificial intelligence. Many recent achievements in natural language processing (NLP) can be attributed to the fine-tuning of pre-trained models on a diverse set of tasks that enables a large language model (LLM) to respond to instructions. Instruction fine-tuning (IFT) requires specifically constructed and annotated datasets. However, existing datasets are almost all in the English language. In this work, our primary goal is to bridge the language gap by building a human-curated instruction-following dataset spanning 65 languages. We worked with fluent speakers of languages from around the world to collect natural instances of instructions and completions. Furthermore, we create the most extensive multilingual collection to date, comprising 513 million instances obtained through templating and translating existing datasets across 114 languages. In total, we contribute four key resources: we develop and open-source the Aya Annotation Platform, the Aya Dataset, the Aya Collection, and the Aya Evaluation Suite. The Aya initiative also serves as a valuable case study in participatory research, involving collaborators from 119 countries. We see this as a valuable framework for future research collaborations that aim to bridge gaps in resources.
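A minimal sketch of the templating step used to convert an existing labeled dataset into instruction-completion pairs, as the abstract describes for the templated portion of the collection; the template strings, languages, and example record are illustrative assumptions, not taken from the released resources.

```python
# Sketch: turning one classification record into instruction/completion pairs
# via per-language templates. Templates and the example record are invented.
TEMPLATES = {
    "en": "Classify the sentiment of the following review as positive or negative:\n{text}",
    "es": "Clasifica el sentimiento de la siguiente reseña como positivo o negativo:\n{text}",
}

def templatize(record, language):
    """Render a dataset record into an instruction-completion pair."""
    prompt = TEMPLATES[language].format(text=record["text"])
    return {"inputs": prompt, "targets": record["label"], "language": language}

record = {"text": "The battery lasts all day.", "label": "positive"}
for lang in TEMPLATES:
    print(templatize(record, lang))
```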
From Unstable Contacts to Stable Control: A Deep Learning Paradigm for HD-sEMG in Neurorobotics
Tyacke, Eion, Gupta, Kunal, Patel, Jay, Katoch, Raghav, Atashzar, S. Farokh
In the past decade, there has been significant advancement in designing wearable neural interfaces for controlling neurorobotic systems, particularly bionic limbs. These interfaces function by decoding signals captured non-invasively from the skin's surface. Portable high-density surface electromyography (HD-sEMG) modules combined with deep learning decoding have attracted interest by achieving excellent gesture prediction and myoelectric control of prosthetic systems and neurorobots. However, the small, pixel-like electrodes and unstable skin contact make HD-sEMG susceptible to individual channel (electrode) dropout. Sparse electrode-skin disconnections, rooted in issues such as low adhesion, sweating, hair blockage, and skin stretch, challenge the reliability and scalability of these modules as the perception unit of neurorobotic systems. This paper proposes a novel deep-learning model that provides resiliency for HD-sEMG modules used in the wearable interfaces of neurorobots. The proposed 3D Dilated Efficient CapsNet is trained on an augmented input space that computationally `forces' the network to see channel-dropout variations and thus learn robustness to channel dropout. In a sensor-dropout reliability study, the proposed framework maintained high performance. Results show that the performance of conventional models degrades significantly under dropout and is recovered by the proposed architecture and training paradigm.
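A minimal sketch of the channel-dropout augmentation idea described above, assuming HD-sEMG windows are stored as (channels x time) arrays; the grid size and drop counts are illustrative and do not reproduce the authors' 3D Dilated Efficient CapsNet training setup.

```python
# Sketch: simulate sparse electrode-skin disconnections by zeroing random
# channels of an HD-sEMG window during training, so the decoder learns
# robustness to channel dropout. Shapes and drop counts are illustrative.
import numpy as np

def augment_channel_dropout(window, max_dropped=8, rng=None):
    """window: float array of shape (n_channels, n_samples)."""
    rng = rng or np.random.default_rng()
    augmented = window.copy()
    n_channels = window.shape[0]
    n_dropped = rng.integers(0, max_dropped + 1)           # how many electrodes fail
    dropped = rng.choice(n_channels, size=n_dropped, replace=False)
    augmented[dropped, :] = 0.0                            # disconnected channels read zero
    return augmented

# Example: a 64-channel, 200-sample window with up to 8 simulated disconnections.
window = np.random.randn(64, 200).astype(np.float32)
robust_input = augment_channel_dropout(window)
```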
Pressmatch: Automated journalist recommendation for media coverage with Nearest Neighbor search
Parekh, Soumya, Patel, Jay
Slating a product for release often involves pitching journalists to run stories on the press release. Good media coverage ensures greater product reach and drives audience engagement. It is therefore crucial to pitch releases to journalists with relevant interests, since journalists receive many pitches daily. Keeping up with journalist beats and curating a media contact list is a large, time-consuming task. This study proposes a model that automates and expedites the process by recommending suitable journalists to run media coverage on the press releases provided by the user.
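A minimal sketch of a nearest-neighbour journalist recommender in the spirit of the abstract, assuming scikit-learn; the TF-IDF representation, cosine metric, and toy corpus are illustrative assumptions rather than the Pressmatch implementation.

```python
# Sketch: embed journalists' past articles and a new press release with TF-IDF,
# then recommend the journalists whose work is nearest in cosine distance.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

journalists = ["Alice Rivera", "Bo Chen", "Carmen Diaz"]          # toy data
past_articles = [
    "electric vehicle battery startup funding",
    "cloud security breach enterprise software",
    "wearable health sensors clinical trials",
]

vectorizer = TfidfVectorizer(stop_words="english")
article_vectors = vectorizer.fit_transform(past_articles)

index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(article_vectors)

press_release = "Startup launches fast-charging battery for electric cars"
query = vectorizer.transform([press_release])
distances, neighbor_ids = index.kneighbors(query)

for dist, idx in zip(distances[0], neighbor_ids[0]):
    print(f"{journalists[idx]} (distance {dist:.2f})")
```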
QU-BraTS: MICCAI BraTS 2020 Challenge on Quantifying Uncertainty in Brain Tumor Segmentation - Analysis of Ranking Scores and Benchmarking Results
Mehta, Raghav, Filos, Angelos, Baid, Ujjwal, Sako, Chiharu, McKinley, Richard, Rebsamen, Michael, Datwyler, Katrin, Meier, Raphael, Radojewski, Piotr, Murugesan, Gowtham Krishnan, Nalawade, Sahil, Ganesh, Chandan, Wagner, Ben, Yu, Fang F., Fei, Baowei, Madhuranthakam, Ananth J., Maldjian, Joseph A., Daza, Laura, Gomez, Catalina, Arbelaez, Pablo, Dai, Chengliang, Wang, Shuo, Reynaud, Hadrien, Mo, Yuan-han, Angelini, Elsa, Guo, Yike, Bai, Wenjia, Banerjee, Subhashis, Pei, Lin-min, AK, Murat, Rosas-Gonzalez, Sarahi, Zemmoura, Ilyess, Tauber, Clovis, Vu, Minh H., Nyholm, Tufve, Lofstedt, Tommy, Ballestar, Laura Mora, Vilaplana, Veronica, McHugh, Hugh, Talou, Gonzalo Maso, Wang, Alan, Patel, Jay, Chang, Ken, Hoebel, Katharina, Gidwani, Mishka, Arun, Nishanth, Gupta, Sharut, Aggarwal, Mehak, Singh, Praveer, Gerstner, Elizabeth R., Kalpathy-Cramer, Jayashree, Boutry, Nicolas, Huard, Alexis, Vidyaratne, Lasitha, Rahman, Md Monibor, Iftekharuddin, Khan M., Chazalon, Joseph, Puybareau, Elodie, Tochon, Guillaume, Ma, Jun, Cabezas, Mariano, Llado, Xavier, Oliver, Arnau, Valencia, Liliana, Valverde, Sergi, Amian, Mehdi, Soltaninejad, Mohammadreza, Myronenko, Andriy, Hatamizadeh, Ali, Feng, Xue, Dou, Quan, Tustison, Nicholas, Meyer, Craig, Shah, Nisarg A., Talbar, Sanjay, Weber, Marc-Andre, Mahajan, Abhishek, Jakab, Andras, Wiest, Roland, Fathallah-Shaykh, Hassan M., Nazeri, Arash, Milchenko, Mikhail, Marcus, Daniel, Kotrotsou, Aikaterini, Colen, Rivka, Freymann, John, Kirby, Justin, Davatzikos, Christos, Menze, Bjoern, Bakas, Spyridon, Gal, Yarin, Arbel, Tal
Deep learning (DL) models have provided state-of-the-art performance in various medical imaging benchmarking challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g., tumor and lesion sub-regions) is particularly challenging, and potential errors hinder translating DL models into clinical workflows. Quantifying the reliability of DL model predictions in the form of uncertainties could enable clinical review of the most uncertain regions, thereby building trust and paving the way toward clinical translation. Several uncertainty estimation methods have recently been introduced for DL medical image segmentation tasks. Developing scores to evaluate and compare the performance of uncertainty measures will assist the end-user in making more informed decisions. In this study, we explore and evaluate a score developed during the BraTS 2019 and BraTS 2020 task on uncertainty quantification (QU-BraTS) and designed to assess and rank uncertainty estimates for brain tumor multi-compartment segmentation. This score (1) rewards uncertainty estimates that produce high confidence in correct assertions and those that assign low confidence levels at incorrect assertions, and (2) penalizes uncertainty measures that lead to a higher percentage of under-confident correct assertions. We further benchmark the segmentation uncertainties generated by 14 independent participating teams of QU-BraTS 2020, all of which also participated in the main BraTS segmentation task. Overall, our findings confirm the importance and complementary value that uncertainty estimates provide to segmentation algorithms, highlighting the need for uncertainty quantification in medical image analyses.
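A simplified sketch of the kind of uncertainty-filtered evaluation the QU-BraTS score is built on, in line with the reward/penalty behaviour the abstract describes; the thresholds, metric definitions, and aggregation below are illustrative assumptions, not the official challenge metric code.

```python
# Simplified sketch: voxels whose uncertainty exceeds a threshold are removed
# before computing Dice, and the fraction of filtered-out true positives /
# true negatives is tracked so that indiscriminate filtering is penalised.
import numpy as np

def filtered_metrics(pred, target, uncertainty, tau):
    keep = uncertainty <= tau                       # evaluate confident voxels only
    tp_all = np.sum((pred == 1) & (target == 1))
    tn_all = np.sum((pred == 0) & (target == 0))
    p, t = pred[keep], target[keep]
    tp = np.sum((p == 1) & (t == 1))
    fp = np.sum((p == 1) & (t == 0))
    fn = np.sum((p == 0) & (t == 1))
    tn = np.sum((p == 0) & (t == 0))
    dice = 2 * tp / max(2 * tp + fp + fn, 1)
    ftp = (tp_all - tp) / max(tp_all, 1)            # filtered true positives
    ftn = (tn_all - tn) / max(tn_all, 1)            # filtered true negatives
    return dice, ftp, ftn

# Toy volumes; in practice these come from a segmentation model and its uncertainty map.
pred = np.random.randint(0, 2, size=(64, 64, 64))
target = np.random.randint(0, 2, size=(64, 64, 64))
uncertainty = np.random.rand(64, 64, 64)

taus = np.linspace(0.0, 1.0, 11)
curves = np.array([filtered_metrics(pred, target, uncertainty, t) for t in taus])
# Reward confident-and-correct (high Dice curve), penalise filtering out correct voxels.
score = (curves[:, 0].mean() + (1 - curves[:, 1]).mean() + (1 - curves[:, 2]).mean()) / 3
```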
Addressing catastrophic forgetting for medical domain expansion
Gupta, Sharut, Singh, Praveer, Chang, Ken, Qu, Liangqiong, Aggarwal, Mehak, Arun, Nishanth, Vaswani, Ashwin, Raghavan, Shruti, Agarwal, Vibha, Gidwani, Mishka, Hoebel, Katharina, Patel, Jay, Lu, Charles, Bridge, Christopher P., Rubin, Daniel L., Kalpathy-Cramer, Jayashree
Model brittleness is a key concern when deploying deep learning models in real-world medical settings. A model that has high performance at one institution may suffer a significant decline in performance when tested at other institutions. While pooling datasets from multiple institutions and re-training may provide a straightforward solution, it is often infeasible and may compromise patient privacy. An alternative approach is to fine-tune the model on subsequent institutions after training on the original institution. Notably, this approach degrades model performance at the original institution, a phenomenon known as catastrophic forgetting. In this paper, we develop an approach to address catastrophic forgetting based on elastic weight consolidation combined with modulation of batch normalization statistics under two scenarios: first, expanding the domain from one imaging system's data to another imaging system's data; and second, expanding the domain from a large multi-institutional dataset to a single new institution's dataset. We show that our approach outperforms several other state-of-the-art approaches and provide theoretical justification for the efficacy of batch normalization modulation. The results of this study are generally applicable to the deployment of any clinical deep learning model that requires domain expansion.
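A minimal sketch of the elastic weight consolidation (EWC) penalty the abstract builds on, assuming PyTorch; the precomputed Fisher estimate, the penalty weight, and the step of refreshing batch-normalization statistics on the new institution's data illustrate the general technique rather than the authors' exact recipe.

```python
# Sketch: fine-tune on a new institution while (1) adding an EWC penalty that
# anchors weights important to the original institution and (2) re-estimating
# batch-normalization statistics on the new data. Hyperparameters are illustrative.
import torch

def ewc_penalty(model, old_params, fisher):
    """Quadratic penalty sum_i F_i * (theta_i - theta_i*)^2 over parameters."""
    loss = 0.0
    for name, param in model.named_parameters():
        loss = loss + (fisher[name] * (param - old_params[name]) ** 2).sum()
    return loss

def finetune_step(model, batch, labels, criterion, optimizer, old_params, fisher, lam=100.0):
    optimizer.zero_grad()
    task_loss = criterion(model(batch), labels)
    loss = task_loss + (lam / 2) * ewc_penalty(model, old_params, fisher)
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def refresh_bn_statistics(model, new_institution_loader):
    """Forward passes in train mode update the running BatchNorm mean/variance."""
    model.train()
    for batch, _ in new_institution_loader:
        model(batch)
```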