 therapeutic


Diffusion-Driven Generation of Minimally Preprocessed Brain MRI

Remedios, Samuel W., Carass, Aaron, Prince, Jerry L., Dewey, Blake E.

arXiv.org Artificial Intelligence

The purpose of this study is to present and compare three denoising diffusion probabilistic models (DDPMs) that generate 3D $T_1$-weighted MRI human brain images. Three DDPMs were trained using 80,675 image volumes from 42,406 subjects spanning 38 publicly available brain MRI datasets. These images had approximately 1 mm isotropic resolution and were manually inspected by three human experts to exclude those with poor quality, field-of-view issues, and excessive pathology. The images were minimally preprocessed to preserve the visual variability of the data. Furthermore, to enable the DDPMs to produce images with natural orientation variations and inhomogeneity, the images were neither registered to a common coordinate system nor bias field corrected. Evaluations included segmentation, Fréchet Inception Distance (FID), and qualitative inspection. Regarding results, all three DDPMs generated coherent MR brain volumes. The velocity and flow prediction models achieved lower FIDs than the sample prediction model. However, all three models had higher FIDs compared to real images across multiple cohorts. In a permutation experiment, the generated brain regional volume distributions differed statistically from real data. However, the velocity and flow prediction models had fewer statistically different volume distributions in the thalamus and putamen. In conclusion, this work presents and releases the first 3D non-latent diffusion model for brain data without skull stripping or registration. Despite the negative results in statistical testing, the presented DDPMs are capable of generating high-resolution 3D $T_1$-weighted brain images. All model weights and corresponding inference code are publicly available at https://github.com/piksl-research/medforj .
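
The abstract does not describe how the permutation experiment was set up, so the following is only a minimal sketch of a generic two-sample permutation test. It assumes the test statistic is the absolute difference in mean regional volume between real and generated images; the function name `permutation_test` and its parameters are hypothetical and are not taken from the paper or its code release.

```python
import numpy as np


def permutation_test(real, generated, n_permutations=10_000, seed=0):
    """Two-sample permutation test on the absolute difference in means.

    Assumption: `real` and `generated` are 1D collections of volumes for
    one brain region. The statistic used in the paper may differ.
    """
    rng = np.random.default_rng(seed)
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)

    observed = abs(real.mean() - generated.mean())
    pooled = np.concatenate([real, generated])
    n_real = real.size

    # Count how often a random relabeling is at least as extreme as observed.
    count = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)
        stat = abs(shuffled[:n_real].mean() - shuffled[n_real:].mean())
        if stat >= observed:
            count += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return (count + 1) / (n_permutations + 1)
```

A small p-value for a region would indicate that its generated volume distribution differs from the real one under this statistic, which is the kind of outcome the abstract reports.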


A text-to-tabular approach to generate synthetic patient data using LLMs

Tornqvist, Margaux, Zucker, Jean-Daniel, Fauvel, Tristan, Lambert, Nicolas, Berthelot, Mathilde, Movschin, Antoine

arXiv.org Artificial Intelligence

Access to large-scale high-quality healthcare databases is key to accelerating medical research and making insightful discoveries about diseases. However, access to such data is often limited by patient privacy concerns, data sharing restrictions and high costs. To overcome these limitations, synthetic patient data has emerged as an alternative. However, synthetic data generation (SDG) methods typically rely on machine learning (ML) models trained on original data, leading back to the data scarcity problem. We propose an approach to generate synthetic tabular patient data that does not require access to the original data, but only a description of the desired database. We leverage prior medical knowledge and in-context learning capabilities of large language models (LLMs) to generate realistic patient data, even in a low-resource setting. We quantitatively evaluate our approach against state-of-the-art SDG models, using fidelity, privacy, and utility metrics. Our results show that while LLMs may not match the performance of state-of-the-art models trained on the original data, they effectively generate realistic patient data with well-preserved clinical correlations. An ablation study highlights key elements of our prompt contributing to high-quality synthetic patient data generation. This approach, which is easy to use and does not require original data or advanced ML skills, is particularly valuable for quickly generating custom-designed patient data, supporting project implementation and providing educational resources.


A Comprehensive Framework for Automated Segmentation of Perivascular Spaces in Brain MRI with the nnU-Net

Pham, William, Jarema, Alexander, Rim, Donggyu, Chen, Zhibin, Khlif, Mohamed S. H., Macefield, Vaughan G., Henderson, Luke A., Brodtmann, Amy

arXiv.org Artificial Intelligence

Background: Enlargement of perivascular spaces (PVS) is common in neurodegenerative disorders including cerebral small vessel disease, Alzheimer's disease, and Parkinson's disease. PVS enlargement may indicate impaired clearance pathways and there is a need for reliable PVS detection methods which are currently lacking. Aim: To optimise a widely used deep learning model, the no-new-UNet (nnU-Net), for PVS segmentation. Methods: In 30 healthy participants (mean$\pm$SD age: 50$\pm$18.9 years; 13 females), T1-weighted MRI images were acquired using three different protocols on three MRI scanners (3T Siemens Tim Trio, 3T Philips Achieva, and 7T Siemens Magnetom). PVS were manually segmented across ten axial slices in each participant. Segmentations were completed using a sparse annotation strategy. In total, 11 models were compared using various strategies for image handling, preprocessing and semi-supervised learning with pseudo-labels. Model performance was evaluated using 5-fold cross validation (5FCV). The main performance metric was the Dice Similarity Coefficient (DSC). Results: The voxel-spacing agnostic model (mean$\pm$SD DSC=64.3$\pm$3.3%) outperformed models which resampled images to a common resolution (DSC=40.5-55%). Model performance improved substantially following iterative label cleaning (DSC=85.7$\pm$1.2%). Semi-supervised learning with pseudo-labels (n=12,740) from 18 additional datasets improved the agreement between raw and predicted PVS cluster counts (Lin's concordance correlation coefficient=0.89, 95%CI=0.82-0.94). We extended the model to enable PVS segmentation in the midbrain (DSC=64.3$\pm$6.5%) and hippocampus (DSC=67.8$\pm$5%). Conclusions: Our deep learning models provide a robust and holistic framework for the automated quantification of PVS in brain MRI.
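
For readers unfamiliar with the main metric, the Dice Similarity Coefficient between a predicted mask $P$ and a reference mask $T$ is $2|P \cap T| / (|P| + |T|)$. The sketch below is a minimal NumPy version for binary masks. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details from the paper.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0

    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / denom
```

The function returns a value in [0, 1]; the abstract reports the same quantity as a percentage, so a returned 0.643 corresponds to DSC=64.3%.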


AskBeacon -- Performing genomic data exchange and analytics with natural language

Wickramarachchi, Anuradha, Tonni, Shakila, Majumdar, Sonali, Karimi, Sarvnaz, Kõks, Sulev, Hosking, Brendan, Rambla, Jordi, Twine, Natalie A., Jain, Yatish, Bauer, Denis C.

arXiv.org Artificial Intelligence

For the two investigated workflows, there are significant differences in the prediction of variant terms and additional phenotypic filtering terms. An intuitive comparison between the parallel and multistep extraction workflows is that, in the parallel workflow, the models' instructions are rather simple: the model is asked to predict only variant-specific fields (variants extractor template) and other fields (filter extractor template), regardless of whether those fields are present in the Beacon schema. Not all terms extracted in this extractor chain are valid for Beacon, so an additional validator template is required to filter out terms that are not related to Beacon. In contrast, in the multistep workflow, both the variant and phenotypic terms are extracted only when they match the Beacon schema, without the need for a validation prompt. Thus, although these models predict fewer terms, the extracted terms are aligned with the schema and show less hallucination than those from the parallel workflow, as seen in the previous section.
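
The validation step of the parallel workflow can be pictured as a filter over extracted terms. The sketch below is purely illustrative: in the paper the validator is a prompt template applied by a language model, whereas here it is replaced by a plain set-membership check. The field names in `ALLOWED_FIELDS` and the function `validate_terms` are hypothetical and are not taken from the Beacon specification or the AskBeacon code.

```python
# Hypothetical set of schema fields; NOT the real Beacon schema.
ALLOWED_FIELDS = {"assembly_id", "reference_name", "start", "end"}


def validate_terms(extracted, allowed_fields=ALLOWED_FIELDS):
    """Split extracted terms into schema-valid terms and rejected keys.

    `extracted` maps field names proposed by an extractor to their values.
    """
    valid = {k: v for k, v in extracted.items() if k in allowed_fields}
    rejected = sorted(k for k in extracted if k not in allowed_fields)
    return valid, rejected


# Example: one proposed field is outside the (hypothetical) schema.
valid, rejected = validate_terms(
    {"reference_name": "1", "start": 100, "favourite_colour": "blue"}
)
```

In the multistep workflow described above, this separate filtering pass is unnecessary because terms are only extracted when they already match the schema.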


IGUANe: a 3D generalizable CycleGAN for multicenter harmonization of brain MR images

Roca, Vincent, Kuchcinski, Grégory, Pruvo, Jean-Pierre, Manouvriez, Dorian, Lopes, Renaud

arXiv.org Artificial Intelligence

In MRI studies, the aggregation of imaging data from multiple acquisition sites enhances sample size but may introduce site-related variabilities that hinder consistency in subsequent analyses. Deep learning methods for image translation have emerged as a solution for harmonizing MR images across sites. In this study, we introduce IGUANe (Image Generation with Unified Adversarial Networks), an original 3D model that leverages the strengths of domain translation and straightforward application of style transfer methods for multicenter brain MR image harmonization. IGUANe extends the CycleGAN architecture by integrating an arbitrary number of domains for training through a many-to-one strategy. During inference, the model can be applied to any image, even from an unknown acquisition site, making it a universal generator for harmonization. Trained on a dataset comprising T1-weighted images from 11 different scanners, IGUANe was evaluated on data from unseen sites. The assessments included the transformation of MR images with traveling subjects, the preservation of pairwise distances between MR images within domains, the evolution of volumetric patterns related to age and Alzheimer's disease (AD), and the performance in age regression and patient classification tasks. Comparisons with other harmonization and normalization methods suggest that IGUANe better preserves individual information in MR images and is more suitable for maintaining and reinforcing variabilities related to age and AD. Future studies may further assess IGUANe in other multicenter contexts, either using the same model or retraining it for applications to different image modalities.


OpenMM 8: Molecular Dynamics Simulation with Machine Learning Potentials

Eastman, Peter, Galvelis, Raimondas, Peláez, Raúl P., Abreu, Charlles R. A., Farr, Stephen E., Gallicchio, Emilio, Gorenko, Anton, Henry, Michael M., Hu, Frank, Huang, Jing, Krämer, Andreas, Michel, Julien, Mitchell, Joshua A., Pande, Vijay S., Rodrigues, João PGLM, Rodriguez-Guerra, Jaime, Simmonett, Andrew C., Singh, Sukrit, Swails, Jason, Turner, Philip, Wang, Yuanqing, Zhang, Ivy, Chodera, John D., De Fabritiis, Gianni, Markland, Thomas E.

arXiv.org Artificial Intelligence

Machine learning plays an important and growing role in molecular simulation. The newest version of the OpenMM molecular dynamics toolkit introduces new features to support the use of machine learning potentials. Arbitrary PyTorch models can be added to a simulation and used to compute forces and energy. A higher-level interface allows users to easily model their molecules of interest with general purpose, pretrained potential functions. A collection of optimized CUDA kernels and custom PyTorch operations greatly improves the speed of simulations. We demonstrate these features on simulations of cyclin-dependent kinase 8 (CDK8) and the green fluorescent protein (GFP) chromophore in water. Taken together, these features make it practical to use machine learning to improve the accuracy of simulations at only a modest increase in cost.


Scientist, Machine Learning at Flagship Pioneering, Inc. - Cambridge, MA USA

#artificialintelligence

Flagship Labs 97, Inc. (FL97) is a privately held, early-stage biotechnology company pioneering the application of Autonomous Science to biology. At FL97 we recognize the potential for artificial intelligence to transform all aspects of the scientific method, from hypothesis generation to experimental execution. Our platform provides intelligent agents the autonomy to execute programmable experiments in closed-loop toward valuable biological and therapeutic products. FL97 is backed by Flagship Pioneering, which brings the courage, long-term vision, and resources needed to realize unreasonable results. FL97 is seeking an experienced, creative, and talented Machine Learning Scientist to join our team.


46 Groundbreaking AI-Enabled Biotech Companies of 2023

#artificialintelligence

Integration of AI in the biotech industry has brought about a revolution in the healthcare sector. The use of AI in biotech has opened up new avenues for personalized medicine, predictive diagnostics, and cutting-edge research. In this article, we take a look at the top AI-enabled biotech companies leading the charge in the integration of AI and biotechnology. Biotechnology is a branch of biology that uses living organisms, cells, and biological systems to develop new technologies and products. It encompasses a wide range of scientific disciplines, including genetics, molecular biology, microbiology, biochemistry, and chemical engineering, and it has applications in many different fields, such as agriculture, food science, medicine, and environmental protection. In biotechnology, living organisms and biological systems are used to produce and improve products, processes, and technologies.


Senior Scientist, Computational Biology at Flagship Pioneering, Inc. - Cambridge

#artificialintelligence

Flagship Pioneering has conceived of and created companies such as Moderna Therapeutics (NASDAQ: MRNA), Editas Medicine (NASDAQ: EDIT), Omega Therapeutics (NASDAQ: OMGA), Seres Therapeutics (NASDAQ: MCRB), and Indigo Agriculture. Since its launch in 2000, Flagship has applied its unique hypothesis-driven innovation process to originate and foster more than 100 scientific ventures. In 2021, Flagship Pioneering was ranked 12th globally on Fortune's "Change the World" list, an annual ranking of companies that have made a positive social and environmental impact through activities that are part of their core business strategies. Alltrna is the world's first tRNA platform company to decipher tRNA biology and pioneer tRNA therapeutics to treat thousands of diseases. Alltrna unlocks tRNA biology to correct disease.


Senior Scientist, Machine Learning at Flagship Pioneering, Inc. - Cambridge, MA

#artificialintelligence

What if… you could join an organization that creates, resources, and builds life sciences companies that invent breakthrough technologies in order to transform health care and sustainability? FL94, Inc. is a privately held, early-stage biotechnology company pioneering Protein Editing. At FL94 we create small molecules that edit protein structure and function to unlock presently undruggable targets and a broad array of novel chemistry modalities. Our platform integrates novel small molecule chemistry and chemoproteomic discovery technologies with machine learning to enable generative design of protein editing chemistries. FL94 is backed by Flagship Pioneering, bringing the courage, vision, and resources to guide FL94 from platform validation to patient impact.