FDA
Drug Repurposing Using Deep Embedded Clustering and Graph Neural Networks
Delzer, Luke, Kroleski, Robert, AlShami, Ali K., Kalita, Jugal
Drug repurposing has historically been an economically infeasible process for identifying novel uses for abandoned drugs. Modern machine learning has enabled the identification of complex biochemical intricacies in candidate drugs; however, many studies rely on simplified datasets with known drug-disease similarities. We propose a machine learning pipeline that combines unsupervised deep embedded clustering with supervised graph neural network link prediction to identify new drug-disease links from multi-omic data. Unsupervised autoencoder and cluster training reduced the dimensionality of omic data into a compressed latent embedding. A total of 9,022 unique drugs were partitioned into 35 clusters with a mean silhouette score of 0.8550. Graph neural networks achieved strong statistical performance, with a prediction accuracy of 0.901, an area under the receiver operating characteristic curve of 0.960, and an F1 score of 0.901. A ranked list comprising 477 per-cluster link probabilities exceeding 99 percent was generated. This study could provide new drug-disease link prospects across unrelated disease domains, while advancing the understanding of machine learning in drug repurposing studies.
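The mean silhouette score quoted above measures how well separated the clusters are in the latent embedding. The following is a minimal sketch of how such a score can be computed, not the authors' code: it assumes Euclidean distance, at least two clusters, and uses made-up 2D points in place of the real latent vectors. The function name `mean_silhouette` is hypothetical.

```python
import math

def mean_silhouette(points, labels):
    """Mean silhouette coefficient, assuming Euclidean distance and >= 2 clusters."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data (illustrative only): two tight, well-separated clusters.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
print(round(mean_silhouette(toy_points, toy_labels), 3))
```

Values close to 1 indicate compact, well-separated clusters; values near 0 indicate overlapping clusters.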
Human-AI Collaboration Increases Efficiency in Regulatory Writing
Eser, Umut, Gozin, Yael, Stallons, L. Jay, Caroline, Ari, Preusse, Martin, Rice, Brandon, Wright, Scott, Robertson, Andrew
Background: Investigational New Drug (IND) application preparation is time-intensive and expertise-dependent, slowing early clinical development. Objective: To evaluate whether a large language model (LLM) platform (AutoIND) can reduce first-draft composition time while maintaining document quality in regulatory submissions. Methods: Drafting times for IND nonclinical written summaries (eCTD modules 2.6.2, 2.6.4, 2.6.6) generated by AutoIND were directly recorded. For comparison, manual drafting times for IND summaries previously cleared by the U.S. FDA were estimated from the experience of regulatory writers (≥6 years) and used as industry-standard benchmarks. Quality was assessed by a blinded regulatory writing assessor using seven pre-specified categories: correctness, completeness, conciseness, consistency, clarity, redundancy, and emphasis. Each sub-criterion was scored 0-3 and normalized to a percentage. A critical regulatory error was defined as any misrepresentation or omission likely to alter regulatory interpretation (e.g., incorrect NOAEL, omission of mandatory GLP dose-formulation analysis). Results: AutoIND reduced initial drafting time by ~97% (from ~100 h to 3.7 h for 18,870 pages/61 reports in IND-1; and to 2.6 h for 11,425 pages/58 reports in IND-2). Quality scores were 69.6% and 77.9% for IND-1 and IND-2. No critical regulatory errors were detected, but deficiencies in emphasis, conciseness, and clarity were noted. Conclusions: AutoIND can dramatically accelerate IND drafting, but expert regulatory writers remain essential to mature outputs to submission-ready quality. Systematic deficiencies identified provide a roadmap for targeted model improvements.
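The arithmetic behind the reported figures can be checked with a short sketch. The time reduction uses the hours stated in the abstract. The quality normalization is an assumption: the abstract says only that each sub-criterion was scored 0-3 and normalized to a percentage, so the unweighted aggregation and the example sub-scores below are hypothetical.

```python
def percent_reduction(baseline_hours, new_hours):
    """Relative time saved, as a percentage of the baseline."""
    return 100.0 * (baseline_hours - new_hours) / baseline_hours

def normalized_quality(sub_scores, max_per_item=3):
    """Unweighted normalization of 0-3 sub-criterion scores to a percentage.

    Assumption: the study may have weighted categories differently.
    """
    return 100.0 * sum(sub_scores) / (max_per_item * len(sub_scores))

# Hours taken from the abstract: ~100 h manual vs. 3.7 h (IND-1) and 2.6 h (IND-2).
print(round(percent_reduction(100, 3.7), 1))  # 96.3
print(round(percent_reduction(100, 2.6), 1))  # 97.4

# Hypothetical sub-scores, for illustration only.
print(round(normalized_quality([3, 2, 2, 1]), 1))  # 66.7
```

Both reductions round to the roughly 97% stated in the abstract.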
A Survey of Graph Neural Networks for Drug Discovery: Recent Developments and Challenges
Berry, Katherine, Cheng, Liang
Graph Neural Networks (GNNs) have gained traction in the complex domain of drug discovery because of their ability to process graph-structured data such as drug molecule models. This approach has resulted in a myriad of methods and models in published literature across several categories of drug discovery research. This paper covers the research categories comprehensively with recent papers, namely molecular property prediction, including drug-target binding affinity prediction, drug-drug interaction study, microbiome interaction prediction, drug repositioning, retrosynthesis, and new drug design, and provides guidance for future work on GNNs for drug discovery.
Evaluation of Machine Learning Reconstruction Techniques for Accelerated Brain MRI Scans
Mandel, Jonathan I., Hiremath, Shivaprakash, Keshtgar, Hedyeh, Scholl, Timothy, Raeisi, Sadegh
Figure 3: Distribution of Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), and Haar wavelet-based Perceptual Similarity Index (HaarPSI) scores for DeepFoqus-Accelerate reconstructions: (a-c) show results across 408 samples at 2x, 3x, and 4x acceleration, and (d-f) present distributions for the 36 image sets evaluated by reviewers.
Figure 4: (A-B) Representative standard-of-care (SOC) images (first row) and DeepFoqus-Accelerate reconstructions from accelerated scans (second row), with corresponding quantitative and qualitative scores presented in the third row. Panel (B) shows two slices of the worst-case scenario in the qualitative dataset, characterized by wrap-around and motion artifacts.
Discussion
This evaluation of DeepFoqus-Accelerate demonstrates that this FDA-cleared k-space-based DL reconstruction software can reliably enable up to fourfold accelerated brain MRI acquisition without compromising diagnostic image quality. Both expert review and quantitative image similarity metrics confirm that AI-reconstructed images are clinically equivalent to fully sampled standards.
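Of the similarity metrics named above, PSNR is the simplest to state: it is 10 times the base-10 logarithm of the squared maximum intensity divided by the mean squared error between reference and reconstruction. The sketch below uses the standard definition on tiny made-up images; the function name, the nested-list image format, and the assumption that intensities are scaled to [0, 1] are illustrative choices, not details from the study.

```python
import math

def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    flat_ref = [p for row in reference for p in row]
    flat_rec = [p for row in reconstruction for p in row]
    if len(flat_ref) != len(flat_rec):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(flat_ref, flat_rec)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Made-up 2x2 "images" with intensities in [0, 1].
soc_image = [[0.0, 1.0], [0.5, 0.5]]
accelerated = [[0.1, 0.9], [0.5, 0.5]]
print(round(psnr(soc_image, accelerated), 2))
```

Higher values mean the reconstruction is closer to the reference. SSIM and HaarPSI capture structural and perceptual similarity and are not reproduced here.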
Moderna CEO Responds to RFK Jr.'s Crusade Against the Covid-19 Vaccine
Speaking at a WIRED event Tuesday, Moderna CEO Stéphane Bancel said he was "encouraged" by the company's dialogue with the FDA, but acknowledged recent setbacks. At the WIRED Health summit on Tuesday, Moderna CEO Stéphane Bancel said the recent changes to Covid-19 vaccine policy made by Health and Human Services secretary Robert F. Kennedy, Jr. are a "step backward." Moderna is one of the manufacturers of mRNA-based Covid-19 vaccines, and last month the company received approval from the Food and Drug Administration for an updated version of the shot. But as part of that approval, the FDA imposed new restrictions on who can receive the vaccine.
Dangerous heart conditions detected in seconds with AI stethoscope
The first artificial intelligence (AI) stethoscope has gone beyond listening to a heartbeat. Researchers at Imperial College London and Imperial College Healthcare NHS Trust discovered that an AI stethoscope can detect heart failure at an early stage. The TRICORDER study results, published in BMJ Journals, found that the AI-enabled stethoscope can help doctors identify three heart conditions in just 15 seconds. According to the British Heart Foundation (BHF), which partially funded the study, the researchers analyzed data from more than 1.5 million patients, focusing on people with heart failure symptoms like breathlessness, swelling and fatigue.
Resilient Biosecurity in the Era of AI-Enabled Bioweapons
Feldman, Jonathan, Feldman, Tal
Recent advances in generative biology have enabled the design of novel proteins, creating significant opportunities for drug discovery while also introducing new risks, including the potential development of synthetic bioweapons. Existing biosafety measures primarily rely on inference-time filters such as sequence alignment and protein-protein interaction (PPI) prediction to detect dangerous outputs. In this study, we evaluate the performance of three leading PPI prediction tools: AlphaFold 3, AF3Complex, and SpatialPPIv2. These models were tested on well-characterized viral-host interactions, such as those involving Hepatitis B and SARS-CoV-2. Despite being trained on many of the same viruses, the models fail to detect a substantial number of known interactions. Strikingly, none of the tools successfully identifies any of the four experimentally validated SARS-CoV-2 mutants with confirmed binding. These findings suggest that current predictive filters are inadequate for reliably flagging even known biological threats and are even less likely to detect novel ones. We argue for a shift toward response-oriented infrastructure, including rapid experimental validation, adaptable biomanufacturing, and regulatory frameworks capable of operating at the speed of AI-driven developments.
Towards Early Detection: AI-Based Five-Year Forecasting of Breast Cancer Risk Using Digital Breast Tomosynthesis Imaging
Dorster, Manon A., Dorfner, Felix J., Cleveland, Mason C., Guelen, Melisa S., Patel, Jay, Daye, Dania, Thiran, Jean-Philippe, Kim, Albert E., Bridge, Christopher P.
As early detection of breast cancer strongly favors successful therapeutic outcomes, there is major commercial interest in optimizing breast cancer screening. However, current risk prediction models achieve modest performance and do not incorporate digital breast tomosynthesis (DBT) imaging, which was FDA-approved for breast cancer screening in 2011. To address this unmet need, we present a deep learning (DL)-based framework capable of forecasting an individual patient's 5-year breast cancer risk directly from screening DBT. Using an unparalleled dataset of 161,753 DBT examinations from 50,590 patients, we trained a risk predictor based on features extracted using the Meta AI DINOv2 image encoder, combined with a cumulative hazard layer, to assess a patient's likelihood of developing breast cancer over five years. On a held-out test set, our best-performing model achieved an AUROC of 0.80 on predictions within 5 years. These findings reveal the high potential of DBT-based DL approaches to complement traditional risk assessment tools, and serve as a promising basis for additional investigation to validate and enhance our work.
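The abstract does not spell out the cumulative hazard layer, so the following is only a sketch of one common additive-hazard formulation, offered as an assumption: a base logit plus a running sum of non-negative per-year increments, passed through a sigmoid, which guarantees that predicted risk never decreases as the horizon grows. The function names, the softplus choice for non-negativity, and the input values are all hypothetical. In a real model the inputs would be produced from the image features.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    # Smooth, strictly positive transform used here to keep increments >= 0.
    return math.log1p(math.exp(x))

def cumulative_risk(base_logit, hazard_logits):
    """Risk of an event within year 1..K, non-decreasing in the horizon.

    Assumed form: risk_k = sigmoid(base + sum_{i<=k} softplus(h_i)).
    """
    risks = []
    running = base_logit
    for h in hazard_logits:
        running += softplus(h)  # add a non-negative yearly hazard increment
        risks.append(sigmoid(running))
    return risks

# Hypothetical outputs of a prediction head for one examination.
five_year = cumulative_risk(-3.0, [0.1, -0.5, 0.2, 0.0, -1.0])
print([round(r, 3) for r in five_year])
```

The monotonicity constraint is the point of the design: a patient's predicted 5-year risk can never be lower than the predicted 2-year risk.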
Active Query Selection for Crowd-Based Reinforcement Learning
Erskine, Jonathan, Yamagata, Taku, Santos-Rodríguez, Raúl
Preference-based reinforcement learning has gained prominence as a strategy for training agents in environments where the reward signal is difficult to specify or misaligned with human intent. However, its effectiveness is often limited by the high cost and low availability of reliable human input, especially in domains where expert feedback is scarce or errors are costly. To address this, we propose a novel framework that combines two complementary strategies: probabilistic crowd modelling to handle noisy, multi-annotator feedback, and active learning to prioritize feedback on the most informative agent actions. We extend the Advise algorithm to support multiple trainers, estimate their reliability online, and incorporate entropy-based query selection to guide feedback requests. We evaluate our approach in a set of environments that span both synthetic and real-world-inspired settings, including 2D games (Taxi, Pacman, Frozen Lake) and a blood glucose control task for Type 1 Diabetes using the clinically approved UVA/Padova simulator. Our preliminary results demonstrate that agents trained with feedback on uncertain trajectories exhibit faster learning in most tasks, and we outperform the baselines for the blood glucose control task.
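Entropy-based query selection can be illustrated with a small sketch: the agent asks for feedback where its action distribution is most uncertain. This is not the paper's implementation. The policy table, state names, budget parameter, and function names are hypothetical, and the exact criterion the authors use may differ.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_queries(policy, budget):
    """Return the `budget` states whose action distributions are most uncertain."""
    ranked = sorted(policy, key=lambda state: entropy(policy[state]), reverse=True)
    return ranked[:budget]

# Hypothetical policy: state -> probabilities over four actions.
toy_policy = {
    "s0": [0.97, 0.01, 0.01, 0.01],  # confident, low entropy
    "s1": [0.25, 0.25, 0.25, 0.25],  # maximally uncertain
    "s2": [0.60, 0.20, 0.10, 0.10],  # moderately uncertain
}
print(select_queries(toy_policy, budget=2))  # ['s1', 's2']
```

Spending a limited feedback budget on high-entropy states avoids asking annotators about decisions the agent is already confident in.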
Multi-domain Distribution Learning for De Novo Drug Design
Schneuing, Arne, Igashov, Ilia, Dobbelstein, Adrian W., Castiglione, Thomas, Bronstein, Michael, Correia, Bruno
To further enhance the sampling process towards distribution regions with desirable metric values, we propose a joint preference alignment scheme applicable to both flow matching and Markov bridge frameworks. Furthermore, we extend our model to also explore the conformational landscape of the protein by jointly sampling side chain angles and molecules. Small molecules are the predominant class of FDA-approved drugs with a share of 85%, and more than 95% of known drugs target human or pathogen proteins (Santos et al., 2017). At the same time, the cost and duration of the development of new drugs are skyrocketing (Simoens & Huys, 2021). This sparks increasing interest in the computational design of small molecular compounds that bind specifically to disease-associated proteins and thus reduce the amount of costly experimental testing. In recent years, the machine learning community has contributed a plethora of generative tools addressing drug design from various angles (Du et al., 2024). However, these methods typically require careful tuning of the objective function to avoid exploiting imperfect computational oracles and overly maximizing one desired property. Additionally, one often aims to design a suitable 3D binding pose along with the chemical structure of the molecule, which substantially increases the degrees of freedom. Many optimization algorithms struggle to efficiently navigate such vast design spaces. Following a different approach, probabilistic generative models learn to generate drug-like molecules directly from data (Hoogeboom et al., 2022; Vignac et al., 2022). Here, the design objectives are implicitly encoded in the training data set. While these methods may not outperform direct optimization on isolated metrics, they are well suited for the multifaceted nature of drug design as they learn "what a drug looks like" in a more general way.
Once trained on sufficient high-quality data, these models can capture a more holistic picture of the molecular space compared to models optimized for a limited set of target metrics. The strength of generative modeling lies in its ability to reproduce patterns seen in the training data.