clinical trial data
SynRL: Aligning Synthetic Clinical Trial Data with Human-preferred Clinical Endpoints Using Reinforcement Learning
Das, Trisha, Wang, Zifeng, Shafquat, Afrah, Beigi, Mandis, Mezey, Jason, Sun, Jimeng
Each year, hundreds of clinical trials are conducted to evaluate new medical interventions, but sharing patient records from these trials with other institutions can be challenging due to privacy concerns and federal regulations. To help mitigate privacy concerns, researchers have proposed methods for generating synthetic patient data. However, existing approaches for generating synthetic clinical trial data disregard the usage requirements of these data, including maintaining specific properties of clinical outcomes, and only use post hoc assessments that are not coupled with the data generation process. In this paper, we propose SynRL, which leverages reinforcement learning to improve the performance of patient data generators by customizing the generated data to meet user-specified requirements for synthetic data outcomes and endpoints. Our method includes a data value critic function to evaluate the quality of the generated data and uses reinforcement learning to align the data generator with the users' needs based on the critic's feedback. We performed experiments on four clinical trial datasets and demonstrated the advantages of SynRL in improving the quality of the generated synthetic data while keeping the privacy risks low. We also show that SynRL can be utilized as a general framework that can customize data generation of multiple types of synthetic data generators. Our code is available at https://anonymous.4open.science/r/SynRL-DB0F/.
- North America > United States > Illinois (0.04)
- North America > United States > New York (0.04)
- Europe > France (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
TrialSynth: Generation of Synthetic Sequential Clinical Trial Data
Gao, Chufan, Beigi, Mandis, Shafquat, Afrah, Aptekar, Jacob, Sun, Jimeng
Analyzing data from past clinical trials is part of the ongoing effort to optimize the design, implementation, and execution of new clinical trials and more efficiently bring life-saving interventions to market. While there have been recent advances in the generation of static context synthetic clinical trial data, due to both limited patient availability and constraints imposed by patient privacy needs, the generation of fine-grained synthetic time-sequential clinical trial data has been challenging. Given that patient trajectories over an entire clinical trial are of high importance for optimizing trial design and efforts to prevent harmful adverse events, there is a significant need for the generation of high-fidelity time-sequence clinical trial data. Here we introduce TrialSynth, a Variational Autoencoder (VAE) designed to address the specific challenges of generating synthetic time-sequence clinical trial data. Distinct from related clinical data VAE methods, the core of our method leverages Hawkes Processes (HP), which are particularly well-suited for modeling event-type and time gap prediction needed to capture the structure of sequential clinical trial data. Our experiments demonstrate that TrialSynth surpasses the performance of other comparable methods that can generate sequential clinical trial data, both in fidelity and in enabling the generation of highly accurate event sequences across multiple real-world sequential event datasets with small patient source populations when using minimal external information. Notably, our empirical findings highlight that TrialSynth not only outperforms existing clinical sequence-generating methods but also produces data with superior utility while empirically preserving patient privacy.
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > Virginia (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
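To make the Hawkes process mentioned in the TrialSynth abstract concrete, here is a minimal sketch of the conditional intensity of a univariate Hawkes process. The exponential kernel, the function name `hawkes_intensity`, and the parameter values `mu`, `alpha`, and `beta` are illustrative assumptions; the abstract does not specify the kernel or parameterization TrialSynth uses.

```python
import math


def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.0):
    """Conditional intensity of a univariate Hawkes process at time t.

    Assumed form (exponential kernel, not taken from the paper):
        lambda(t) = mu + sum over past events t_i < t of alpha * exp(-beta * (t - t_i))

    mu    : baseline event rate
    alpha : jump in intensity caused by each past event
    beta  : decay rate of that excitation
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in history if t_i < t
    )
    return mu + excitation


# Hypothetical event times (for example, days on which an adverse event was logged).
events = [1.0, 1.5, 4.0]
for t in (0.5, 1.6, 3.9, 4.1):
    print(f"t={t:.1f}  intensity={hawkes_intensity(t, events):.4f}")
```

Each past event temporarily raises the rate of further events, and the effect decays over time. This self-exciting behavior is why Hawkes processes are commonly used for clustered event sequences with irregular time gaps.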
SemEval-2023 Task 7: Multi-Evidence Natural Language Inference for Clinical Trial Data
Jullien, Maël, Valentino, Marco, Frost, Hannah, O'Regan, Paul, Landers, Donal, Freitas, André
This paper describes the results of SemEval 2023 task 7 -- Multi-Evidence Natural Language Inference for Clinical Trial Data (NLI4CT) -- consisting of 2 tasks, a Natural Language Inference (NLI) task, and an evidence selection task on clinical trial data. The proposed challenges require multi-hop biomedical and numerical reasoning, which are of significant importance to the development of systems capable of large-scale interpretation and retrieval of medical evidence, to provide personalized evidence-based care. Task 1, the entailment task, received 643 submissions from 40 participants, and Task 2, the evidence selection task, received 364 submissions from 23 participants. The tasks are challenging, with the majority of submitted systems failing to significantly outperform the majority class baseline on the entailment task, and we observe significantly better performance on the evidence selection task than on the entailment task. Increasing the number of model parameters leads to a direct increase in performance, far more significant than the effect of biomedical pre-training. Future work could explore the limitations of large models for generalization and numerical inference, and investigate methods to augment clinical datasets to allow for more rigorous testing and to facilitate fine-tuning. We envisage that the dataset, models, and results of this task will be useful to the biomedical NLI and evidence retrieval communities. The dataset, competition leaderboard, and website are publicly available.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario > Toronto (0.07)
- North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
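The majority class baseline referred to in the NLI4CT abstract can be sketched as follows. The function name and the toy labels are hypothetical and are not drawn from the task data; the sketch only shows how such a baseline is usually computed.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent training label."""
    majority_label = Counter(train_labels).most_common(1)[0][0]
    correct = sum(1 for y in test_labels if y == majority_label)
    return correct / len(test_labels)


# Hypothetical toy labels for a two-class entailment task.
train = ["Entailment", "Entailment", "Contradiction"]
test = ["Entailment", "Contradiction", "Contradiction", "Entailment"]
print(majority_baseline_accuracy(train, test))
```

On a balanced two-class test set this baseline scores about 0.5, so a system has to exceed that figure by a clear margin before it can be said to perform real inference.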
TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of Clinical Trials
Yacoumatos, Christopher, Bragaglia, Stefano, Kanakia, Anshul, Svangård, Nils, Mangion, Jonathan, Donoghue, Claire, Weatherall, Jim, Khan, Faisal M., Shameer, Khader
A major impediment to successful drug development is the complexity, cost, and scale of clinical trials. The detailed internal structure of clinical trial data can make conventional optimization difficult to achieve. Recent advances in machine learning, specifically graph-structured data analysis, have the potential to enable significant progress in improving clinical trial design. TrialGraph seeks to apply these methodologies to produce a proof-of-concept framework for developing models which can aid drug development and benefit patients. In this work, we first introduce a curated clinical trial data set compiled from the CT.gov, AACT and TrialTrove databases (n=1191 trials; representing one million patients) and describe the conversion of this data to graph-structured formats. We then detail the mathematical basis and implementation of a selection of graph machine learning algorithms, which typically use standard machine classifiers on graph data embedded in a low-dimensional feature space. We trained these models to predict side effect information for a clinical trial given information on the disease, existing medical conditions, and treatment. The MetaPath2Vec algorithm performed exceptionally well, with standard Logistic Regression, Decision Tree, Random Forest, Support Vector, and Neural Network classifiers exhibiting typical ROC-AUC scores of 0.85, 0.68, 0.86, 0.80, and 0.77, respectively. Remarkably, the best performing classifiers could only produce typical ROC-AUC scores of 0.70 when trained on equivalent array-structured data. Our work demonstrates that graph modelling can significantly improve prediction accuracy on appropriate datasets. Successive versions of the project that refine modelling assumptions and incorporate more data types can produce excellent predictors with real-world applications in drug development.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
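The ROC-AUC scores quoted in the TrialGraph abstract can be read as the probability that a classifier ranks a randomly chosen positive example above a randomly chosen negative one. The sketch below computes that quantity directly from pairwise comparisons. The function name and the toy labels and scores are illustrative assumptions and have no connection to the TrialGraph data or code.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels : iterable of 0/1 ground-truth labels
    scores : iterable of classifier scores (higher means more likely positive)
    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Under this reading, a score of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for the reported 0.70 to 0.86 range.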
Machine Learning Predicts Outcomes of Phase III Clinical Trials for Prostate Cancer
Currently, precision medicine in real-world clinical practice is mainly associated with treatment based on cancer subtype and genotype. For example, olaparib is a monotherapy for ovarian cancer in women with BRCA1/2 mutations [2]. However, there are still few examples of real-world precision medicine. Current clinical practice still relies heavily on subjective judgment and limited individual patient data [3]. A 'one-drug-fits-all' approach is often used, in which a particular diagnosis leads to a specific type of treatment. Alternatively, trial-and-error practices are common, in which various treatment options are tried in the hope that one will work.
- Health & Medicine > Therapeutic Area > Oncology > Prostate Cancer (0.71)
- Health & Medicine > Therapeutic Area > Oncology > Ovarian Cancer (0.57)
From Buzzword to Clinical Tool: Setting the Record Straight on AI in the Life Sciences
Artificial intelligence (AI) far too often pops up as a term used vaguely to refer to any process that appears to involve more computers than it did twenty years ago. But concrete examples of how this informatics technology can improve fields like life sciences are harder to come by. I recently spoke to Krishnan Nandabalan, founder, CEO and president of InveniAI, which aims to use AI techniques to more quickly identify pharmacological compounds and get drugs to patients faster, to get the full picture. Ruairi Mackenzie (RM): How would you like to set the record straight about AI in the life sciences? Krishnan Nandabalan (KN): AI is used as a buzzword now.
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.48)
Partnering in a digital era
In today's era of personalised medicine, healthcare has evolved from mass treatments, which aren't effective for all patients, to medicines specifically targeted to patient groups based on companion diagnostic tests. Now, with the advent of more sophisticated digital technologies, personalised healthcare is entering a new phase, expanding from companion diagnostics to a more complex, holistic view of patient health generated from a wide variety of data sources. This web of data will require a new ecosystem of partnerships with healthcare and technology companies. "In the future, we will be using data for a variety of patient characteristics to determine the best combination of treatments to improve a person's overall healthcare," says Michele Pedrocchi, Head of Global Strategy and Business Development for Roche Diagnostics. Data about patients and medicines are already streaming in from many sources--in vivo diagnostics, lifestyle sensors, labs, electronic records, clinical trial data, genomic data, physicians, and patients themselves.
5 pillars of AI innovation over the past 40 years
Artificial intelligence came alive in the 1980s with many startups, governments, and large enterprises deploying new systems that executed tasks typically performed by human experts. These were largely rule-based systems that encoded behaviors in rules versus the strict procedural logic of traditional programming languages. Then, as memory became more affordable, systems were able to handle much more computationally intense tasks such as machine learning, planning and scheduling, and natural language understanding. Now in the age of Big Data, many believe AI has completely changed the tech landscape, but in some ways, as the Talking Heads song goes, it's the "same as it ever was". What remains the same are the core elements of an intelligent application.
- North America > United States (0.31)
- Asia > China (0.05)
- Information Technology (1.00)
- Government > Space Agency (0.31)
- Government > Regional Government > North America Government > United States Government (0.31)
A Noise-Filtering Approach for Cancer Drug Sensitivity Prediction
Accurately predicting cancer drug responses is an important open problem that hinders oncologists' efforts to find the most effective drugs to treat cancer, which is a core goal in precision medicine. The scientific community has focused on improving this prediction based on genomic, epigenomic, and proteomic datasets measured in human cancer cell lines. Real-world cancer cell lines contain noise, which degrades the performance of machine learning algorithms. This problem is rarely addressed in the existing approaches. In this paper, we present a noise-filtering approach that integrates techniques from numerical linear algebra and information retrieval targeted at filtering out noisy cancer cell lines. By filtering out noisy cancer cell lines, we can train machine learning algorithms on better quality cancer cell lines. We evaluate the performance of our approach and compare it with an existing approach using the Area Under the ROC Curve (AUC) on clinical trial data. The experimental results show that our proposed approach is stable and also yields the highest AUC at a statistically significant level.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.71)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)