TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets
Chen, Jintai, Hu, Yaojun, Wang, Yue, Lu, Yingzhou, Cao, Xu, Lin, Miao, Xu, Hongxia, Wu, Jian, Xiao, Cao, Sun, Jimeng, Glass, Lucas, Huang, Kexin, Zitnik, Marinka, Fu, Tianfan
–arXiv.org Artificial Intelligence
Clinical trials are pivotal for developing new medical treatments, yet they carry risks such as patient mortality, adverse events, and enrollment failure that can waste immense effort spanning more than a decade. Applying artificial intelligence (AI) to forecast or simulate key events in clinical trials holds great potential for providing insights to guide trial design. However, the need for complex data collection and for question definitions that require medical expertise and a deep understanding of trial design has so far hindered the involvement of AI. This paper tackles these challenges by presenting a comprehensive suite of meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecules, disease codes, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design: prediction of trial duration, patient dropout rate, serious adverse events, mortality rate, trial approval outcome, and trial failure reason, as well as drug dose finding and the design of eligibility criteria. Furthermore, we provide basic validation methods for each task to ensure the datasets' usability and reliability. We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design, ultimately advancing clinical trial research and accelerating the development of medical solutions.
Jun-30-2024
- Country:
  - North America > United States > California > San Francisco County > San Francisco (0.14)
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (1.00)
- Industry:
- Technology:
  - Information Technology
    - Artificial Intelligence
      - Applied AI (1.00)
      - Machine Learning
        - Neural Networks > Deep Learning (1.00)
        - Performance Analysis > Accuracy (1.00)
        - Statistical Learning (0.93)
      - Natural Language > Large Language Model (0.93)
      - Representation & Reasoning (0.93)
    - Data Science > Data Mining (1.00)