Lin, Miao
TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets
Chen, Jintai, Hu, Yaojun, Wang, Yue, Lu, Yingzhou, Cao, Xu, Lin, Miao, Xu, Hongxia, Wu, Jian, Xiao, Cao, Sun, Jimeng, Glass, Lucas, Huang, Kexin, Zitnik, Marinka, Fu, Tianfan
Clinical trials are pivotal for developing new medical treatments, yet they typically pose some risks such as patient mortality, adverse events, and enrollment failure that waste immense efforts spanning over a decade. Applying artificial intelligence (AI) to forecast or simulate key events in clinical trials holds great potential for providing insights to guide trial designs. However, complex data collection and question definition requiring medical expertise and a deep understanding of trial designs have hindered the involvement of AI thus far. This paper tackles these challenges by presenting a comprehensive suite of meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design, encompassing prediction of trial duration, patient dropout rate, serious adverse event, mortality rate, trial approval outcome, trial failure reason, drug dose finding, design of eligibility criteria. Furthermore, we provide basic validation methods for each task to ensure the datasets' usability and reliability. We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design, ultimately advancing clinical trial research and accelerating medical solution development.
Cold-Start Heterogeneous-Device Wireless Localization
Zheng, Vincent W. (Advanced Digital Sciences Center) | Cao, Hong (McLaren Applied Technologies APAC) | Gao, Shenghua (ShanghaiTech University) | Adhikari, Aditi (Advanced Digital Sciences Center) | Lin, Miao (Institute for Infocomm Research, A*STAR) | Chang, Kevin Chen-Chuan (University of Illinois at Urbana-Champaign)
In this paper, we study a cold-start heterogeneous-device localization problem. This problem is challenging, because it results in an extreme inductive transfer learning setting, where there is only source domain data but no target domain data. This problem is also underexplored. As there is no target domain data for calibration, we aim to learn a robust feature representation only from the source domain. There is little previous work on such a robust feature learning task; besides, the existing robust feature representation proposals are both heuristic and inexpressive. As our contribution, we for the first time provide a principled and expressive robust feature representation to solve the challenging cold-start heterogeneous-device localization problem. We evaluate our model on two public real-world data sets, and show that it significantly outperforms the best baseline by 23.1%–91.3% across four pairs of heterogeneous devices.