Enhancing Data Quality in Federated Fine-Tuning of Foundation Models
Zhao, Wanru, Du, Yaxin, Lane, Nicholas Donald, Chen, Siheng, Wang, Yanfeng
arXiv.org Artificial Intelligence
The PubMedQA task asks a model to answer research questions with one of three responses (yes, no, or maybe), which effectively makes it a multiple-choice task. The dataset is divided into three subsets: 1,000 manually labeled question-answer pairs (PQA-L), 61,200 unlabeled pairs (PQA-U), and 211,300 artificially generated pairs (PQA-A). Consistent with previous studies (Diao et al., 2023; Singhal et al., 2023), we use the PQA-L subset as the test set for evaluating the model's performance.

USMLE (Jin et al., 2021) consists of multiple-choice questions, with 4 choices per question, based on the United States Medical Licensing Exams. The dataset was compiled from questions used in professional medical board examinations and is multilingual: it contains 12,724 questions in English, 34,251 in Simplified Chinese, and 14,123 in Traditional Chinese. We use only the English portion, which is split into 10,178 questions for training, 1,273 for validation, and 1,273 for testing, following the official split of the dataset.
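To make the evaluation setup concrete, the following is a minimal sketch of how a yes/no/maybe task can be scored as multiple choice, together with a sanity check on the English USMLE split sizes quoted in the text. The helper names (`normalize_answer`, `accuracy`) and the normalization rule are illustrative assumptions and are not taken from the paper.

```python
# Split sizes for the English USMLE subset, as quoted in the text.
USMLE_EN_SPLITS = {"train": 10_178, "validation": 1_273, "test": 1_273}
USMLE_EN_TOTAL = 12_724

# The three answer categories used by PubMedQA.
PUBMEDQA_LABELS = ("yes", "no", "maybe")


def normalize_answer(text):
    """Map a raw model output to one of the three labels, or None.

    Assumption: the model answers with a single word, possibly
    capitalized or followed by a period.
    """
    cleaned = text.strip().lower().rstrip(".")
    return cleaned if cleaned in PUBMEDQA_LABELS else None


def accuracy(predictions, references):
    """Fraction of predictions whose normalized label matches the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(
        normalize_answer(pred) == ref
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


# The quoted split sizes add up to the quoted English total.
splits_consistent = sum(USMLE_EN_SPLITS.values()) == USMLE_EN_TOTAL

# Toy example with hypothetical model outputs (not real results).
toy_score = accuracy(["Yes.", "no", "unsure"], ["yes", "no", "maybe"])
```

In this sketch an output that cannot be mapped to a label counts as wrong, which is one common convention; an actual evaluation harness may handle unparseable outputs differently.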
Mar-7-2024
- Country:
- North America > United States (0.88)
- Genre:
- Research Report > New Finding (0.48)
- Industry:
- Education (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Information Technology > Security & Privacy (0.93)
- Technology: