terminology
Human-in-the-Loop and AI: Crowdsourcing Metadata Vocabulary for Materials Science
Greenberg, Jane, McClellan, Scott, Ireland, Addy, Sammarco, Robert, Gerber, Colton, Rauch, Christopher B., Kelly, Mat, Kunze, John, An, Yuan, Toberer, Eric
Metadata vocabularies are essential for advancing FAIR and FARR data principles, but their development is constrained by limited human resources and inconsistent standardization practices. This paper introduces MatSci-YAMZ, a platform that integrates artificial intelligence (AI) and human-in-the-loop (HILT), including crowdsourcing, to support metadata vocabulary development. The paper reports on a proof-of-concept use case evaluating the AI-HILT model in materials science, a highly interdisciplinary domain. Six (6) participants affiliated with the NSF Institute for Data-Driven Dynamical Design (ID4) engaged with the MatSci-YAMZ platform over several weeks, contributing term definitions and providing examples to prompt refinement of the AI-generated definitions. Nineteen (19) AI-generated definitions were successfully created, with iterative feedback loops demonstrating the feasibility of AI-HILT refinement. Findings confirm the feasibility of the AI-HILT model, highlighting 1) a successful proof of concept, 2) alignment with FAIR and open-science principles, 3) a research protocol to guide future studies, and 4) the potential for scalability across domains. Overall, MatSci-YAMZ's underlying model has the capacity to enhance semantic transparency and reduce the time required for consensus building and metadata vocabulary development.
- Europe > Ireland (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.46)
Scalable Unit Harmonization in Medical Informatics via Bayesian-Optimized Retrieval and Transformer-Based Re-ranking
Objective: To develop and evaluate a scalable methodology for harmonizing inconsistent units in large-scale clinical datasets, addressing a key barrier to data interoperability. Materials and Methods: We designed a novel unit harmonization system combining BM25, sentence embeddings, Bayesian optimization, and a bidirectional transformer-based binary classifier for retrieving and matching laboratory test entries. The system was evaluated using the Optum Clinformatics Datamart dataset (7.5 billion entries). We implemented a multi-stage pipeline: filtering, identification, harmonization proposal generation, automated re-ranking, and manual validation. Performance was assessed using Mean Reciprocal Rank (MRR) and other standard information retrieval metrics. Results: Our hybrid retrieval approach combining BM25 and sentence embeddings (MRR: 0.8833) significantly outperformed both lexical-only (MRR: 0.7985) and embedding-only (MRR: 0.5277) approaches. The transformer-based reranker further improved performance (absolute MRR improvement: 0.10), bringing the final system MRR to 0.9833. The system achieved 83.39% precision at rank 1 and 94.66% recall at rank 5. Discussion: The hybrid architecture effectively leverages the complementary strengths of lexical and semantic approaches. The reranker addresses cases where initial retrieval components make errors due to complex semantic relationships in medical terminology. Conclusion: Our framework provides an efficient, scalable solution for unit harmonization in clinical datasets, reducing manual effort while improving accuracy. Once harmonized, data can be reused seamlessly in different analyses, ensuring consistency across healthcare systems and enabling more reliable multi-institutional studies and meta-analyses.
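The retrieval metrics named in the abstract above (Mean Reciprocal Rank, precision at rank 1, recall at rank k) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the `toy_runs` data, and the unit strings are hypothetical, chosen only to show how the metrics are conventionally computed from ranked candidate lists.

```python
from typing import Hashable, List, Sequence, Set, Tuple

# One "run" per query: (ranked candidates, set of correct answers).
Run = Tuple[Sequence[Hashable], Set[Hashable]]


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant candidate, or 0.0 if none appears."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs: List[Run]) -> float:
    """Average the reciprocal rank over all queries."""
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def precision_at_1(runs: List[Run]) -> float:
    """Fraction of queries whose top-ranked candidate is relevant."""
    hits = sum(1 for ranked, rel in runs if ranked and ranked[0] in rel)
    return hits / len(runs)


def recall_at_k(runs: List[Run], k: int) -> float:
    """Average, over queries, of the share of relevant items found in the top k."""
    total = 0.0
    for ranked, rel in runs:
        found = len(set(ranked[:k]) & rel)
        total += found / len(rel)
    return total / len(runs)


# Hypothetical toy data: four queries with candidate target units (assumed values).
toy_runs: List[Run] = [
    (["mg/dL", "g/L", "mmol/L"], {"mg/dL"}),    # correct answer at rank 1
    (["g/L", "mmol/L", "mg/dL"], {"mmol/L"}),   # correct answer at rank 2
    (["IU/L", "U/L", "kat/L"], {"ng/mL"}),      # correct answer not retrieved
    (["%", "g/dL", "fraction"], {"fraction"}),  # correct answer at rank 3
]

if __name__ == "__main__":
    print("MRR:", mean_reciprocal_rank(toy_runs))
    print("P@1:", precision_at_1(toy_runs))
    print("R@3:", recall_at_k(toy_runs, 3))
```

With a single correct answer per query, as in the toy data, recall at rank k reduces to the fraction of queries whose correct answer appears in the top k.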
- North America > United States > Mississippi > Marion County (0.04)
- North America > United States > Minnesota > Hennepin County > Eden Prairie (0.04)
- Indian Ocean > Red Sea (0.04)
We will fix all minor comments and typos without explicitly addressing them in the rebuttal
We will fix all minor comments and typos without explicitly addressing them in the rebuttal. Practical Impact: Our primary aim in this work is indeed theoretical. Appendix, but we will add a reference to it in the main paper. PAC Terminology: We have assumed that readers will be familiar with standard terminology from PAC learning. Non-trivial Class: The definition of non-trivial class appears just before the statement of Theorem 5 (in lines 182-183).
timely (R2, R3) and important
We thank the reviewers for their helpful comments. Reviewers noted that Grover generates "extremely credible" articles (R2) and that due We appreciate this point and will revisit the word choice. We haven't seen the model We believe that our "novel way to guide generation" makes Grover novel, not just an Indeed, GPT(2), BERT, XLNet, and Grover share the same backbone but learn from different objectives. What is given to the turkers? For overall trustworthiness for instance, we asked "Does the article read like it comes "It takes a thief to catch a thief"?
some of them (like what network structure or loss function tend to cause AVs, and what other new theoretical results
First of all, we would like to thank all reviewers for the insightful comments and suggestions! Optimization landscape analysis is an important research topic in deep learning. What can be explained by AVs but not symmetric valleys (SVs). This seemingly contradictory observation can be well explained by AVs, but not SVs. To be conservative, we used the word "decent probability" in our paper.
have extended our empirical proof of maximal informativeness to k = 15
We thank the reviewers for the thought-provoking questions and helpful comments on improving the manuscript. The LLW hinge loss is calibrated with respect to the 0-1 loss while the WW hinge loss is not. The LLW SVM performs worse for a reason unrelated to calibration. Doğan et al. [2016] on their page 20 gave an explanation for the worse performance of all Hence, the poor performance of LLW is a consequence of using absolute margin. R2, R3 & R4: Why is consistency with respect to the ordered partition loss desirable?
6275d7071d005260ab9d0766d6df1145-AuthorFeedback.pdf
We agree O'Reilly's work is highly relevant and we should have We will also link to the repository containing code for replicating all results. We had investigated several alternatives like this before settling on our metric. We found this normalization would make the 2D-map visualization unintuitive. We will clarify these points in the paper. Second, however, it is well known that CHL [Eq.