Mehandru, Nikita
BioAgents: Democratizing Bioinformatics Analysis with Multi-Agent Systems
Mehandru, Nikita, Hall, Amanda K., Melnichenko, Olesya, Dubinina, Yulia, Tsirulnikov, Daniel, Bamman, David, Alaa, Ahmed, Saponas, Scott, Malladi, Venkat S.
Creating end-to-end bioinformatics workflows requires diverse domain expertise, which poses challenges for both junior and senior researchers, as it demands a deep understanding of genomics concepts as well as computational techniques. While large language models (LLMs) provide some assistance, they often fall short of providing the nuanced guidance needed to execute complex bioinformatics tasks, and they require expensive computing resources to achieve high performance. We thus propose a multi-agent system built on small language models, fine-tuned on bioinformatics data, and enhanced with retrieval-augmented generation (RAG). Our system, BioAgents, enables local operation and personalization using proprietary data. We observe performance comparable to human experts on conceptual genomics tasks, and we suggest next steps to enhance code generation capabilities.

Large language models (LLMs) have been applied to various domain-specific contexts, including scientific discovery in medicine [45, 49, 56], chemistry [6, 7], and biotechnology [31]. Recent advances in LLMs have spurred their use in bioinformatics [13], a field encompassing data-intensive tasks such as genome sequencing, protein structure prediction, and pathway analysis. One of the most significant applications has been AlphaFold3, which uses a transformer architecture with triangular attention to predict a protein's three-dimensional (3-D) structure from amino acid sequences [2].
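The retrieval-augmented generation step described in the abstract can be sketched in a few lines. This is a minimal, illustrative toy only: the bag-of-words "embedding", the two-document corpus, and the function names are all assumptions for demonstration, not BioAgents' actual retriever, models, or data.

```python
# Toy sketch of retrieval-augmented generation (RAG): retrieve the most
# relevant document for a query, then prepend it as context to the prompt
# that would be sent to a (local) small language model.
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words 'embedding' (stands in for a real encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    """Return the top-k corpus documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, corpus):
    """Prepend retrieved context to the query before model generation."""
    context = "\n".join(retrieve(query, corpus, k=1))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical two-document bioinformatics corpus for illustration.
corpus = [
    "FASTQ files store raw sequencing reads with per-base quality scores.",
    "BAM files store aligned reads in a compressed binary format.",
]
prompt = build_prompt("Which format stores raw sequencing reads?", corpus)
```

In a real system the toy encoder would be replaced by a dense embedding model and a vector index, and the assembled prompt would be passed to the fine-tuned small language model running locally.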
Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium
Jeong, Hyewon, Jabbour, Sarah, Yang, Yuzhe, Thapta, Rahul, Mozannar, Hussein, Han, William Jongwon, Mehandru, Nikita, Wornow, Michael, Lialin, Vladislav, Liu, Xin, Lozano, Alejandro, Zhu, Jiacheng, Kocielnik, Rafal Dariusz, Harrigian, Keith, Zhang, Haoran, Lee, Edward, Vukadinovic, Milos, Balagopalan, Aparna, Jeanselme, Vincent, Matton, Katherine, Demirel, Ilker, Fries, Jason, Rashidi, Parisa, Beaulieu-Jones, Brett, Xu, Xuhai Orson, McDermott, Matthew, Naumann, Tristan, Agrawal, Monica, Zitnik, Marinka, Ustun, Berk, Choi, Edward, Yeom, Kristen, Gursoy, Gamze, Ghassemi, Marzyeh, Pierson, Emma, Chen, George, Kanjilal, Sanjat, Oberst, Michael, Zhang, Linying, Singh, Harvineet, Hartvigsen, Tom, Zhou, Helen, Okolo, Chinasa T.
The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. Encouraged by the successful virtual roundtables in the previous year, we organized eleven in-person roundtables and four virtual roundtables at ML4H 2023. The organization of the research roundtables at the conference involved 17 Senior Chairs and 19 Junior Chairs across 11 tables. Each roundtable session included invited senior chairs (with substantial experience in the field), junior chairs (responsible for facilitating the discussion), and attendees from diverse backgrounds with interest in the session's topic. Herein we detail the organization process and compile takeaways from these roundtable discussions, including recent advances, applications, and open challenges for each topic. We conclude with a summary and lessons learned across all roundtables. This document serves as a comprehensive review paper, summarizing recent advancements in machine learning for healthcare as contributed by foremost researchers in the field.
Physician Detection of Clinical Harm in Machine Translation: Quality Estimation Aids in Reliance and Backtranslation Identifies Critical Errors
Mehandru, Nikita, Agrawal, Sweta, Xiao, Yimin, Khoong, Elaine C, Gao, Ge, Carpuat, Marine, Salehi, Niloufar
A major challenge in the practical use of Machine Translation (MT) is that users lack guidance to make informed decisions about when to rely on outputs. Progress in quality estimation research provides techniques to automatically assess MT quality, but these techniques have primarily been evaluated in vitro by comparison against human judgments outside of a specific context of use. This paper evaluates quality estimation feedback in vivo with a human study simulating decision-making in high-stakes medical settings. Using Emergency Department discharge instructions, we study how interventions based on quality estimation versus backtranslation assist physicians in deciding whether to show MT outputs to a patient. We find that quality estimation improves appropriate reliance on MT, but backtranslation helps physicians detect more clinically harmful errors that QE alone often misses.
Large Language Models as Agents in the Clinic
Mehandru, Nikita, Miao, Brenda Y., Almaraz, Eduardo Rodriguez, Sushil, Madhumita, Butte, Atul J., Alaa, Ahmed
Recent developments in large language models (LLMs) have unlocked new opportunities for healthcare, from information synthesis to clinical decision support. These new LLMs are not just capable of modeling language, but can also act as intelligent "agents" that interact with stakeholders in open-ended conversations and even influence clinical decision-making. Rather than relying on benchmarks that measure a model's ability to process clinical data or answer standardized test questions, LLM agents should be assessed for their performance on real-world clinical tasks. These new evaluation frameworks, which we call "Artificial-intelligence Structured Clinical Examinations" ("AI-SCI"), can draw from comparable technologies in which machines operate with varying degrees of self-governance, such as self-driving cars. High-fidelity simulations may also be used to evaluate interactions between users and LLMs within a clinical workflow, or to model the dynamic interactions of multiple LLMs. Developing these robust, real-world clinical evaluations will be crucial to deploying LLM agents in healthcare.