biomedical research
From Task Executors to Research Partners: Evaluating AI Co-Pilots Through Workflow Integration in Biomedical Research
Weidener, Lukas, Brkić, Marko, Bacci, Chiara, Jovanović, Mihailo, Ulgac, Emre, Dobrin, Alex, Weniger, Johannes, Vlas, Martin, Singh, Ritvik, Meduri, Aakaash
Artificial intelligence systems are increasingly deployed in biomedical research. However, current evaluation frameworks may inadequately assess their effectiveness as research collaborators. This rapid review examines benchmarking practices for AI systems in preclinical biomedical research. Three major databases and two preprint servers were searched from January 1, 2018 to October 31, 2025, identifying 14 benchmarks that assess AI capabilities in literature understanding, experimental design, and hypothesis generation. The results revealed that all current benchmarks assess isolated component capabilities, including data analysis quality, hypothesis validity, and experimental protocol design. However, authentic research collaboration requires integrated workflows spanning multiple sessions, with contextual memory, adaptive dialogue, and constraint propagation. This gap implies that systems excelling on component benchmarks may fail as practical research co-pilots. A process-oriented evaluation framework is proposed that addresses four critical dimensions absent from current benchmarks: dialogue quality, workflow orchestration, session continuity, and researcher experience. These dimensions are essential for evaluating AI systems as research co-pilots rather than as isolated task executors.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Colorado (0.04)
- (4 more...)
- Workflow (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (6 more...)
BMDetect: A Multimodal Deep Learning Framework for Comprehensive Biomedical Misconduct Detection
Zhou, Yize, Zhang, Jie, Wang, Meijie, Yu, Lun
Academic misconduct detection in biomedical research remains challenging due to algorithmic narrowness in existing methods and fragmented analytical pipelines. We present BMDetect, a multimodal deep learning framework that integrates journal metadata (SJR, institutional data), semantic embeddings (PubMedBERT), and GPT-4o-mined textual attributes (methodological statistics, data anomalies) for holistic manuscript evaluation. Key innovations include: (1) multimodal fusion of domain-specific features to reduce detection bias; (2) quantitative evaluation of feature importance, identifying journal authority metrics (e.g., SJR-index) and textual anomalies (e.g., statistical outliers) as dominant predictors; and (3) the BioMCD dataset, a large-scale benchmark with 13,160 retracted articles and 53,411 controls. BMDetect achieves 74.33% AUC, outperforming single-modality baselines by 8.6%, and demonstrates transferability across biomedical subfields. This work advances scalable, interpretable tools for safeguarding research integrity.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Japan (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Education (1.00)
- Health & Medicine > Therapeutic Area (0.68)
STELLA: Self-Evolving LLM Agent for Biomedical Research
Jin, Ruofan, Zhang, Zaixi, Wang, Mengdi, Cong, Le
Modern biomedical research is defined by both immense opportunity and staggering complexity. As a cornerstone of science, it generates vast quantities of data from large-scale experiments, but this progress is hampered by a research landscape that is profoundly fragmented (1-3). The knowledge, specialized software, and databases required to make discoveries are numerous, constantly evolving, and dispersed, forcing researchers to expend significant time and effort on the manual and labor-intensive task of discovering, learning, and integrating these disparate resources. While the advent of AI agents holds the promise of automating this intricate work (4-6), current systems inherit a critical limitation: they typically rely on manually curated, static toolsets (7-14). This approach is inefficient, fails to scale, and cannot keep pace with the rapid evolution of biomedical science, leaving the agents perpetually behind the cutting edge. This raises a critical question: Can we design a self-evolving agent that transcends these limitations by automatically discovering and integrating new tools, continuously updating its knowledge base, and iteratively upgrading its own capabilities through direct experience? Here we present STELLA, a generalist biomedical AI agent designed around the core principle of self-evolution (15). STELLA learns and improves from every problem it solves, continuously enhancing its own reasoning strategies and technical abilities.
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.97)
BioDSA-1K: Benchmarking Data Science Agents for Biomedical Research
Wang, Zifeng, Danek, Benjamin, Sun, Jimeng
Validating scientific hypotheses is a central challenge in biomedical research, and remains difficult for artificial intelligence (AI) agents due to the complexity of real-world data analysis and evidence interpretation. In this work, we present BioDSA-1K, a benchmark designed to evaluate AI agents on realistic, data-driven biomedical hypothesis validation tasks. BioDSA-1K consists of 1,029 hypothesis-centric tasks paired with 1,177 analysis plans, curated from over 300 published biomedical studies to reflect the structure and reasoning found in authentic research workflows. Each task includes a structured hypothesis derived from the original study's conclusions, expressed in the affirmative to reflect the language of scientific reporting, and one or more pieces of supporting evidence grounded in empirical data tables. While these hypotheses mirror published claims, they remain testable using standard statistical or machine learning methods. The benchmark enables evaluation along four axes: (1) hypothesis decision accuracy, (2) alignment between evidence and conclusion, (3) correctness of the reasoning process, and (4) executability of the AI-generated analysis code. Importantly, BioDSA-1K includes non-verifiable hypotheses: cases where the available data are insufficient to support or refute a claim, reflecting a common yet underexplored scenario in real-world science. We propose BioDSA-1K as a foundation for building and evaluating generalizable, trustworthy AI agents for biomedical discovery.
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- Asia > China (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Object Tracking in a $360^o$ View: A Novel Perspective on Bridging the Gap to Biomedical Advancements
Fazli, Mojtaba S., Quinn, Shannon
Object tracking is a fundamental tool in modern innovation, with applications in defense systems, autonomous vehicles, and biomedical research. It enables precise identification, monitoring, and spatiotemporal analysis of objects across sequential frames, providing insights into dynamic behaviors. In cell biology, object tracking is vital for uncovering cellular mechanisms, such as migration, interactions, and responses to drugs or pathogens. These insights drive breakthroughs in understanding disease progression and therapeutic interventions. Over time, object tracking methods have evolved from traditional feature-based approaches to advanced machine learning and deep learning frameworks. While classical methods are reliable in controlled settings, they struggle in complex environments with occlusions, variable lighting, and high object density. Deep learning models address these challenges by delivering greater accuracy, adaptability, and robustness. This review categorizes object tracking techniques into traditional, statistical, feature-based, and machine learning paradigms, with a focus on biomedical applications. These methods are essential for tracking cells and subcellular structures, advancing our understanding of health and disease. Key performance metrics, including accuracy, efficiency, and adaptability, are discussed. The paper explores limitations of current methods and highlights emerging trends to guide the development of next-generation tracking systems for biomedical research and broader scientific domains.
- North America > United States > Georgia > Clarke County > Athens (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.13)
- North America > United States > California > San Francisco County > San Francisco (0.13)
- (12 more...)
- Workflow (1.00)
- Research Report > Promising Solution (1.00)
- Overview (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
Detection of fields of applications in biomedical abstracts with the support of argumentation elements
Focusing on particular facts, instead of the complete text, can potentially improve searching for specific information in the scientific literature. In particular, argumentative elements allow focusing on specific parts of a publication, e.g., the background section or the claims from the authors. We evaluated some tools for the extraction of argumentation elements for a specific task in biomedicine, namely, for detecting the fields of the application in a biomedical publication, e.g, whether it addresses the problem of disease diagnosis or drug development. We performed experiments with the PubMedBERT pre-trained model, which was fine-tuned on a specific corpus for the task. We compared the use of title and abstract to restricting to only some argumentative elements. The top F1 scores ranged from 0.22 to 0.84, depending on the field of application. The best argumentative labels were the ones related the conclusion and background sections of an abstract.
- Europe > Romania (0.05)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (4 more...)
DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research
Chuang, Yu-Neng, Wang, Guanchu, Chang, Chia-Yuan, Lai, Kwei-Herng, Zha, Daochen, Tang, Ruixiang, Yang, Fan, Reyes, Alfredo Costilla, Zhou, Kaixiong, Jiang, Xiaoqian, Hu, Xia
The exponential growth in scholarly publications necessitates advanced tools for efficient article retrieval, especially in interdisciplinary fields where diverse terminologies are used to describe similar research. Traditional keyword-based search engines often fall short in assisting users who may not be familiar with specific terminologies. To address this, we present a knowledge graph-based paper search engine for biomedical research to enhance the user experience in discovering relevant queries and articles. The system, dubbed DiscoverPath, employs Named Entity Recognition (NER) and part-of-speech (POS) tagging to extract terminologies and relationships from article abstracts to create a KG. To reduce information overload, DiscoverPath presents users with a focused subgraph containing the queried entity and its neighboring nodes and incorporates a query recommendation system, enabling users to iteratively refine their queries. The system is equipped with an accessible Graphical User Interface that provides an intuitive visualization of the KG, query recommendations, and detailed article information, enabling efficient article retrieval, thus fostering interdisciplinary knowledge exploration. DiscoverPath is open-sourced at https://github.com/ynchuang/DiscoverPath.
Balancing Privacy and Progress in Artificial Intelligence: Anonymization in Histopathology for Biomedical Research and Education
Kanwal, Neel, Janssen, Emiel A. M., Engan, Kjersti
The advancement of biomedical research heavily relies on access to large amounts of medical data. In the case of histopathology, Whole Slide Images (WSI) and clinicopathological information are valuable for developing Artificial Intelligence (AI) algorithms for Digital Pathology (DP). Transferring medical data "as open as possible" enhances the usability of the data for secondary purposes but poses a risk to patient privacy. At the same time, existing regulations push towards keeping medical data "as closed as necessary" to avoid re-identification risks. Generally, these legal regulations require the removal of sensitive data but do not consider the possibility of data linkage attacks due to modern image-matching algorithms. In addition, the lack of standardization in DP makes it harder to establish a single solution for all formats of WSIs. These challenges raise problems for bio-informatics researchers in balancing privacy and progress while developing AI algorithms. This paper explores the legal regulations and terminologies for medical data-sharing. We review existing approaches and highlight challenges from the histopathological perspective. We also present a data-sharing guideline for histological data to foster multidisciplinary research and education.
- Europe > Norway > Western Norway > Rogaland > Stavanger (0.05)
- North America > United States > Washington (0.04)
- North America > United States > New York (0.04)
- (4 more...)
- Law > Statutes (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Health Care Providers & Services (0.70)
Interpretability from a new lens: Integrating Stratification and Domain knowledge for Biomedical Applications
Onoja, Anthony, Raimondi, Francesco
The use of machine learning (ML) techniques in the biomedical field has become increasingly important, particularly with the large amounts of data generated by the aftermath of the COVID-19 pandemic. However, due to the complex nature of biomedical datasets and the use of black-box ML models, a lack of trust and adoption by domain experts can arise. In response, interpretable ML (IML) approaches have been developed, but the curse of dimensionality in biomedical datasets can lead to model instability. This paper proposes a novel computational strategy for the stratification of biomedical problem datasets into k-fold cross-validation (CVs) and integrating domain knowledge interpretation techniques embedded into the current state-of-the-art IML frameworks. This approach can improve model stability, establish trust, and provide explanations for outcomes generated by trained IML models. Specifically, the model outcome, such as aggregated feature weight importance, can be linked to further domain knowledge interpretations using techniques like pathway functional enrichment, drug targeting, and repurposing databases. Additionally, involving end-users and clinicians in focus group discussions before and after the choice of IML framework can help guide testable hypotheses, improve performance metrics, and build trustworthy and usable IML solutions in the biomedical field. Overall, this study highlights the potential of combining advanced computational techniques with domain knowledge interpretation to enhance the effectiveness of IML solutions in the context of complex biomedical datasets.
- Asia > Japan > Kyūshū & Okinawa > Kyūshū > Kumamoto Prefecture > Kumamoto (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Surrey (0.04)
- (3 more...)
Applied Machine Learning Engineer
Genomics England partners with the NHS to provide whole genome sequencing diagnostics. We also equip researchers to find the causes of disease and develop new treatments – with patients and participants at the heart of it all. Our mission is to continue refining, scaling, and evolving our ability to enable others to deliver genomic healthcare and conduct genomic research. We are accelerating our impact and working with patients, doctors, scientists, government and industry to improve genomic testing, and help researchers access the health data and technology they need to make new medical discoveries and create more effective, targeted medicines for everybody. Our Applied Machine Learning Engineers solve challenging predictive modelling problems for improving health care and enabling biomedical research.