AITopics | patient note

Collaborating Authors

patient note

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

99e81750f3fdfcaf9613db2dbf4bd623-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsFeb-16-2026, 21:44:18 GMT

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Virginia (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (0.93)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Supplementary Materials for M

Neural Information Processing SystemsOct-10-2025, 10:52:08 GMT

The attribute extractions of these notes were then verified for by authors of this paper.

calculator, dataset, patient note, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

99e81750f3fdfcaf9613db2dbf4bd623-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsOct-10-2025, 10:52:05 GMT

arxiv preprint arxiv, calculator, llm, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Virginia (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area > Hematology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

The Aloe Family Recipe for Open and Specialized Healthcare LLMs

Garcia-Gasulla, Dario, Bayarri-Planas, Jordi, Gururajan, Ashwin Kumar, Lopez-Cuena, Enrique, Tormos, Adrian, Hinjos, Daniel, Bernabeu-Perez, Pablo, Arias-Duart, Anna, Martin-Torres, Pablo Agustin, Gonzalez-Mallo, Marta, Alvarez-Napagao, Sergio, Ayguadé-Parra, Eduard, Cortés, Ulises

arXiv.org Artificial IntelligenceMay-30-2025

Purpose: With advancements in Large Language Models (LLMs) for healthcare, the need arises for competitive open-source models to protect the public interest. This work contributes to the field of open medical LLMs by optimizing key stages of data preprocessing and training, while showing how to improve model safety (through DPO) and efficacy (through RAG). The evaluation methodology used, which includes four different types of tests, defines a new standard for the field. The resultant models, shown to be competitive with the best private alternatives, are released with a permisive license. Methods: Building on top of strong base models like Llama 3.1 and Qwen 2.5, Aloe Beta uses a custom dataset to enhance public data with synthetic Chain of Thought examples. The models undergo alignment with Direct Preference Optimization, emphasizing ethical and policy-aligned performance in the presence of jailbreaking attacks. Evaluation includes close-ended, open-ended, safety and human assessments, to maximize the reliability of results. Results: Recommendations are made across the entire pipeline, backed by the solid performance of the Aloe Family. These models deliver competitive performance across healthcare benchmarks and medical fields, and are often preferred by healthcare professionals. On bias and toxicity, the Aloe Beta models significantly improve safety, showing resilience to unseen jailbreaking attacks. For a responsible release, a detailed risk assessment specific to healthcare is attached to the Aloe Family models. Conclusion: The Aloe Beta models, and the recipe that leads to them, are a significant contribution to the open-source medical LLM field, offering top-of-the-line performance while maintaining high ethical requirements. This work sets a new standard for developing and reporting aligned LLMs in healthcare.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.04388

Country:

Europe (1.00)
North America > United States (0.92)

Genre:

Workflow (1.00)
Research Report > Experimental Study (0.45)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation

Yu, Guangya, Li, Yanhao, Jiang, Zongying, Jin, Yuxiong, Dai, Li, Lin, Yupian, Hou, Ruihui, Zhang, Weiyan, Fan, Yongqi, Ye, Qi, Liu, Jingping, Ruan, Tong

arXiv.org Artificial IntelligenceFeb-17-2025

Medical quality control indicators are essential to assess the qualifications of healthcare institutions for medical services. With the impressive performance of large language models (LLMs) like GPT-4 in the medical field, leveraging these technologies for the Medical Quality Control Indicator Calculation (MQCIC) presents a promising approach. In this work, (1) we introduce a real-world task MQCIC and propose an open-source Chinese electronic medical records (EMRs)-based dataset (CMQCIC-Bench) comprising 785 instances and 76 indicators. (2) We propose a semi-automatic method to enhance the rule representation. Then we propose the Clinical Facts-based Inferential Rule (CF-IR) method that disentangles the clinical fact verification and inferential rule reasoning actions. (3) We conduct comprehensive experiments on 20 representative LLMs, covering general and medical models. Our findings reveal that CF-IR outperforms Chain-of-Thought methods in MQCIC tasks. (4) We conduct an error analysis and investigate the capabilities of clinical fact verification and inferential rule reasoning, providing insights to improve performance in the MQCIC further. The dataset and code is available in this repo https://anonymous.4open.science/r/C-MQCIC-1151.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.11703

Country:

Asia (0.68)
North America > United States (0.46)

Genre: Research Report > New Finding (0.87)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Controlled LLM-based Reasoning for Clinical Trial Retrieval

Jullien, Mael, Bogatu, Alex, Unsworth, Harriet, Freitas, Andre

arXiv.org Artificial IntelligenceSep-19-2024

Matching patients to clinical trials demands a systematic and reasoned interpretation of documents which require significant expert-level background knowledge, over a complex set of well-defined eligibility criteria. Moreover, this interpretation process needs to operate at scale, over vast knowledge bases of trials. In this paper, we propose a scalable method that extends the capabilities of LLMs in the direction of systematizing the reasoning over sets of medical eligibility criteria, evaluating it in the context of real-world cases. The proposed method overlays a Set-guided reasoning method for LLMs. The proposed framework is evaluated on TREC 2022 Clinical Trials, achieving results superior to the state-of-the-art: NDCG@10 of 0.693 and Precision@10 of 0.73.

criteria, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2409.18998

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Europe > Switzerland (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Rheumatology (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
(16 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

MedCalc-Bench: Evaluating Large Language Models for Medical Calculations

Khandekar, Nikhil, Jin, Qiao, Xiong, Guangzhi, Dunn, Soren, Applebaum, Serina S, Anwar, Zain, Sarfo-Gyamfi, Maame, Safranek, Conrad W, Anwar, Abid A, Zhang, Andrew, Gilson, Aidan, Singer, Maxwell B, Dave, Amisha, Taylor, Andrew, Zhang, Aidong, Chen, Qingyu, Lu, Zhiyong

arXiv.org Artificial IntelligenceJun-30-2024

As opposed to evaluating computation and logic-based reasoning, current benchmarks for evaluating large language models (LLMs) in medicine are primarily focused on question-answering involving domain knowledge and descriptive reasoning. While such qualitative capabilities are vital to medical diagnosis, in real-world scenarios, doctors frequently use clinical calculators that follow quantitative equations and rule-based reasoning paradigms for evidence-based decision support. To this end, we propose MedCalc-Bench, a first-of-its-kind dataset focused on evaluating the medical calculation capability of LLMs. MedCalc-Bench contains an evaluation set of over 1000 manually reviewed instances from 55 different medical calculation tasks. Each instance in MedCalc-Bench consists of a patient note, a question requesting to compute a specific medical value, a ground truth answer, and a step-by-step explanation showing how the answer is obtained. While our evaluation results show the potential of LLMs in this area, none of them are effective enough for clinical settings. Common issues include extracting the incorrect entities, not using the correct equation or rules for a calculation task, or incorrectly performing the arithmetic for the computation. We hope our study highlights the quantitative knowledge and reasoning gaps in LLMs within medical settings, encouraging future improvements of LLMs for various clinical calculation tasks.

calculator, dataset, patient note, (13 more...)

arXiv.org Artificial Intelligence

2406.12036

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Virginia (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Adversarial Attacks on Large Language Models in Medicine

Yang, Yifan, Jin, Qiao, Huang, Furong, Lu, Zhiyong

arXiv.org Artificial IntelligenceJun-18-2024

The integration of Large Language Models (LLMs) into healthcare applications offers promising advancements in medical diagnostics, treatment recommendations, and patient care. However, the susceptibility of LLMs to adversarial attacks poses a significant threat, potentially leading to harmful outcomes in delicate medical contexts. This study investigates the vulnerability of LLMs to two types of adversarial attacks in three medical tasks. Utilizing real-world patient data, we demonstrate that both open-source and proprietary LLMs are susceptible to manipulation across multiple tasks. This research further reveals that domain-specific tasks demand more adversarial data in model fine-tuning than general domain tasks for effective attack execution, especially for more capable models. We discover that while integrating adversarial data does not markedly degrade overall model performance on medical benchmarks, it does lead to noticeable shifts in fine-tuned model weights, suggesting a potential pathway for detecting and countering model attacks. This research highlights the urgent need for robust security measures and the development of defensive mechanisms to safeguard LLMs in medical applications, to ensure their safe and effective deployment in healthcare settings.

adversarial data, adversarial sample, fine-tuning, (16 more...)

arXiv.org Artificial Intelligence

2406.12259

Country:

North America > United States > Maryland > Prince George's County > College Park (0.04)
North America > United States > Maryland > Montgomery County > Bethesda (0.04)
Asia > Middle East > Israel (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Research Report > New Finding (0.94)
Research Report > Experimental Study (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.76)

Add feedback

Aloe: A Family of Fine-tuned Open Healthcare LLMs

Gururajan, Ashwin Kumar, Lopez-Cuena, Enrique, Bayarri-Planas, Jordi, Tormos, Adrian, Hinjos, Daniel, Bernabeu-Perez, Pablo, Arias-Duart, Anna, Martin-Torres, Pablo Agustin, Urcelay-Ganzabal, Lucia, Gonzalez-Mallo, Marta, Alvarez-Napagao, Sergio, Ayguadé-Parra, Eduard, Garcia-Gasulla, Ulises Cortés Dario

arXiv.org Artificial IntelligenceMay-3-2024

As the capabilities of Large Language Models (LLMs) in healthcare and medicine continue to advance, there is a growing need for competitive open-source models that can safeguard public interest. With the increasing availability of highly competitive open base models, the impact of continued pre-training is increasingly uncertain. In this work, we explore the role of instruct tuning, model merging, alignment, red teaming and advanced inference schemes, as means to improve current open models. To that end, we introduce the Aloe family, a set of open medical LLMs highly competitive within its scale range. Aloe models are trained on the current best base models (Mistral, LLaMA 3), using a new custom dataset which combines public data sources improved with synthetic Chain of Thought (CoT). Aloe models undergo an alignment phase, becoming one of the first few policy-aligned open healthcare LLM using Direct Preference Optimization, setting a new standard for ethical performance in healthcare LLMs. Model evaluation expands to include various bias and toxicity datasets, a dedicated red teaming effort, and a much-needed risk assessment for healthcare LLMs. Finally, to explore the limits of current LLMs in inference, we study several advanced prompt engineering strategies to boost performance across benchmarks, yielding state-of-the-art results for open healthcare 7B LLMs, unprecedented at this scale.

dataset, information, patient note, (15 more...)

arXiv.org Artificial Intelligence

2405.01886

Country: