MEDMAX: Mixed-Modal Instruction Tuning for Training Biomedical Assistants

Jun-20-2026, 11:13:34 GMT–Neural Information Processing Systems

Recent advancements in mixed-modal generative models have opened new avenues for developing unified biomedical assistants capable of analyzing biomedical images, answering complex questions about them, and generating multimodal patient reports. However, existing datasets face challenges such as small sizes, limited coverage of biomedical tasks and domains, and a reliance on narrow sources. To address these gaps, we present MEDMAX, a large-scale multimodal biomedical instruction-tuning dataset for mixed-modal foundation models. With 1.47 million instances, MEDMAX encompasses a diverse range of tasks, including interleaved image-text generation, biomedical image captioning and generation, visual chat, and report understanding. These tasks span knowledge across diverse biomedical domains, including radiology and histopathology, grounded in medical papers and YouTube videos.

large language model, machine learning, natural language


Country:
- North America > United States > California (0.28)

Genre:
- Instructional Material (0.68)
- Overview (0.67)
- Research Report
  - New Finding (1.00)
  - Experimental Study (0.67)

Industry:
- Health & Medicine
  - Nuclear Medicine (1.00)
  - Diagnostic Medicine > Imaging (1.00)
  - Pharmaceuticals & Biotechnology (0.93)
  - Therapeutic Area
    - Oncology (1.00)
    - Neurology (0.93)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Representation & Reasoning (0.93)
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (1.00)
