Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning (Supplementary Material)

Feb-12-2026, 05:50:44 GMT–Neural Information Processing Systems

We use the open-source mimic-cxr repository to extract the impression and findings sections of each report. Following [9], we keep only sequences of alphanumeric characters in each report, dropping all other characters and symbols, and remove reports that contain fewer than 3 tokens. Following common practice in ViT [5], we split each radiograph into patches of size 16 × 16, which results in 196 visual tokens per image. The instance-level projection layer is a two-layer Multi-Layer Perceptron (MLP) with Batch Normalization [10] and a ReLU activation function. Additionally, we use a frozen Batch Normalization layer after the MLP to obtain instance-level embeddings.
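The report filtering and the patch count described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `tokenize`, `filter_reports`, and `num_patches` are hypothetical, the regular expression is one plausible reading of "sequences of alphanumeric characters", and the 224 × 224 input resolution is an assumption inferred from the stated 196 tokens, since the text does not give the image size.

```python
import re

MIN_TOKENS = 3  # reports with fewer tokens than this are removed, per the text


def tokenize(report: str) -> list[str]:
    """Keep runs of alphanumeric characters; drop all other symbols.

    Assumption: ASCII letters and digits only; the source does not
    specify the exact character class or any case normalization.
    """
    return re.findall(r"[A-Za-z0-9]+", report)


def filter_reports(reports: list[str]) -> list[list[str]]:
    """Tokenize every report and discard those with fewer than MIN_TOKENS tokens."""
    tokenized = (tokenize(report) for report in reports)
    return [tokens for tokens in tokenized if len(tokens) >= MIN_TOKENS]


def num_patches(image_size: int, patch_size: int = 16) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return (image_size // patch_size) ** 2


if __name__ == "__main__":
    reports = ["No change.", "Lungs are clear, no effusion."]
    print(filter_reports(reports))  # the two-token report is dropped
    print(num_patches(224))         # assumed 224 x 224 input
```

Under the assumed 224 × 224 input, a patch size of 16 gives a 14 × 14 grid, which matches the 196 visual tokens stated in the text.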

artificial intelligence, machine learning, supplementary material

Industry:
- Health & Medicine
  - Nuclear Medicine (0.49)
  - Diagnostic Medicine > Imaging (0.49)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
