Perceptrons
Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning (Supplementary Material)
We use the open-source mimic-cxr repository to extract the impression and findings sections of each report. Following [9], we keep only sequences of alphanumeric characters, dropping all other characters and symbols, and remove reports that contain fewer than 3 tokens. Following common practice in ViT [5], we split each radiograph into patches of size 16 × 16, which results in 196 visual tokens per image. The instance-level projection layer is a two-layer Multi-Layer Perceptron (MLP) with Batch Normalization [10] and ReLU activation. Additionally, we apply a frozen Batch Normalization layer after the MLP to obtain instance-level embeddings.
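The report filtering and the visual token count described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the regular expression is one plausible reading of "sequences of alphanumeric characters", and the 224 × 224 input resolution is an assumption chosen because it is consistent with the 196 tokens stated in the text.

```python
import re

MIN_TOKENS = 3  # reports with fewer tokens than this are removed


def tokenize_report(text):
    """Keep runs of alphanumeric characters; drop every other symbol."""
    return re.findall(r"[A-Za-z0-9]+", text)


def filter_reports(reports, min_tokens=MIN_TOKENS):
    """Tokenize each report and keep only those with at least min_tokens tokens."""
    kept = []
    for report in reports:
        tokens = tokenize_report(report)
        if len(tokens) >= min_tokens:
            kept.append(tokens)
    return kept


def num_visual_tokens(image_size, patch_size=16):
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side


if __name__ == "__main__":
    reports = ["No change.", "Lungs are clear bilaterally."]
    print(filter_reports(reports))   # the two-token report is dropped
    print(num_visual_tokens(224))    # assumed 224 x 224 input
```

With a 16 × 16 patch size, an assumed 224 × 224 image yields 14 patches per side, which matches the 196 visual tokens reported.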
ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs Appendix
Figure 1: Fourteen queries used in the experiments.
They do not contain personally identifiable information or offensive content. All models are implemented in PyTorch [5] and are based on the official implementation of BETAE [6] for a fair comparison. For all modules that use a multi-layer perceptron (MLP), we use a three-layer MLP with 1600 hidden neurons and ReLU activation. We apply dropout to the min function in CardMin and search the dropout rate in {0.05, 0.10, 0.15, 0.20}.
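To make the MLP configuration and the dropout search concrete, the sketch below builds a three-layer MLP with ReLU between layers and picks a dropout rate from the stated grid. It is a dependency-free illustration under several assumptions: "three-layer" is read as three linear layers with hidden width 1600, the initialization scheme is arbitrary, and `init_mlp`, `forward`, `select_dropout`, and the `evaluate` callback are hypothetical names rather than part of the ConE or BETAE code. The actual models are implemented in PyTorch.

```python
import random

DROPOUT_GRID = [0.05, 0.10, 0.15, 0.20]  # search space stated in the text


def init_mlp(in_dim, out_dim, hidden_dim=1600, num_layers=3, seed=0):
    """Return (weights, bias) pairs for an MLP with num_layers linear layers."""
    rng = random.Random(seed)
    dims = [in_dim] + [hidden_dim] * (num_layers - 1) + [out_dim]
    layers = []
    for fan_in, fan_out in zip(dims[:-1], dims[1:]):
        bound = 1.0 / fan_in ** 0.5  # assumed uniform init, for illustration only
        weights = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers


def forward(layers, x):
    """Apply each linear layer, with ReLU after every layer except the last."""
    for index, (weights, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, bias)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def select_dropout(evaluate, grid=DROPOUT_GRID):
    """Return the rate in the grid with the highest validation score.

    evaluate is a hypothetical callback: it should train with the given
    dropout rate and return a validation metric (higher is better).
    """
    return max(grid, key=evaluate)


if __name__ == "__main__":
    # A small hidden width keeps the pure-Python demo fast; the text uses 1600.
    mlp = init_mlp(in_dim=4, out_dim=2, hidden_dim=8)
    print(len(forward(mlp, [1.0, 0.5, -0.5, 2.0])))
```

In practice `evaluate` would wrap a full training and validation run for each candidate rate, so the grid of four values means four such runs per setting.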