584b98aac2dddf59ee2cf19ca4ccb75e-Supplemental.pdf

Apr-26-2026, 00:56:45 GMT–Neural Information Processing Systems

We used the largest batch size that fit in memory on our limited hardware, which was 256 for an image size of 224x224. For the learning rate (Adam [2] optimizer) we searched over the values {0.001, 0.0001, 1e04, 5e-4, 5e-5}, with weight decay {0, 5e-4. We chose a weight decay of 5e-5 and a learning rate of 5e-4 until the 4:6 split and 1e-4 afterwards. We chose a prototype dimension of 256, backbone output of 512, 2 graph layers, graph hidden dimension of 512, λh of 10, and Clst and Sep of 0.01. For UT-Zappos we again used the Adam optimizer, with learning rate in the range {5e-5, 5e-4, 5e-3}, and weight decay {0, 5e-4.
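The search described above can be sketched as a small grid over learning rate and weight decay. This is a minimal illustration, not the authors' code: the dictionary keys and the helper `search_grid` are hypothetical names. Only values that are legible in the text are listed, so the entry printed as "1e04" and the truncated tail of the weight-decay set are left out.

```python
from itertools import product

# Candidate values legible in the text; the entry printed as "1e04" and
# the truncated remainder of the weight-decay set are intentionally omitted.
LEARNING_RATES = [1e-3, 1e-4, 5e-4, 5e-5]
WEIGHT_DECAYS = [0.0, 5e-4]

# Fixed settings reported in the text. Key names are hypothetical.
FIXED = {
    "batch_size": 256,
    "image_size": 224,
    "prototype_dim": 256,
    "backbone_out_dim": 512,
    "graph_layers": 2,
    "graph_hidden_dim": 512,
    "lambda_h": 10,
    "clst": 0.01,
    "sep": 0.01,
}


def search_grid(learning_rates, weight_decays):
    """Return one config dict per (learning rate, weight decay) pair."""
    return [
        {**FIXED, "lr": lr, "weight_decay": wd}
        for lr, wd in product(learning_rates, weight_decays)
    ]


if __name__ == "__main__":
    for cfg in search_grid(LEARNING_RATES, WEIGHT_DECAYS):
        print(cfg["lr"], cfg["weight_decay"])
```

Each resulting dictionary could then be passed to a training routine, with the best configuration selected on validation performance. That selection step is an assumption here, since the text does not state the selection criterion.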



