48237d9f2dea8c74c2a72126cf63d933-Paper.pdf

Neural Information Processing Systems 

In Computer Vision, however, almost all performant networks are "dense", that is, every input is processed by every parameter. We present a Vision MoE (V-MoE), a sparse version of the Vision Transformer, that is scalable and competitive with the largest dense networks.
