A Appendix

Neural Information Processing Systems 

Memory Cost of Self-attention Weights in DETR: DETR has six encoder-decoder pairs. The memory cost of the self-attention weight tensor during training, under different hyper-parameter settings and optimization strategies, is plotted in Figure 2. It shows that more attention ... For pedestrian detection tasks, we normally choose head=8 and downsampling ratio=0.25. Thus, deformable DETR is used in our work to save memory resources. Note that the original image size is 1024x2048. The detection head is the same as CSP.

Training: For anchor-free methods, the same ground truth and loss functions as CSP are utilized.
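The memory argument for the self-attention weights can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' measurement code; it assumes that the downsampling ratio scales each side of the 1024x2048 input, that weights are stored as 32-bit floats, that the batch size is 1, and that only one attention layer is counted. The function names and the `points` parameter (the number of sampled keys per query in a sparse, deformable-style attention) are hypothetical and chosen only for illustration.

```python
def dense_attention_weight_bytes(height, width, ratio, heads,
                                 bytes_per_elem=4, batch=1):
    """Bytes for one layer's dense attention weights (tokens x tokens per head).

    Assumption: the feature map has size (height * ratio) x (width * ratio).
    """
    tokens = int(height * ratio) * int(width * ratio)
    return batch * heads * tokens * tokens * bytes_per_elem


def sampled_attention_weight_bytes(height, width, ratio, heads, points,
                                   bytes_per_elem=4, batch=1):
    """Bytes when each query attends to only `points` sampled keys per head.

    Assumption: a single feature level; `points` is a hypothetical setting.
    """
    tokens = int(height * ratio) * int(width * ratio)
    return batch * heads * tokens * points * bytes_per_elem


# Settings quoted in the text: 1024x2048 image, head=8, ratio=0.25.
dense = dense_attention_weight_bytes(1024, 2048, 0.25, heads=8)
sparse = sampled_attention_weight_bytes(1024, 2048, 0.25, heads=8, points=4)

print(f"dense:  {dense / 2**30:.1f} GiB per layer")   # 512.0 GiB
print(f"sparse: {sparse / 2**20:.1f} MiB per layer")  # 16.0 MiB
```

Under these assumptions the dense weight tensor grows quadratically with the number of feature-map locations, while the sampled variant grows linearly, which is consistent with the choice of deformable DETR to save memory.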
