A Appendix

Aug-17-2025, 18:25:17 GMT–Neural Information Processing Systems

Memory Cost of Self-attention Weights in DETR: DETR has six encoder-decoder pairs. The memory cost of the self-attention weight tensor during training, under different hyper-parameter settings and optimization strategies, is plotted in Figure 2. It shows that more attention ... For pedestrian detection tasks, we normally choose head=8 and downsampling ratio=0.25 ... Thus, deformable DETR is used in our work to save memory resources. Note that the original image size is 1024x2048. The detection head is the same as CSP.

Training: For anchor-free methods, the same ground truth and loss functions as CSP are utilized.
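To illustrate why the self-attention weight tensor becomes a memory bottleneck at this resolution, the following is a minimal sketch of a back-of-the-envelope estimate. It is not the authors' measurement code. It assumes that the downsampling ratio is applied directly to the image height and width, that the weights are stored as 32-bit floats, and that a single dense attention layer keeps a full tokens-by-tokens matrix per head; the function name `attention_weight_bytes` and all of its parameters are hypothetical.

```python
def attention_weight_bytes(height, width, ratio, heads,
                           batch=1, layers=1, bytes_per_elem=4):
    """Rough size in bytes of dense self-attention weights.

    Assumptions (not taken from the paper): the feature map is the
    image scaled by `ratio` on each side, every spatial position is
    one token, and each head stores a tokens x tokens matrix.
    """
    feat_h = int(height * ratio)
    feat_w = int(width * ratio)
    tokens = feat_h * feat_w
    # One (tokens x tokens) weight matrix per head, per layer, per sample.
    return batch * layers * heads * tokens * tokens * bytes_per_elem


if __name__ == "__main__":
    # Settings quoted in the text: 1024x2048 image, head=8, ratio=0.25.
    full = attention_weight_bytes(1024, 2048, 0.25, heads=8)
    # Halving the ratio for comparison (hypothetical setting).
    half = attention_weight_bytes(1024, 2048, 0.125, heads=8)
    print(f"ratio=0.25 : {full / 2**30:.1f} GiB per layer")
    print(f"ratio=0.125: {half / 2**30:.1f} GiB per layer")
```

Under these assumptions the cost grows with the square of the token count, so halving the downsampling ratio cuts the estimate by a factor of 16. Deformable attention avoids the dense tokens-by-tokens matrix, which is consistent with the choice of deformable DETR stated above.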

artificial intelligence, attention weight, memory cost


