Appendix
In object detection and many other computer vision benchmarks, image resolutions and aspect ratios are usually not fixed, unlike in the image classification task. For the first layer, the PE is interpolated following ViT. In short, Type-I uses more PEs while Type-II uses a larger PE. In our paper, the small- and base-sized models use this setting. The detailed configurations are given in Tab. 1.
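The ViT-style PE interpolation mentioned above can be sketched as follows: keep the [CLS] token's embedding unchanged and bilinearly resize the patch-grid embeddings to the new resolution. This is a minimal numpy sketch under those assumptions; the function name, shapes, and pure-numpy bilinear routine are illustrative, not the paper's actual code.

```python
import numpy as np

def interpolate_pe(pe, old_hw, new_hw):
    """Bilinearly resize a ViT positional-embedding grid.

    pe: array of shape (1 + old_h*old_w, d); row 0 is the [CLS]
    token's PE, which is kept as-is (following ViT-style
    interpolation for a changed input resolution).
    """
    old_h, old_w = old_hw
    new_h, new_w = new_hw
    cls_pe, grid_pe = pe[:1], pe[1:]
    grid = grid_pe.reshape(old_h, old_w, -1)

    # Fractional sample coordinates in the old grid for each new cell.
    ys = np.linspace(0.0, old_h - 1, new_h)
    xs = np.linspace(0.0, old_w - 1, new_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, old_h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, old_w - 1)
    wy = (ys - y0)[:, None, None]   # vertical blend weights
    wx = (xs - x0)[None, :, None]   # horizontal blend weights

    # Standard bilinear blend of the four neighbouring grid PEs.
    top = grid[y0][:, x0] * (1 - wx) + grid[y0][:, x1] * wx
    bot = grid[y1][:, x0] * (1 - wx) + grid[y1][:, x1] * wx
    new_grid = top * (1 - wy) + bot * wy

    return np.concatenate(
        [cls_pe, new_grid.reshape(new_h * new_w, -1)], axis=0
    )
```

For example, interpolating a $2{\times}2$ pre-training grid to a $3{\times}3$ detection grid turns a $(1+4, d)$ PE into a $(1+9, d)$ one, with the [CLS] row untouched.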
You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection
Yuxin Fang, Bencheng Liao, Xinggang Wang, Jiemin Fang, Jiyang Qi, Rui Wu, Jianwei Niu, Wenyu Liu
Can Transformer perform $2\mathrm{D}$ object-level recognition from a pure sequence-to-sequence perspective with minimal knowledge about the $2\mathrm{D}$ spatial structure? To answer this question, we present You Only Look at One Sequence (YOLOS), a series of object detection models based on the na\"ive Vision Transformer with the fewest possible modifications and inductive biases. We find that YOLOS pre-trained only on the mid-sized ImageNet-$1k$ dataset can already achieve competitive object detection performance on COCO, \textit{e.g.}, YOLOS-Base directly adopted from BERT-Base achieves $42.0$ box AP. We also discuss the impacts and limitations of current pre-training schemes and model scaling strategies for Transformer in vision through the lens of object detection. Code and model weights are available at \url{https://github.com/hustvl/YOLOS}.