A Co-Training Semi-Supervised Framework Using Faster R-CNN and YOLO Networks for Object Detection in Densely Packed Retail Images

Hossein Yazdanjouei, Arash Mansouri, Mohammad Shokouhifar

arXiv.org Artificial Intelligence 

Abstract: This study proposes a semi-supervised co-training framework for object detection in densely packed retail environments, where limited labeled data and complex conditions pose major challenges. The framework combines Faster R-CNN (utilizing a ResNet backbone) for precise localization with YOLO (employing a Darknet backbone) for global context, enabling mutual pseudo-label exchange that improves accuracy in scenes with occlusion and overlapping objects. To strengthen classification, it employs an ensemble of XGBoost, Random Forest, and SVM, utilizing diverse feature representations for higher robustness. Hyperparameters are optimized using a metaheuristic-driven algorithm, enhancing precision and efficiency across models. By minimizing reliance on manual labeling, the approach reduces annotation costs and adapts effectively to frequent product and layout changes common in retail. Experiments on the SKU-110k dataset demonstrate strong performance, highlighting the scalability and practicality of the proposed framework for real-world retail applications such as automated inventory tracking, product monitoring, and checkout systems.

Keywords: Retail object detection; Densely packed scenes; Semi-supervised learning; Co-training method; Faster R-CNN; Metaheuristic optimization; YOLO integration.

Detecting objects in densely packed retail environments has become essential due to the increasing demand for automation in inventory management, product recognition, and efficient checkout processes in modern retail.
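The mutual pseudo-label exchange mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the selection rule, so everything below is an assumption made for illustration: the function names (`iou`, `pseudo_labels_for_peer`, `exchange`), the confidence threshold of 0.8, and the IoU threshold of 0.5 are hypothetical. The sketch assumes each detector passes its confident detections on unlabeled images to the other detector whenever the peer has no overlapping box of its own.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pseudo_labels_for_peer(
    source: List[Detection],
    peer: List[Detection],
    conf_thresh: float = 0.8,  # assumed value, not from the paper
    iou_thresh: float = 0.5,   # assumed value, not from the paper
) -> List[Detection]:
    """Confident detections from `source` that `peer` did not find."""
    confident = [d for d in source if d[1] >= conf_thresh]
    return [
        d for d in confident
        if all(iou(d[0], p[0]) < iou_thresh for p in peer)
    ]


def exchange(
    dets_a: List[Detection], dets_b: List[Detection]
) -> Tuple[List[Detection], List[Detection]]:
    """Return (pseudo-labels for detector A, pseudo-labels for detector B)."""
    return (
        pseudo_labels_for_peer(dets_b, dets_a),
        pseudo_labels_for_peer(dets_a, dets_b),
    )
```

In a co-training loop of this kind, the two returned lists would be added to each detector's training set before the next round of fine-tuning. Whether the authors filter by confidence, by agreement between the two detectors, or by another criterion is not stated in the abstract.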
