 retail environment


RoboBenchMart: Benchmarking Robots in Retail Environment

arXiv.org Artificial Intelligence

Most existing robotic manipulation benchmarks focus on simplified tabletop scenarios, typically involving a stationary robotic arm interacting with various objects on a flat surface. To address this limitation, we introduce RoboBenchMart, a more challenging and realistic benchmark designed for dark store environments, where robots must perform complex manipulation tasks with diverse grocery items. This setting presents significant challenges, including dense object clutter and varied spatial configurations, with items positioned at different heights, depths, and in close proximity. By targeting the retail domain, our benchmark addresses a setting with strong potential for near-term automation impact. We demonstrate that current state-of-the-art generalist models struggle to solve even common retail tasks. To support further research, we release the RoboBenchMart suite, which includes a procedural store layout generator, a trajectory generation pipeline, evaluation tools and fine-tuned baseline models.


A Co-Training Semi-Supervised Framework Using Faster R-CNN and YOLO Networks for Object Detection in Densely Packed Retail Images

arXiv.org Artificial Intelligence

Abstract: This study proposes a semi-supervised co-training framework for object detection in densely packed retail environments, where limited labeled data and complex conditions pose major challenges. The framework combines Faster R-CNN (utilizing a ResNet backbone) for precise localization with YOLO (employing a Darknet backbone) for global context, enabling mutual pseudo-label exchange that improves accuracy in scenes with occlusion and overlapping objects. To strengthen classification, it employs an ensemble of XGBoost, Random Forest, and SVM, utilizing diverse feature representations for higher robustness. Hyperparameters are optimized using a metaheuristic-driven algorithm, enhancing precision and efficiency across models. By minimizing reliance on manual labeling, the approach reduces annotation costs and adapts effectively to frequent product and layout changes common in retail. Experiments on the SKU-110k dataset demonstrate strong performance, highlighting the scalability and practicality of the proposed framework for real-world retail applications such as automated inventory tracking, product monitoring, and checkout systems. Keywords: Retail object detection; Densely packed scenes; Semi-supervised learning; Co-training method; Faster R-CNN; Metaheuristic optimization; YOLO integration. Detecting objects in densely packed retail environments has become essential due to the increasing demand for automation in inventory management, product recognition, and efficient checkout processes in modern retail.


Visual Prompting for Robotic Manipulation with Annotation-Guided Pick-and-Place Using ACT

arXiv.org Artificial Intelligence

Embodied AI Research Team National Institute of AIST Tokyo, Japan muha.muttaqien@aist.go.jp Embodied AI Research Team National Institute of AIST Tokyo, Japan tomohiro.motoda@aist.go.jp Embodied AI Research Team National Institute of AIST Tokyo, Japan ryo.hanai@aist.go.jp Abstract: Robotic pick-and-place tasks in convenience stores pose challenges due to dense object arrangements, occlusions, and variations in object properties such as color, shape, size, and texture. These factors complicate trajectory planning and grasping. This paper introduces a perception-action pipeline leveraging annotation-guided visual prompting, where bounding box annotations identify both pickable objects and placement locations, providing structured spatial guidance. Instead of traditional step-by-step planning, we employ Action Chunking with Transformers (ACT) as an imitation learning algorithm, enabling the robotic arm to predict chunked action sequences from human demonstrations. We evaluate our system based on success rate and visual analysis of grasping behavior, demonstrating improved grasp accuracy and adaptability in retail environments. Robotic pick-and-place tasks are essential in various industrial and retail applications, particularly in convenience stores where robots must handle a diverse range of products with different shapes, sizes, textures, and colors, as shown in Figure 1. However, real-world pick-and-place scenarios pose significant challenges due to dense object arrangements, frequent occlusions, and the need for precise grasping and placement.


Efficient Retail Video Annotation: A Robust Key Frame Generation Approach for Product and Customer Interaction Analysis

arXiv.org Artificial Intelligence

Accurate video annotation plays a vital role in modern retail applications, including customer behavior analysis, product interaction detection, and in-store activity recognition. However, conventional annotation methods rely heavily on time-consuming manual labeling by human annotators, which makes frame selection inconsistent and increases operational costs. To address these challenges in the retail domain, we propose a deep learning-based approach that automates key-frame identification in retail videos and provides automatic annotations of products and customers. Our method leverages deep neural networks to learn discriminative features by embedding video frames and incorporating object detection-based techniques tailored for retail environments. Experimental results showcase the superiority of our approach over traditional methods, achieving accuracy comparable to human annotator labeling while enhancing the overall efficiency of retail video annotation. Remarkably, our approach yields an average twofold cost saving in video annotation. By having human annotators verify or adjust less than 5% of detected frames in the video dataset, while automating the annotation process for the remaining frames without reducing annotation quality, retailers can significantly reduce operational costs. The automation of key-frame detection enables substantial time and effort savings in retail video labeling tasks, proving highly valuable for diverse retail applications such as shopper journey analysis, product interaction detection, and in-store security monitoring.


Major UK retailer brings in ROBOTS to undertake a 'crucial' supermarket task

Daily Mail - Science & tech

Morrisons has unveiled its newest staff members - in the form of aisle-roaming robots. The retail giant is trialling 'Tally' robots at three stores in Wetherby, Redcar and Stockton, to monitor how products are being displayed on shelves. Using advanced AI and computer vision technology, Tally is designed to spot out-of-stock items, pricing errors, and misplaced products. Morrisons' technology manager, Katherine Allanach, called this a 'crucial' role. 'It is a crucial but time-consuming task and so Tally aims to allow more time for colleagues to focus on customer service,' she told The Grocer.


Exploring Pose-Based Anomaly Detection for Retail Security: A Real-World Shoplifting Dataset and Benchmark

arXiv.org Artificial Intelligence

Shoplifting poses a significant challenge for retailers, resulting in billions of dollars in annual losses. Traditional security measures often fall short, highlighting the need for intelligent solutions capable of detecting shoplifting behaviors in real time. This paper frames shoplifting detection as an anomaly detection problem, focusing on the identification of deviations from typical shopping patterns. We introduce PoseLift, a privacy-preserving dataset specifically designed for shoplifting detection, addressing challenges such as data scarcity, privacy concerns, and model biases. PoseLift is built in collaboration with a retail store and contains anonymized human pose data from real-world scenarios. By preserving essential behavioral information while anonymizing identities, PoseLift balances privacy and utility. We benchmark state-of-the-art pose-based anomaly detection models on this dataset, evaluating performance using a comprehensive set of metrics. Our results demonstrate that pose-based approaches achieve high detection accuracy while effectively addressing privacy and bias concerns inherent in traditional methods. As one of the first datasets capturing real-world shoplifting behaviors, PoseLift offers researchers a valuable tool to advance computer vision ethically and will be publicly available to foster innovation and collaboration. The dataset is available at https://github.com/TeCSAR-UNCC/PoseLift.


Enabling Data-Driven and Empathetic Interactions: A Context-Aware 3D Virtual Agent in Mixed Reality for Enhanced Financial Customer Experience

arXiv.org Artificial Intelligence

In this paper, we introduce a novel system designed to enhance customer service in the financial and retail sectors through a context-aware 3D virtual agent, utilizing Mixed Reality (MR) and Vision Language Models (VLMs). Our approach focuses on enabling data-driven and empathetic interactions that ensure customer satisfaction by introducing situational awareness of the physical location, personalized interactions based on customer profiles, and rigorous privacy and security standards. We discuss our design considerations critical for deployment in real-world customer service environments, addressing challenges in user data management and sensitive information handling. We also outline the system architecture and key features unique to banking and retail environments. Our work demonstrates the potential of integrating MR and VLMs in service industries, offering practical insights in customer service delivery while maintaining high standards of security and personalization.


Long-Term, Store-Front Robotics: Interactive Music for Robotic Arm, Caxixi and Frame Drums

arXiv.org Artificial Intelligence

This paper presents an innovative exploration into the integration of interactive robotic musicianship within a commercial retail environment, specifically through a three-week-long in-store installation featuring a UR3 robotic arm, custom-built frame drums, and an adaptive music generation system. Situated in a prominent storefront in one of the world's largest cities, this project aimed to enhance the shopping experience by creating dynamic, engaging musical interactions that respond to the store's ambient soundscape. Key contributions include the novel application of industrial robotics in artistic expression, the deployment of interactive music to enrich retail ambiance, and the demonstration of continuous robotic operation in a public setting over an extended period. Challenges such as system reliability, variation in musical output, safety in interactive contexts, and brand alignment were addressed to ensure the installation's success. The project not only showcased the technical feasibility and artistic potential of robotic musicianship in retail spaces but also offered insights into the practical implications of such integration, including system reliability, the dynamics of human-robot interaction, and the impact on store operations. This exploration opens new avenues for enhancing consumer retail experiences through the intersection of technology, music, and interactive art, suggesting a future where robotic musicianship contributes meaningfully to public and commercial spaces.


Podcast 12: Real world tech: Edge AI drives car-making, healthcare and retail - VanillaPlus - The global voice of Telecoms IT

#artificialintelligence

Artificial intelligence (AI) at the edge is changing healthcare, retail and Audi cars, as Intel's IoT Group vice president, John Healy tells Jeremy Cowan and George Malim. Plus we learn how chipmakers globally are tackling supply problems that have halted vehicle production. The semiconductor industry is facing an "awakening", says Healy, as it shape-shifts to meet "insatiable demand" for silicon. Finally, we hear which African country is a leader in satellite cartography, and how Amazon is playing games with its warehouse staff. Hi, and welcome to the latest Trending Tech Podcast brought to you by The Evolving Enterprise, IoT Now, and VanillaPlus.com. This is Jeremy Cowan, and I want to thank you for joining the latest, sometimes serious, sometimes light-hearted look at enterprise digital transformation. I am delighted to welcome today two guests, who are John Healy, from California-based international technology company, Intel, known among other things, for the processors that power so many of our devices. John is vice president of the IoT Group. John, thank you very much for making the time to be here. Good to have you on again, George. Okay, today, we'll be looking at some key tech news stories that deserve a bit of a deeper dive.


The enormous impact of AI and machine learning on e-commerce

#artificialintelligence

Australians love online shopping and continuously look for a great purchase experience each time they check out. As with any mature market, customers are spoiled for choice, and competition is fierce. After all, the competitor is just a click away. In a volatile retail environment, the value of a satisfied, loyal customer should not be underestimated. Customer experience (CX) is now the arena where customers are won or lost.