Goto

Collaborating Authors

 ar headset


Augmented-Reality Enabled Crop Monitoring with Robot Assistance

arXiv.org Artificial Intelligence

The integration of augmented reality (AR), extended reality (XR), and virtual reality (VR) technologies in agriculture has shown significant promise in enhancing various agricultural practices. Mobile robots have also been adopted as assessment tools in precision agriculture, improving economic efficiency and productivity, and minimizing undesired effects such as weeds and pests. Despite considerable work on both fronts, the combination of a versatile User Interface (UI) provided by an AR headset with the integration and direct interaction and control of a mobile field robot has not yet been fully explored or standardized. This work aims to address this gap by providing real-time data input and control output of a mobile robot for precision agriculture through a virtual environment enabled by an AR headset interface. The system leverages open-source computational tools and off-the-shelf hardware for effective integration. Distinctive case studies are presented where growers or technicians can interact with a legged robot via an AR headset and a UI. Users can teleoperate the robot to gather information in an area of interest, request real-time graphed status of an area, or have the robot autonomously navigate to selected areas for measurement updates. The proposed system utilizes a custom local navigation method with a fixed holographic coordinate system in combination with QR codes. This step toward fusing AR and robotics in agriculture aims to provide practical solutions for real-time data management and control enabled by human-robot interaction. The implementation can be extended to various robot applications in agriculture and beyond, promoting a unified framework for on-demand and autonomous robot operation in the field.


Methodology to Deploy CNN-Based Computer Vision Models on Immersive Wearable Devices

arXiv.org Artificial Intelligence

Convolutional Neural Network (CNN) models often lack the ability to incorporate human input, which can be addressed by Augmented Reality (AR) headsets. However, current AR headsets face limitations in processing power, which has prevented researchers from performing real-time, complex image recognition tasks using CNNs in AR headsets. This paper presents a method to deploy CNN models on AR headsets by training them on computers and transferring the optimized weight matrices to the headset. The approach transforms the image data and CNN layers into a one-dimensional format suitable for the AR platform. We demonstrate this method by training the LeNet-5 CNN model on the MNIST dataset using PyTorch and deploying it on a HoloLens AR headset. The results show that the model maintains an accuracy of approximately 98%, similar to its performance on a computer. This integration of CNN and AR enables real-time image processing on AR headsets, allowing for the incorporation of human input into AI models.


Shared Affordance-awareness via Augmented Reality for Proactive Assistance in Human-robot Collaboration

arXiv.org Artificial Intelligence

Enabling humans and robots to collaborate effectively requires purposeful communication and an understanding of each other's affordances. Prior work in human-robot collaboration has incorporated knowledge of human affordances, i.e., their action possibilities in the current context, into autonomous robot decision-making. This "affordance awareness" is especially promising for service robots that need to know when and how to assist a person that cannot independently complete a task. However, robots still fall short in performing many common tasks autonomously. In this work-in-progress paper, we propose an augmented reality (AR) framework that bridges the gap in an assistive robot's capabilities by actively engaging with a human through a shared affordance-awareness representation. Leveraging the different perspectives from a human wearing an AR headset and a robot's equipped sensors, we can build a perceptual representation of the shared environment and model regions of respective agent affordances. The AR interface can also allow both agents to communicate affordances with one another, as well as prompt for assistance when attempting to perform an action outside their affordance region. This paper presents the main components of the proposed framework and discusses its potential through a domestic cleaning task experiment.


Multimodal Grounding for Embodied AI via Augmented Reality Headsets for Natural Language Driven Task Planning

arXiv.org Artificial Intelligence

Recent advances in generative modeling have spurred a resurgence in the field of Embodied Artificial Intelligence (EAI). EAI systems typically deploy large language models to physical systems capable of interacting with their environment. In our exploration of EAI for industrial domains, we successfully demonstrate the feasibility of co-located, human-robot teaming. Specifically, we construct an experiment where an Augmented Reality (AR) headset mediates information exchange between an EAI agent and human operator for a variety of inspection tasks. To our knowledge the use of an AR headset for multimodal grounding and the application of EAI to industrial tasks are novel contributions within Embodied AI research. In addition, we highlight potential pitfalls in EAI's construction by providing quantitative and qualitative analysis on prompt robustness.


AR Point&Click: An Interface for Setting Robot Navigation Goals

arXiv.org Artificial Intelligence

This paper considers the problem of designating navigation goal locations for interactive mobile robots. We investigate a point-andclick interface, implemented with an Augmented Reality (AR) headset. The cameras on the AR headset are used to detect natural pointing gestures performed by the user. The selected goal is visualized through the AR headset, allowing the users to adjust the goal location if desired. We conduct a user study in which participants set consecutive navigation goals for the robot using three different interfaces: AR Point&Click, Person Following and Tablet (birdeye map view). Results show that the proposed AR Point&Click interface improved the perceived accuracy, efficiency and reduced mental load compared to the baseline tablet interface, and it performed on-par to the Person Following method. These results show that the AR Point&Click is a feasible interaction model for setting navigation goals.


Direction-Aware Adaptive Online Neural Speech Enhancement with an Augmented Reality Headset in Real Noisy Conversational Environments

arXiv.org Artificial Intelligence

This paper describes the practical response- and performance-aware development of online speech enhancement for an augmented reality (AR) headset that helps a user understand conversations made in real noisy echoic environments (e.g., cocktail party). One may use a state-of-the-art blind source separation method called fast multichannel nonnegative matrix factorization (FastMNMF) that works well in various environments thanks to its unsupervised nature. Its heavy computational cost, however, prevents its application to real-time processing. In contrast, a supervised beamforming method that uses a deep neural network (DNN) for estimating spatial information of speech and noise readily fits real-time processing, but suffers from drastic performance degradation in mismatched conditions. Given such complementary characteristics, we propose a dual-process robust online speech enhancement method based on DNN-based beamforming with FastMNMF-guided adaptation. FastMNMF (back end) is performed in a mini-batch style and the noisy and enhanced speech pairs are used together with the original parallel training data for updating the direction-aware DNN (front end) with backpropagation at a computationally-allowable interval. This method is used with a blind dereverberation method called weighted prediction error (WPE) for transcribing the noisy reverberant speech of a speaker, which can be detected from video or selected by a user's hand gesture or eye gaze, in a streaming manner and spatially showing the transcriptions with an AR technique. Our experiment showed that the word error rate was improved by more than 10 points with the run-time adaptation using only twelve minutes of observation.


Augmented Reality Is Coming -- to Your Car's Windshield

#artificialintelligence

Like millions of other kids around the world, Jamieson Christmas, now in his mid-forties, was transfixed the first time he saw director George Lucas' epic space opera Star Wars. "I'm a child of the '70s," he told Digital Trends. "I grew up when Star Wars was first released. George Lucas set up this vision of little robots beaming three-dimensional pictures of people. It had a really tremendous influence on me."


How to use human and artificial intelligence with digital twins

#artificialintelligence

Human intelligence has been creating and maintaining complex systems since the beginnings of civilizations. In modern times, digital twins have emerged to aid operations of complex systems, as well as improve design and production. Artificial intelligence (AI) and extended reality (XR) – including augmented reality (AR) and virtual reality (VR) – have emerged as tools that can help manage operations for complex systems. Digital twins can be enhanced with AI and emerging user interface (UI) technologies like XR can improve people's abilities to manage complex systems via digital twins. Digital twins can marry human and AI to produce something far greater by creating a usable representation of complex systems. End users do not need to worry about the formulas that go into machine learning (ML), predictive modeling and artificially intelligent systems, but also can capitalize on their power as an extension of their own knowledge and abilities. Digital twins combined with AR, VR and related technologies provide a framework to overlay intelligent decision making into day-to-day operations, as shown in Figure 1. Figure 1: A digital twin can be enhanced with artificial intelligence (AI) and intelligent realities user interfaces, such as extended reality (XR), which includes augmented reality (AR) and virtual reality (VR). The operations of a physical twin can be digitized by sensors, cameras and other such devices, but those digital streams are not the only sources of data that can feed the digital twin. In addition to streaming data, accumulated historical data can inform a digital twin. Relevant data could include data not generated from the asset itself, such as weather and business cycle data. Also, computer-aided design (CAD) drawings and other documentation can help the digital twin provide context.


3d Scanner App

#artificialintelligence

I've been testing 3D scanning hardware and apps for years and this is one of the fastest and easiest ones that I've used. It's great to see how far the technology has come in both quality and cost reduction. I bought the 2020 iPad pro specifically for the new lidar sensor and applications like this. If you already have an iPad with this sensor (or hopefully iPhone when they come out), definitely download this app and try it out - it's free! The mesh is generated quickly thanks to Apple's lidar sensor and this app shows the triangles in realtime as they are generated and refined.


Augmented Reality: Leading AR companies named

#artificialintelligence

Augmented Reality is still developing as a technology, but is beginning to move into the mainstream. The big tech companies are scrambling to build sustainable AR ecosystems to gain early foothold in the potentially lucrative market, while specialist firms are focusing on areas like content development. In 2018, Alibaba, the Chinese ecommerce giant launched Taobao Buy, an app that aims to make online shopping more interactive. The app, accessible via Microsoft's HoloLens headsets, allows users to browse and interact for a select range of products from Alibaba's online store. Alibaba acquired Infinity AR and has also invested in Augmented Reality companies like WayRay and Magic Leap.