Assistive robots can potentially improve the quality of life and personal independence of elderly people by supporting everyday life activities. To guarantee a safe and intuitive interaction between human and robot, human intentions need to be recognized automatically. As humans communicate their intentions multimodally, the use of multiple modalities for intention recognition may not just increase the robustness against failure of individual modalities but especially reduce the uncertainty about the intention to be predicted. This is desirable as particularly in direct interaction between robots and potentially vulnerable humans a minimal uncertainty about the situation as well as knowledge about this actual uncertainty is necessary. Thus, in contrast to existing methods, in this work a new approach for multimodal intention recognition is introduced that focuses on uncertainty reduction through classifier fusion. For the four considered modalities speech, gestures, gaze directions and scene objects individual intention classifiers are trained, all of which output a probability distribution over all possible intentions. By combining these output distributions using the Bayesian method Independent Opinion Pool the uncertainty about the intention to be recognized can be decreased. The approach is evaluated in a collaborative human-robot interaction task with a 7-DoF robot arm. The results show that fused classifiers which combine multiple modalities outperform the respective individual base classifiers with respect to increased accuracy, robustness, and reduced uncertainty.
As in any interaction process, misunderstandings, ambiguity, and failures to correctly understand the interaction partner are bound to happen in human-robot interaction. We term these failures 'conflicts' and are interested in both conflict detection and conflict resolution. In that, we focus on the robot's perspective. For the robot, conflicts may occur because of errors in its perceptual processes or because of ambiguity stemming from human input. This poster presents a brief system overview, and details Here, we briefly outline the project's motivation and setting, introduce the general processing framework, and then present two kinds of conflicts in some more detail: 1) a failure to identify a relevant object at all; 2) ambiguity emerging from multiple matches in scene perception.
As robots become more ubiquitous, it is increasingly important for untrained users to be able to interact with them intuitively. In this work, we investigate how people refer to objects in the world during relatively unstructured communication with robots. We collect a corpus of deictic interactions from users describing objects, which we use to train language and gesture models that allow our robot to determine what objects are being indicated. We introduce a temporal extension to state-of-the-art hierarchical matching pursuit features to support gesture understanding, and demonstrate that combining multiple communication modalities more effectively captures user intent than relying on a single type of input. Finally, we present initial interactions with a robot that uses the learned models to follow commands while continuing to learn from user input.
Faces play a magnificent role in human robot interaction, as they do in our daily life. The inherent ability of the human mind facilitates us to recognize a person by exploiting various challenges such as bad illumination, occlusions, pose variation etc. which are involved in face recognition. But it is a very complex task in nature to identify a human face by humanoid robots. The recent literatures on face biometric recognition are extremely rich in its application on structured environment for solving human identification problem. But the application of face biometric on mobile robotics is limited for its inability to produce accurate identification in uneven circumstances. The existing face recognition problem has been tackled with our proposed component based fragmented face recognition framework. The proposed framework uses only a subset of the full face such as eyes, nose and mouth to recognize a person. It's less searching cost, encouraging accuracy and ability to handle various challenges of face recognition offers its applicability on humanoid robots. The second problem in face recognition is the face spoofing, in which a face recognition system is not able to distinguish between a person and an imposter (photo/video of the genuine user). The problem will become more detrimental when robots are used as an authenticator. A depth analysis method has been investigated in our research work to test the liveness of imposters to discriminate them from the legitimate users. The implication of the previous earned techniques has been used with respect to criminal identification with NAO robot. An eyewitness can interact with NAO through a user interface. NAO asks several questions about the suspect, such as age, height, her/his facial shape and size etc., and then making a guess about her/his face.
Abstract-- Human-robot interactions are less efficient and communicative than human-to-human interactions, and a key reason is a lack of informed sense of touch in robotic systems. Existing literature demonstrates robot success in executing handovers with humans, albeit with substantial reliance on external sensing or with primitive signal processing methods, deficient compared to the rich set of information humans can detect. Simply grasping an object with acceptable grip forces A. Motivation and Applications Some examples of HRI handovers require the human to apply An ultimate goal of the robotics sector of Human-Robot aggressive forces for the robot to detect when to release a Interaction (HRI) is to enable interactions so natural and transferred object , and others implement thresholding of efficient as to reach parity with human-human interactions. A different approach that Attainment of this lofty goal can be significantly advanced results in markedly "humanlike" HRI handovers requires a by improving a robotic system's keen sense of touch, a sense fixed, calibrated external camera , a reliance that cannot which humans seamlessly and continuously use to perform be feasibly implemented on mobile platforms or in dynamic any physical task. The incorporation of machine learning tools to assist robotic systems with interpreting contact with their environments There are many works in the existing literature that make has led to more sophisticated improvements.