Dogan, Fethiye Irmak
Streaming Network for Continual Learning of Object Relocations under Household Context Drifts
Bartoli, Ermanno, Dogan, Fethiye Irmak, Leite, Iolanda
In most applications, robots need to adapt to new environments and be multi-functional without forgetting previous information. This requirement gains further importance in real-world scenarios where robots operate in coexistence with humans. In these complex environments, human actions inevitably lead to changes, requiring robots to adapt accordingly. To effectively address these dynamics, the concept of continual learning proves essential. It not only enables learning models to integrate new knowledge while preserving existing information but also facilitates the acquisition of insights from diverse contexts. This aspect is particularly relevant to the issue of context-switching, where robots must navigate and adapt to changing situational dynamics. We introduce a novel approach to effectively tackle the problem of context drifts by designing a Streaming Graph Neural Network that incorporates both regularization and rehearsal techniques. Our Continual_GTM model enables us to retain previous knowledge from different contexts, and it is more effective than traditional fine-tuning approaches. We evaluated the efficacy of Continual_GTM in predicting human routines within household environments, leveraging spatio-temporal object dynamics across diverse scenarios.
GRACE: Generating Socially Appropriate Robot Actions Leveraging LLMs and Human Explanations
Dogan, Fethiye Irmak, Ozyurt, Umut, Cinar, Gizem, Gunes, Hatice
When operating in human environments, robots need to handle complex tasks while both adhering to social norms and accommodating individual preferences. For instance, based on common-sense knowledge, a household robot can predict that it should avoid vacuuming during a social gathering, but it may still be uncertain whether it should vacuum before or after having guests. In such cases, integrating common-sense knowledge with human preferences, often conveyed through human explanations, is fundamental yet challenging for existing systems. In this paper, we introduce GRACE, a novel approach addressing this while generating socially appropriate robot actions. GRACE leverages common-sense knowledge from Large Language Models (LLMs), and it integrates this knowledge with human explanations through a generative network architecture. The bidirectional structure of GRACE enables robots to refine and enhance LLM predictions by utilizing human explanations and makes robots capable of generating such explanations for human-specified actions. Our experimental evaluations show that integrating human explanations boosts GRACE's performance, where it outperforms several baselines and provides sensible explanations.
Semantically-Driven Disambiguation for Human-Robot Interaction
Dogan, Fethiye Irmak, Liu, Weiyu, Leite, Iolanda, Chernova, Sonia
Ambiguities are common in human-robot interaction, especially when a robot follows user instructions in a large collocated space. For instance, when the user asks the robot to find an object in a home environment, the object might be in several places depending on its varying semantic properties (e.g., a bowl can be in the kitchen cabinet or on the dining room table, depending on whether it is clean/dirty, full/empty and the other objects around it). Previous works on object semantics have predicted such relationships using one-shot inferences, which are likely to fail for ambiguous or partially understood instructions. This paper focuses on this gap and suggests a semantically-driven disambiguation approach by utilizing follow-up clarifications to handle such uncertainties. To achieve this, we first obtain semantic knowledge embeddings, and then these embeddings are used to generate clarifying questions by following an iterative process. The evaluation of our method shows that our approach is model-agnostic, i.e., applicable to different semantic embedding models, and follow-up clarifications improve the performance regardless of the embedding model. Additionally, our ablation studies show the significance of informative clarifications and iterative predictions to enhance system accuracy.