Collaborating Authors

 Varni, Giovanna


UpStory: the Uppsala Storytelling dataset

arXiv.org Artificial Intelligence

Friendship and rapport play an important role in the formation of constructive social interactions, and have been widely studied in educational settings due to their impact on student outcomes. Given the growing interest in automating the analysis of such phenomena through Machine Learning (ML), access to annotated interaction datasets is highly valuable. However, no dataset on dyadic child-child interactions explicitly capturing rapport currently exists. Moreover, despite advances in the automatic analysis of human behaviour, no previous work has addressed the prediction of rapport in child-child dyadic interactions in educational settings. We present UpStory -- the Uppsala Storytelling dataset: a novel dataset of naturalistic dyadic interactions between primary school aged children, with an experimental manipulation of rapport. Pairs of children aged 8-10 participate in a task-oriented activity: designing a story together, while being allowed free movement within the play area. We promote balanced collection of different levels of rapport by using a within-subjects design: self-reported friendships are used to pair each child twice, either minimizing or maximizing pair separation in the friendship network. The dataset contains data for 35 pairs, totalling 3h 40m of audio and video recordings. It includes two video sources covering the play area, as well as separate voice recordings for each child. An anonymized version of the dataset is made publicly available, containing per-frame head pose, body pose, and face features; as well as per-pair information, including the level of rapport. Finally, we provide ML baselines for the prediction of rapport.
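The pairing idea in the abstract (pair each child once with minimal and once with maximal separation in the friendship network) can be illustrated with a small sketch. This is not the authors' procedure: it assumes friendships are treated as an undirected graph, separation is shortest-path length, unreachable children get a distance equal to the group size, the group has an even number of children, and the group is small enough for exhaustive search. All names (`FRIENDS`, `best_pairing`, and so on) are hypothetical.

```python
from collections import deque

# Hypothetical toy friendship network (undirected): A - B - C - D.
FRIENDS = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}

def bfs_distances(graph, source):
    """Shortest-path length from source to every reachable child."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def separation(graph):
    """Pairwise separation table; unreachable pairs get len(graph) (assumption)."""
    n = len(graph)
    table = {}
    for child in graph:
        reached = bfs_distances(graph, child)
        table[child] = {other: reached.get(other, n) for other in graph}
    return table

def all_pairings(children):
    """Yield every way to split an even-sized list into unordered pairs."""
    if not children:
        yield []
        return
    first, rest = children[0], children[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for pairing in all_pairings(remaining):
            yield [(first, partner)] + pairing

def total_separation(pairing, table):
    """Sum of separations over all pairs in a pairing."""
    return sum(table[a][b] for a, b in pairing)

def best_pairing(graph, maximize):
    """Exhaustively pick the pairing with minimal or maximal total separation."""
    table = separation(graph)
    choose = max if maximize else min
    return choose(all_pairings(sorted(graph)),
                  key=lambda pairing: total_separation(pairing, table))
```

Calling `best_pairing(FRIENDS, maximize=False)` pairs direct friends, while `maximize=True` pairs children who are further apart in the network. Exhaustive search grows very quickly with group size, so a real class would need a matching algorithm instead.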


Beam Search with Bidirectional Strategies for Neural Response Generation

arXiv.org Artificial Intelligence

Sequence-to-sequence neural networks have been widely used in language-based applications because they can flexibly learn various language models. However, when searching for the optimal response with a trained neural network, existing approaches such as beam-search decoding strategies still fall short of promising performance. Instead of developing further decoding strategies on top of a "regular sentence order" neural network (a model trained to output sentences from left to right), we leverage a "reverse" order language model (a model trained to output sentences from right to left) as an additional model, which provides a different perspective on the path-finding problem. In this paper, we propose bidirectional path-search strategies that combine the two networks (left-to-right and right-to-left language models), making a bidirectional beam search possible. In addition, our solution allows any similarity measure to be used in the sentence selection criterion. Our approaches demonstrate better performance than the unidirectional beam search strategy.
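To make the idea concrete, here is a minimal sketch of one way a left-to-right and a right-to-left model could be combined with a pluggable similarity measure. It is an illustration under stated assumptions, not the paper's algorithm: the two "models" are hypothetical lookup tables (`L2R`, `R2L`) rather than neural networks, the similarity is Jaccard token overlap, and the combination rule (similarity times the product of both models' probabilities) is an assumption chosen for simplicity.

```python
import math

EOS = "</s>"

def table_model(table):
    """Wrap a prefix -> next-token-probability table; unknown prefixes end the sentence."""
    def next_probs(prefix):
        return table.get(prefix, {EOS: 1.0})
    return next_probs

# Hypothetical toy models. R2L generates sentences in reverse order.
L2R = table_model({
    (): {"i": 0.6, "you": 0.4},
    ("i",): {"agree": 0.5, "see": 0.5},
    ("you",): {"agree": 1.0},
})
R2L = table_model({
    (): {"agree": 0.9, "see": 0.1},
    ("agree",): {"i": 0.8, "you": 0.2},
    ("see",): {"i": 1.0},
})

def beam_search(next_probs, beam_width, max_len):
    """Plain unidirectional beam search; returns finished (tokens, log_prob), best first."""
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for token, prob in next_probs(tokens).items():
                if prob <= 0.0:
                    continue
                new_score = score + math.log(prob)
                if token == EOS:
                    finished.append((tokens, new_score))
                else:
                    candidates.append((tokens + (token,), new_score))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
        if not beams:
            break
    return sorted(finished, key=lambda c: c[1], reverse=True)

def jaccard(a, b):
    """Token-set overlap; any other similarity measure could be swapped in."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def bidirectional_select(l2r, r2l, similarity=jaccard, beam_width=3, max_len=5):
    """Pick the forward hypothesis best supported by the backward hypotheses."""
    forward = beam_search(l2r, beam_width, max_len)
    # Reverse the right-to-left outputs so both sets read left to right.
    backward = [(tuple(reversed(tokens)), score)
                for tokens, score in beam_search(r2l, beam_width, max_len)]

    def combined(item):
        tokens, fwd_score = item
        # Assumed rule: similarity-weighted product of both models' probabilities.
        return max(similarity(tokens, other) * math.exp(fwd_score + bwd_score)
                   for other, bwd_score in backward)

    return max(forward, key=combined)[0]
```

With these toy tables, the unidirectional search ranks `("you", "agree")` first, whereas `bidirectional_select(L2R, R2L)` returns `("i", "agree")` because the right-to-left model strongly supports that sentence.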


LOL — Laugh Out Loud

AAAI Conferences

Laughter is an important social signal which may have various communicative functions (Chapman 1983). Humans laugh at humorous stimuli or to mark their pleasure when receiving praised statements (Provine 2001); they also laugh to mask embarrassment (Huber and Ruch 2007) or to be cynical. Laughter can also act as a social indicator of ingroup belonging (Adelswärd 1989); it can work as a speech regulator during conversation (Provine 2001); and, because it is very contagious, it can be used to elicit laughter in interlocutors (Provine 2001). Endowing machines with laughter capabilities is a crucial challenge in developing virtual agents and robots able to act as companions, coaches, or supporters in a more natural manner. However, so far, few attempts have been made to model and implement laughter for virtual

Figure 1: The architecture of our laughing agent.