ccpe
Lla-VAP: LSTM Ensemble of Llama and VAP for Turn-Taking Prediction
Jeon, Hyunbae, Guintu, Frederic, Sahni, Rayvant
Turn-taking prediction is the task of anticipating when the current speaker in a conversation will yield the turn so that another speaker can begin. This project expands on existing strategies for turn-taking prediction by employing a multi-modal ensemble approach that integrates large language models (LLMs) and voice activity projection (VAP) models. By combining the linguistic capabilities of LLMs with the temporal precision of VAP models, we aim to improve the accuracy and efficiency of identifying transition relevance places (TRPs) in both scripted and unscripted conversational scenarios. Our methods are evaluated on the In-Conversation Corpus (ICC) and Coached Conversational Preference Elicitation (CCPE) datasets, highlighting the strengths and limitations of current models while proposing a potentially more robust framework for enhanced prediction.
Software Engineering for Collective Cyber-Physical Ecosystems
Casadei, Roberto, Aguzzi, Gianluca, Audrito, Giorgio, Damiani, Ferruccio, Pianini, Danilo, Scarso, Giordano, Torta, Gianluca, Viroli, Mirko
Today's distributed and pervasive computing addresses large-scale cyber-physical ecosystems, characterised by dense and large networks of devices capable of computation, communication and interaction with the environment and people. While most research focusses on treating these systems as "composites" (i.e., heterogeneous functional complexes), recent developments in fields such as self-organising systems and swarm robotics have opened up a complementary perspective: treating systems as "collectives" (i.e., uniform, collaborative, and self-organising groups of entities). This article explores the motivations, state of the art, and implications of this "collective computing paradigm" in software engineering, discusses its peculiar challenges, and outlines a path for future research, touching on aspects such as macroprogramming, collective intelligence, self-adaptive middleware, learning, synthesis, and experimentation of collective behaviour.
Google open-sources datasets for AI assistants with human-level understanding
Google AI researchers are sharing both datasets to supply the training material needed to build natural language systems that achieve human-level performance. Google researchers call CCPE a new way to collect voice data. It includes 500 dialogues with people about their movie preferences -- 10,000 in total, across 12,000 utterances. Movie preferences were chosen as a topic because of the value of metadata such as the names of actors and directors. "We do not restrict the workers to detailed scripts or to a small knowledge base and hence we observe that our dataset contains more realistic and diverse conversations in comparison to existing datasets," the published paper covering CCPE reads.