Collaborating Authors

AI in storytelling: Machines as cocreators


Sunspring debuted at the SCI-FI LONDON film festival in 2016. Set in a dystopian world with mass unemployment, the movie attracted many fans, with one viewer describing it as amusing but strange. But the most notable aspect of the film involves its creation: an artificial-intelligence (AI) bot wrote Sunspring's screenplay. "Maybe machines will replace human storytellers, just like self-driving cars could take over the roads." A closer look at Sunspring might raise some doubts, however.

VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement Machine Learning

With the emergence of e-learning and personalised education, the production and distribution of digital educational resources have boomed. Video lectures have now become one of the primary modalities to impart knowledge to masses in the current digital age. The rapid creation of video lecture content challenges the currently established human-centred moderation and quality assurance pipeline, demanding for more efficient, scalable and automatic solutions for managing learning resources. Although a few datasets related to engagement with educational videos exist, there is still an important need for data and research aimed at understanding learner engagement with scientific video lectures. This paper introduces VLEngagement, a novel dataset that consists of content-based and video-specific features extracted from publicly available scientific video lectures and several metrics related to user engagement. We introduce several novel tasks related to predicting and understanding context-agnostic engagement in video lectures, providing preliminary baselines. This is the largest and most diverse publicly available dataset to our knowledge that deals with such tasks. The extraction of Wikipedia topic-based features also allows associating more sophisticated Wikipedia based features to the dataset to improve the performance in these tasks. The dataset, helper tools and example code snippets are available publicly at

Predicting Engagement in Video Lectures Artificial Intelligence

The explosion of Open Educational Resources (OERs) in the recent years creates the demand for scalable, automatic approaches to process and evaluate OERs, with the end goal of identifying and recommending the most suitable educational materials for learners. We focus on building models to find the characteristics and features involved in context-agnostic engagement (i.e. population-based), a seldom researched topic compared to other contextualised and personalised approaches that focus more on individual learner engagement. Learner engagement, is arguably a more reliable measure than popularity/number of views, is more abundant than user ratings and has also been shown to be a crucial component in achieving learning outcomes. In this work, we explore the idea of building a predictive model for population-based engagement in education. We introduce a novel, large dataset of video lectures for predicting context-agnostic engagement and propose both cross-modal and modality-specific feature sets to achieve this task. We further test different strategies for quantifying learner engagement signals. We demonstrate the use of our approach in the case of data scarcity. Additionally, we perform a sensitivity analysis of the best performing model, which shows promising performance and can be easily integrated into an educational recommender system for OERs.



Recently, I rediscovered a TED Talk by David McCandless, a data journalist, called "The beauty of data visualization." It's a great reminder of how charts (though scary to many) can help you tell an actionable story about a topic in a way that bullet points alone usually cannot. If you have not seen the talk, I recommend you take a look for some inspiration about visualizing big ideas. In any social media report you make for the brass, there are several types of data charts to help summarize the performance of your social media channels; the most common ones are bar charts, pie/donut charts and line graphs. They are tried and true but often overused, and are not always the best way to visualize the data to then inform and justify your strategic decisions.