Beyond Views: Measuring and Predicting Engagement in Online Videos

AAAI Conferences

The share of video in internet traffic has been growing; therefore, understanding how videos capture attention on a global scale is of growing importance. Most current research focuses on modeling the number of views, but we argue that video engagement, or time spent watching, is a more appropriate measure for resource allocation problems in attention, networking, and promotion activities. In this paper, we present a first large-scale measurement of video-level aggregate engagement from publicly available data streams, on a collection of 5.3 million YouTube videos published over two months in 2016. We study a set of metrics including time and the average percentage of a video watched. We define a new metric, relative engagement, that is calibrated against video properties and strongly correlates with recognized notions of quality. Moreover, we find that engagement measures of a video are stable over time, thus separating the concerns for modeling engagement and those for popularity -- the latter is known to be unstable over time and driven by external promotions. We also find engagement metrics predictable from a cold-start setup, with most of their variance explained by video context, topics and channel information -- R² = 0.77. Our observations imply several prospective uses of engagement metrics -- choosing engaging topics for video production, or promoting engaging videos in recommender systems.
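The abstract does not spell out how relative engagement is calibrated. As a minimal sketch, assume it means ranking a video's average watch percentage against other videos of similar duration. The function names, the log-scale duration binning, and the `bins_per_decade` parameter below are all illustrative assumptions, not the authors' definition.

```python
import math
from bisect import bisect_right
from collections import defaultdict


def duration_bin(duration_s, bins_per_decade=4):
    """Hypothetical binning: group videos by log10 of their duration in seconds."""
    return math.floor(math.log10(duration_s) * bins_per_decade)


def relative_engagement(videos):
    """Percentile rank of watch percentage among videos of similar duration.

    videos: list of (video_id, duration_s, avg_watch_percentage).
    Returns {video_id: fraction of same-bin videos with watch percentage <= its own}.
    """
    # Collect the watch percentages that fall in each duration bin.
    by_bin = defaultdict(list)
    for _, duration_s, watch_pct in videos:
        by_bin[duration_bin(duration_s)].append(watch_pct)
    for values in by_bin.values():
        values.sort()

    # Rank each video inside its own bin.
    result = {}
    for video_id, duration_s, watch_pct in videos:
        values = by_bin[duration_bin(duration_s)]
        result[video_id] = bisect_right(values, watch_pct) / len(values)
    return result


# Toy data, invented for illustration only.
sample = [("a", 105, 0.2), ("b", 110, 0.5), ("c", 120, 0.8), ("d", 3000, 0.2)]
scores = relative_engagement(sample)
```

In this toy data, videos "a" and "d" have the same watch percentage but land in different duration bins, so they receive different scores. That is the point of calibrating against video properties rather than comparing raw watch percentages.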


Viral Actions: Predicting Video View Counts Using Synchronous Sharing Behaviors

AAAI Conferences

In this article, we present a method for predicting the view count of a YouTube video using a small feature set collected from a synchronous sharing tool. We hypothesize that videos which have a high YouTube view count will exhibit a unique sharing pattern when shared in synchronous environments. Using a one-day sample of 2,188 dyadic sessions from the Yahoo! Zync synchronous sharing tool, we demonstrate how to predict the video's view count on YouTube, specifically whether a video has over 10 million views. The prediction model is 95.8% accurate and was built with a relatively small training set: only 15% of the videos had more than one session viewing. In effect, the classifier had a precision of 76.4% and a recall of 81%. We describe a prediction model that relies on implicit social shared viewing behavior, such as how many times a video was paused, rewound, or fast-forwarded, as well as the duration of the session. Finally, we present some new directions for future virality research and for the design of future social media tools.
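Accuracy, precision and recall can diverge when one class is rare, which is why the abstract reports all three. The sketch below computes them from confusion-matrix counts. The counts are invented for illustration and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "precision": tp / (tp + fp),     # share of predicted positives that are real
        "recall": tp / (tp + fn),        # share of real positives that are found
    }


# Hypothetical counts: 10 positives (e.g. videos over the view threshold) out of 100.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

With these made-up counts, accuracy is 0.96 while precision and recall are both 0.80: the many true negatives lift accuracy well above the two positive-class measures.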


Voices of Vlogging

AAAI Conferences

Vlogs have rapidly evolved from the ‘chat from your bedroom’ format to a highly creative form of expression and communication. However, despite the high popularity of vlogging, automatic analysis of conversational vlogs has not been attempted in the literature. In this paper, we present a novel analysis of conversational vlogs based on the characterization of vloggers’ nonverbal behavior. We investigate the use of four nonverbal cues extracted automatically from the audio channel to measure the behavior of vloggers and explore the relation to their degree of popularity and that of their videos. Our study is validated on over 2200 videos and 150 hours of data, and shows that one nonverbal cue (speaking time) is correlated with levels of popularity with a medium size effect.


Scalable Social Analytics for Live Viral Event Prediction

AAAI Conferences

Large-scale, predictive social analytics have proven effective. Over the last decade, research and industrial efforts have understood the potential value of inferences based on online behavior analysis, sentiment mining, influence analysis, epidemic spread, etc. The majority of these efforts, however, are not yet designed with real-time responsiveness as a first-order requirement. Typical systems perform a post-mortem analysis on volumes of historical data and validate their “predictions” against already-occurred events. We observe that in many applications, real-time predictions are critical and delays of hours (and even minutes) can reduce their utility. As examples: political campaigns could react very quickly to a scandal spreading on Facebook; content distribution networks (CDNs) could prefetch videos that are predicted to soon go viral; online advertisement campaigns could be corrected to enhance consumer reception. This paper proposes CrowdCast, a cloud-based framework to enable real-time analysis and prediction from streaming social data. As an instantiation of this framework, we tune CrowdCast to observe Twitter tweets, and predict which YouTube videos are most likely to “go viral” in the near future. To this end, CrowdCast first applies online machine learning to map natural language tweets to a specific YouTube video. Then, tweets that indeed refer to videos are weighted by the perceived “influence” of the sender. Finally, the video’s spread is predicted through a sociological model, derived from the emerging structure of the graph over which the video-related tweets are (still) spreading. Combining metrics of influence and live structure, CrowdCast outputs sets of candidate videos, identified as likely to become viral in the next few hours. We monitor Twitter for more than 30 days, and find that CrowdCast’s real-time predictions demonstrate encouraging correlation with actual YouTube viewership in the near future.
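The influence-weighting step described above can be sketched as follows. This is not CrowdCast's implementation: the abstract does not define "influence" or the sociological spread model, so the sketch assumes a hypothetical weight of `log1p(follower_count)` per tweet and omits the graph-structure component entirely. The function name and input format are also assumptions.

```python
import math
from collections import defaultdict


def rank_candidates(tweets, top_k=3):
    """Rank videos by the summed influence of the tweets that mention them.

    tweets: iterable of (video_id, follower_count), i.e. tweets already
    mapped to a video. The log1p(follower_count) weight is an assumption.
    """
    scores = defaultdict(float)
    for video_id, follower_count in tweets:
        scores[video_id] += math.log1p(follower_count)
    # Highest accumulated influence first.
    return sorted(scores, key=lambda v: scores[v], reverse=True)[:top_k]


# Toy stream, invented for illustration only.
stream = [("v1", 10), ("v1", 10), ("v2", 100000), ("v3", 5)]
candidates = rank_candidates(stream, top_k=2)
```

In the toy stream, a single mention from a high-follower account outranks two mentions from small accounts, which is the effect influence weighting is meant to capture.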


The Lifecyle of a Youtube Video: Phases, Content and Popularity

AAAI Conferences

This paper proposes a new representation to explain and predict popularity evolution in social media. Recent work on social networks has led to insights about the popularity of a digital item. For example, both the content and the network matter, and gaining early popularity is critical. However, these observations do not paint a full picture of popularity evolution; some open questions include: what kind of popularity trends exist among different types of videos, and will an unpopular video become popular? To this end, we propose a novel phase representation that extends the well-known endogenous growth and exogenous shock model (Crane and Sornette 2008). We further propose efficient algorithms to simultaneously estimate and segment power-law shaped phases from historical popularity data. With the extracted phases, we find that videos go through not one, but multiple stages of popularity increase or decrease over many months. On a dataset containing the 2-year history of over 172,000 YouTube videos, we observe that phases are directly related to content type and popularity change, e.g., nearly 3/4 of the top 5% popular videos have 3 or more phases, more than 60% of news videos are dominated by one long power-law decay, and 75% of videos that made a significant jump to become the most popular videos have been in increasing phases. Finally, we leverage this phase representation to predict future viewcount gain and find that using phase information reduces the average prediction error over the state-of-the-art for videos of all phase shapes.
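To make "power-law shaped phase" concrete, the sketch below fits a single curve y = a * t^b to one segment of daily view counts by ordinary least squares in log-log space. This is a swapped-in textbook technique, not the authors' algorithm, which jointly estimates and segments multiple phases. The function name and the synthetic data are assumptions.

```python
import math


def fit_power_law(ts, ys):
    """Least-squares fit of y = a * t**b using a straight line in log-log space.

    ts, ys: equal-length sequences of positive numbers. Returns (a, b).
    """
    xs = [math.log(t) for t in ts]
    ls = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(ls) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, ls))
    b = sxy / sxx                      # slope = power-law exponent
    a = math.exp(mean_l - b * mean_x)  # intercept = log of the scale factor
    return a, b


# Synthetic decaying phase, invented for illustration: 500 * t**-0.7.
days = list(range(1, 11))
views = [500 * t ** -0.7 for t in days]
a, b = fit_power_law(days, views)
```

A negative exponent b corresponds to a decaying phase and a positive one to an increasing phase. Real view counts are noisy and can contain zeros, which this log-space fit does not handle.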