Yahoo! Research
Viral Actions: Predicting Video View Counts Using Synchronous Sharing Behaviors
Shamma, David A. (Yahoo! Research) | Yew, Jude (University of Michigan) | Kennedy, Lyndon (Yahoo! Research) | Churchill, Elizabeth F. (Yahoo! Research)
In this article, we present a method for predicting the view count of a YouTube video using a small feature set collected from a synchronous sharing tool. We hypothesize that videos with a high YouTube view count exhibit a distinctive sharing pattern when shared in synchronous environments. Using a one-day sample of 2,188 dyadic sessions from the Yahoo! Zync synchronous sharing tool, we demonstrate how to predict a video's view count on YouTube, specifically whether a video has more than 10 million views. The prediction model is 95.8% accurate, with a precision of 76.4% and a recall of 81%, and it was trained on a relatively small set in which only 15% of the videos had more than one viewing session. The model relies on implicit social shared-viewing behavior, such as how many times a video was paused, rewound, or fast-forwarded, as well as the duration of the session. Finally, we present some new directions for future virality research and for the design of future social media tools.
Designing Markets for Prediction
Chen, Yiling (Harvard University) | Pennock, David M. (Yahoo! Research)
In this article, we survey a number of mechanisms created to elicit predictions, many newly proposed within the last decade. We focus on the engineering questions: How do they work and why? What factors and goals are most important in their design? The primary goal of a prediction mechanism is to obtain and aggregate dispersed information, which often exists in tacit forms as beliefs, opinions, or judgements of agents. Coalescing information is a necessary first step for decision making in almost all domains. For example, consider seasonal influenza, a significant cause of illness and death around the world. Although it recurs every year, the geographic location, timing, magnitude, and duration of outbreaks vary widely. Many people possess relevant pieces of the full information puzzle, including doctors who meet patients, clinical microbiologists who perform respiratory culture tests, pharmacists who fill prescriptions, people who have the flu, and people who know people who have the flu.
Prioritization of Domain-Specific Web Information Extraction
Huang, Jian (Pennsylvania State University) | Yu, Cong (Yahoo! Research)
It is often desirable to extract structured information from raw web pages for better information browsing, query answering, and pattern mining. However, many such Information Extraction (IE) technologies are costly, and applying them at web scale is impractical. In this paper, we propose a novel prioritization approach in which candidate pages from the corpus are ordered according to their expected contribution to the extraction results, and those with higher estimated potential are extracted earlier. Systems employing this approach can stop the extraction process at any time when resources become scarce (i.e., not all pages in the corpus can be processed), without worrying about wasting extraction effort on unimportant pages. More specifically, we define a novel notion for measuring the value of extraction results and design various mechanisms for estimating a candidate page’s contribution to this value. We further design and build the Extraction Prioritization (EP) system with efficient scoring and scheduling algorithms, and experimentally demonstrate that EP significantly outperforms the naive approach and is more flexible than the classifier approach.
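The general idea of ordering candidate pages by estimated contribution and stopping when a resource budget runs out can be sketched as below. This is only an illustration under stated assumptions, not the EP system's actual scoring or scheduling algorithm: the names `prioritized_extraction`, `estimate_value`, `extract`, and `budget` are hypothetical, and scores are computed once up front rather than updated as extraction proceeds.

```python
import heapq
from typing import Callable, List, Sequence, TypeVar

Page = TypeVar("Page")
Result = TypeVar("Result")


def prioritized_extraction(
    pages: Sequence[Page],
    estimate_value: Callable[[Page], float],  # hypothetical scoring function
    extract: Callable[[Page], Result],        # hypothetical (costly) extractor
    budget: int,                              # hypothetical cap on pages processed
) -> List[Result]:
    """Extract from the highest-scoring pages first, stopping at the budget."""
    # heapq is a min-heap, so scores are negated to pop the largest first.
    # The index breaks ties and avoids comparing page objects directly.
    heap = [(-estimate_value(page), index, page) for index, page in enumerate(pages)]
    heapq.heapify(heap)

    results: List[Result] = []
    while heap and budget > 0:
        _, _, page = heapq.heappop(heap)
        results.append(extract(page))
        budget -= 1
    return results
```

Because pages are processed in descending order of estimated value, cutting the loop short at any point leaves only the lower-scored pages unprocessed, which is the property the abstract describes.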
AAAI 2008 Workshop Reports
Anand, Sarabjot Singh (University of Warwick) | Bunescu, Razvan C. (Ohio University) | Carvalho, Vitor R. (Microsoft Live Labs) | Chomicki, Jan (University of Buffalo) | Conitzer, Vincent (Duke University) | Cox, Michael T. (BBN Technologies) | Dignum, Virginia (Utrecht University) | Dodds, Zachary (Harvey Mudd College) | Dredze, Mark (University of Pennsylvania) | Furcy, David (University of Wisconsin Oshkosh) | Gabrilovich, Evgeniy (Yahoo! Research) | Göker, Mehmet H. (PricewaterhouseCoopers) | Guesgen, Hans Werner (Massey University) | Hirsh, Haym (Rutgers University) | Jannach, Dietmar (Dortmund University of Technology) | Junker, Ulrich (ILOG) | Ketter, Wolfgang (Erasmus University) | Kobsa, Alfred (University of California, Irvine) | Koenig, Sven (University of Southern California) | Lau, Tessa (IBM Almaden Research Center) | Lewis, Lundy (Southern New Hampshire University) | Matson, Eric (Purdue University) | Metzler, Ted (Oklahoma City University) | Mihalcea, Rada (University of North Texas) | Mobasher, Bamshad (DePaul University) | Pineau, Joelle (McGill University) | Poupart, Pascal (University of Waterloo) | Raja, Anita (University of North Carolina at Charlotte) | Ruml, Wheeler (University of New Hampshire) | Sadeh, Norman M. (Carnegie Mellon University) | Shani, Guy (Microsoft Research) | Shapiro, Daniel (Applied Reactivity, Inc.) | Smith, Trey (Carnegie Mellon University West) | Taylor, Matthew E. (University of Southern California) | Wagstaff, Kiri (Jet Propulsion Laboratory) | Walsh, William (CombineNet) | Zhou, Ron (Palo Alto Research Center)
AAAI was pleased to present the AAAI-08 Workshop Program, held Sunday and Monday, July 13–14, in Chicago, Illinois, USA. The program included the following 15 workshops: Advancements in POMDP Solvers; AI Education Workshop Colloquium; Coordination, Organizations, Institutions, and Norms in Agent Systems; Enhanced Messaging; Human Implications of Human-Robot Interaction; Intelligent Techniques for Web Personalization and Recommender Systems; Metareasoning: Thinking about Thinking; Multidisciplinary Workshop on Advances in Preference Handling; Search in Artificial Intelligence and Robotics; Spatial and Temporal Reasoning; Trading Agent Design and Analysis; Transfer Learning for Complex Tasks; What Went Wrong and Why: Lessons from AI Research and Applications; and Wikipedia and Artificial Intelligence: An Evolving Synergy.