Stern School of Business, New York University
Large Scale Cross-Category Analysis of Consumer Review Content on Sales Conversion Leveraging Deep Learning
Liu, Xiao (Stern School of Business, New York University) | Lee, Dokyun (Carnegie Mellon University) | Srinivasan, Kannan (Carnegie Mellon University)
Consumers often rely on product reviews to make purchase decisions, but how consumers use review content in their decision making has remained a black box. In the past, extracting information from product reviews has been a labor-intensive process that has restricted studies on this topic to single product categories or to summary statistics such as volume, valence, and ratings. This paper uses deep learning natural language processing techniques to overcome the limitations of manual information extraction and shed light on the black box of how consumers use review content. With the help of a comprehensive dataset that tracks individual-level review-reading, search, and purchase behaviors on an e-commerce portal, we extract six quality and price content dimensions from over 500,000 reviews, covering nearly 600 product categories. The scale, scope, and precision of such a study would have been impractical using human coders or classical machine learning models. We achieve two objectives. First, we describe consumers’ review content reading behaviors. We find that although consumers do not read review content all the time, they do rely on review content for products that are expensive or of uncertain quality. Second, we quantify the causal impact on sales of the content information in the reviews that consumers read. We use a regression discontinuity in time design and leverage the variation in the review content seen by consumers due to newly added reviews. To extract content information, we develop two deep learning models: a full deep learning model that predicts conversion directly and a partial deep learning model that identifies content dimensions. Across both models, we find that aesthetics and price content in the reviews significantly affect conversion across almost all product categories. Review content information has a higher impact on sales when the average rating is higher and the variance of ratings is lower. Consumers depend more on review content when the market is more competitive or immature. A counterfactual simulation suggests that re-ordering reviews based on content can have the same effect as a 1.6% price cut for boosting conversion.
HodgeRank With Information Maximization for Crowdsourced Pairwise Ranking Aggregation
Xu, Qianqian (Institute of Information Engineering, CAS, Beijing) | Xiong, Jiechao (BICMR and School of Mathematical Sciences, Peking University, Beijing) | Chen, Xi (BICMR and School of Mathematical Sciences, Peking University, Beijing) | Huang, Qingming (Stern School of Business, New York University) | Yao, Yuan (University of Chinese Academy of Sciences, Beijing)
Recently, crowdsourcing has emerged as an effective paradigm for human-powered, large-scale problem solving in various domains. However, a task requester usually has a limited budget, so it is desirable to have a policy that allocates the budget wisely to achieve better quality. In this paper, we study the principle of information maximization for active sampling strategies in the framework of HodgeRank, an approach based on the Hodge decomposition of pairwise ranking data with multiple workers. The principle yields two scenarios of active sampling: Fisher information maximization, which leads to unsupervised sampling based on a sequential maximization of graph algebraic connectivity without considering labels; and Bayesian information maximization, which selects samples with the largest information gain from prior to posterior and thus gives a supervised sampling that involves the labels collected. Experiments show that the proposed methods boost sampling efficiency compared to traditional sampling schemes and are thus valuable to practical crowdsourcing experiments.
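The unsupervised scenario described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an unweighted comparison multigraph, a brute-force search over all candidate pairs, and a plain least-squares score fit under the convention that a label y for the pair (i, j) approximates s_i - s_j. The function names (`laplacian`, `next_pair_unsupervised`, `least_squares_scores`) are hypothetical, and the Bayesian (supervised) criterion is not shown.

```python
import itertools
import numpy as np

def laplacian(n, edges):
    """Unnormalized Laplacian of a comparison multigraph on n items."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    return L

def algebraic_connectivity(L):
    # Second-smallest eigenvalue of the symmetric Laplacian (Fiedler value).
    return np.linalg.eigvalsh(L)[1]

def next_pair_unsupervised(n, edges):
    """Greedy step: pick the pair whose addition maximizes the Fiedler value.

    Labels are never consulted, which is what makes the step unsupervised.
    """
    base = laplacian(n, edges)
    best_pair, best_val = None, -np.inf
    for i, j in itertools.combinations(range(n), 2):
        L = base.copy()
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
        val = algebraic_connectivity(L)
        if val > best_val + 1e-12:  # small tolerance keeps the first of tied pairs
            best_pair, best_val = (i, j), val
    return best_pair, best_val

def least_squares_scores(n, edges, labels):
    """Global scores s minimizing sum over pairs of (s_i - s_j - y_ij)^2.

    Assumed convention: y_ij > 0 means item i is preferred to item j.
    """
    D = np.zeros((len(edges), n))
    for k, (i, j) in enumerate(edges):
        D[k, i], D[k, j] = 1.0, -1.0
    s, *_ = np.linalg.lstsq(D, np.asarray(labels, dtype=float), rcond=None)
    return s - s.mean()  # scores are identifiable only up to a constant shift

# Toy usage: four items compared along a path 0-1-2-3.
path_edges = [(0, 1), (1, 2), (2, 3)]
chosen_pair, fiedler_value = next_pair_unsupervised(4, path_edges)
toy_scores = least_squares_scores(3, [(0, 1), (1, 2), (0, 2)], [1.0, 1.0, 2.0])
```

In the toy usage, the greedy step closes the path into a cycle, which is the single added comparison that best improves connectivity. The exhaustive search recomputes a full eigendecomposition for every candidate pair, so it is only suitable for small item sets.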