Counterfactual Inference for Consumer Choice Across Many Product Categories Machine Learning

This paper proposes a method for estimating consumer preferences among discrete choices, where the consumer chooses at most one product in a category but selects from multiple categories in parallel. The consumer's utility is additive across categories. Her preferences over product attributes, as well as her price sensitivity, vary across products and are in general correlated across products. We build on techniques from the machine learning literature on probabilistic models of matrix factorization, extending the methods to account for time-varying product attributes and products going out of stock. We evaluate the performance of the model using held-out data from weeks with price changes or out-of-stock products. We show that our model improves over traditional modeling approaches that consider each category in isolation. One source of the improvement is the ability of the model to accurately estimate heterogeneity in preferences (by pooling information across categories); another is its ability to estimate the preferences of consumers who have rarely or never made a purchase in a given category in the training data. Using held-out data, we show that our model can accurately distinguish which consumers are most price sensitive for a given product. We consider counterfactuals such as personally targeted price discounts, showing that using a richer model such as the one we propose substantially increases the benefits of personalization in discounts.
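
To make the setup concrete, here is a minimal sketch of a factorized choice model for a single category. It is not the paper's specification: the utility form (a dot product of latent consumer and product vectors minus a consumer-and-product-specific price sensitivity times log price), the names `theta_u`, `gamma_u`, `alphas`, `betas`, and all numbers are assumptions made for illustration. The outside option ("buy nothing") has utility 0, which reflects the "at most one product" restriction, and out-of-stock products are dropped from the choice set.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def choice_probabilities(theta_u, gamma_u, alphas, betas, prices, in_stock):
    """Return (P(no purchase), [P(product j)]) for one consumer in one category.

    Hypothetical utility: u_j = theta_u . alpha_j - (gamma_u . beta_j) * log(price_j).
    """
    utilities = []
    for alpha_j, beta_j, price_j, available in zip(alphas, betas, prices, in_stock):
        if not available:
            utilities.append(None)  # out-of-stock products cannot be chosen
            continue
        price_sensitivity = dot(gamma_u, beta_j)
        utilities.append(dot(theta_u, alpha_j) - price_sensitivity * math.log(price_j))

    # Softmax over the available products plus the outside option (utility 0).
    available_utils = [u for u in utilities if u is not None]
    shift = max([0.0] + available_utils)  # subtract the max for numerical stability
    exp_outside = math.exp(-shift)
    exps = [0.0 if u is None else math.exp(u - shift) for u in utilities]
    total = exp_outside + sum(exps)
    return exp_outside / total, [e / total for e in exps]

# Illustrative (made-up) latent vectors for one consumer and three products.
theta_u = [1.0, 0.5]
gamma_u = [0.8, 0.2]
alphas = [[0.5, 1.0], [1.0, 0.0], [0.2, 0.3]]
betas = [[1.0, 0.5], [0.5, 1.0], [1.0, 1.0]]
in_stock = [True, True, False]

p_none, p_base = choice_probabilities(theta_u, gamma_u, alphas, betas, [2.0, 3.0, 1.5], in_stock)
# Counterfactual: a targeted discount on the first product.
_, p_discount = choice_probabilities(theta_u, gamma_u, alphas, betas, [1.5, 3.0, 1.5], in_stock)
```

Because utility is additive across categories, a sketch like this would be evaluated once per category. Comparing `p_discount` with `p_base` is the simplest form of the price counterfactual described above.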

Feature Detection and Attenuation in Embeddings Machine Learning

Embeddings are a fundamental building block for data analysis tasks. Although most embedding schemes are designed for a specific domain, they have recently been extended to represent data from various other research domains. However, there has been relatively little discussion of how to analyze the generated embeddings or how to remove undesired features from them. In this paper, we first propose an embedding analysis method that quantitatively measures the features present in embedding data. We then propose an unsupervised method that removes or attenuates undesired features in the embedding by applying a Domain Adversarial Network (DAN). Our empirical results demonstrate that the proposed algorithm performs well on both industry and natural language processing benchmark datasets.

Sales forecasting and risk management under uncertainty in the media industry Machine Learning

In this work we propose a data-driven modeling approach for managing a firm's advertising investments. First, we apply dynamic linear models to the prediction of an economic variable, such as global sales, using information from the environment and the company's investment levels in different channels. After building a robust and precise model, we propose a risk metric that can help the firm manage its advertising plans, leading to a robust, risk-aware optimization of its revenue.
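
As an illustration of what a dynamic linear model looks like in this setting, the sketch below filters a dynamic regression of sales on advertising spend in a single channel. The state is a baseline sales level and a spend effect, both following random walks. The function name `dlm_filter`, the variance settings, and the synthetic data are assumptions for illustration and are not taken from the paper.

```python
def dlm_filter(sales, spend, obs_var=1.0, level_var=0.1, effect_var=0.01):
    """Kalman filter for y_t = level_t + effect_t * x_t + noise.

    Both state components follow random walks. Returns the final state mean,
    its covariance, and the one-step-ahead forecasts as (mean, variance) pairs.
    """
    m = [0.0, 0.0]                      # state mean: [level, effect]
    C = [[1e4, 0.0], [0.0, 1e4]]        # vague prior covariance
    forecasts = []
    for y, x in zip(sales, spend):
        F = [1.0, x]                    # regression vector at time t
        # Prior covariance at time t: the random-walk evolution adds state noise.
        R = [[C[0][0] + level_var, C[0][1]],
             [C[1][0], C[1][1] + effect_var]]
        f = F[0] * m[0] + F[1] * m[1]   # one-step forecast mean
        RF = [R[0][0] * F[0] + R[0][1] * F[1],
              R[1][0] * F[0] + R[1][1] * F[1]]
        Q = F[0] * RF[0] + F[1] * RF[1] + obs_var   # one-step forecast variance
        forecasts.append((f, Q))
        A = [RF[0] / Q, RF[1] / Q]      # Kalman gain
        e = y - f                       # forecast error
        m = [m[0] + A[0] * e, m[1] + A[1] * e]
        C = [[R[i][j] - A[i] * A[j] * Q for j in range(2)] for i in range(2)]
    return m, C, forecasts

# Synthetic example: sales = 10 + 2 * spend, with no noise.
spend = [1.0, 2.0, 3.0, 4.0, 5.0] * 4
sales = [10.0 + 2.0 * x for x in spend]
state_mean, state_cov, forecasts = dlm_filter(sales, spend)
```

The one-step forecast variance `Q` is the kind of quantity a risk measure could draw on; the risk metric the authors actually propose is not specified in this abstract.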

SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements Machine Learning

We develop SHOPPER, a sequential probabilistic model of market baskets. SHOPPER uses interpretable components to model the forces that drive how a customer chooses products; in particular, we design SHOPPER to capture how items interact with other items. We develop an efficient posterior inference algorithm to estimate these forces from large-scale data, and we analyze a large dataset from a major grocery store chain. We are interested in answering counterfactual queries about changes in prices. We find that SHOPPER provides accurate predictions even under price interventions and that it helps identify complementary and substitutable pairs of products.
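
The idea of a sequential basket model with item interactions can be sketched as follows. This is a simplified illustration, not the SHOPPER model itself: it keeps only a per-item popularity term and an interaction term between a candidate item and the average latent vector of the items already in the basket, and it omits prices, customer preferences, and the decision to stop shopping. The names `popularity`, `rho`, `alpha`, and all values are hypothetical.

```python
import math

def next_item_probabilities(basket, popularity, rho, alpha):
    """Softmax over items not yet in the basket.

    Score of item c = popularity[c] + rho[c] . mean(alpha[j] for j in basket).
    A large positive interaction suggests a complement; a negative one, a substitute.
    """
    candidates = [c for c in popularity if c not in basket]
    context = None
    if basket:
        dim = len(alpha[basket[0]])
        context = [sum(alpha[j][d] for j in basket) / len(basket) for d in range(dim)]

    scores = {}
    for c in candidates:
        score = popularity[c]
        if context is not None:
            score += sum(r * a for r, a in zip(rho[c], context))
        scores[c] = score

    shift = max(scores.values())  # subtract the max for numerical stability
    total = sum(math.exp(s - shift) for s in scores.values())
    return {c: math.exp(s - shift) / total for c, s in scores.items()}

# Made-up latent vectors: buns interact positively with hot dogs, cola does not.
popularity = {"hot_dogs": 0.0, "buns": 0.0, "cola": 0.0}
alpha = {"hot_dogs": [1.0, 0.0], "buns": [0.9, 0.1], "cola": [0.0, 1.0]}
rho = {"hot_dogs": [2.0, 0.0], "buns": [2.0, 0.0], "cola": [-0.5, 0.5]}

empty_basket = next_item_probabilities([], popularity, rho, alpha)
after_hot_dogs = next_item_probabilities(["hot_dogs"], popularity, rho, alpha)
```

With an empty basket the three items are equally likely under these values; once hot dogs are in the basket, the interaction term shifts probability toward buns and away from cola.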

Transfer Topic Modeling with Ease and Scalability Machine Learning

The increasing volume of short texts generated on social media sites, such as Twitter or Facebook, creates a great demand for effective and efficient topic modeling approaches. While latent Dirichlet allocation (LDA) can be applied, it is not optimal because it handles short texts with fast-changing topics poorly and raises scalability concerns. In this paper, we propose a transfer learning approach that utilizes abundant labeled documents from other domains (such as Yahoo! News or Wikipedia) to improve topic modeling, with better model fitting and result interpretation. Specifically, we develop the Transfer Hierarchical LDA (thLDA) model, which incorporates the label information from other domains via informative priors. In addition, we develop a parallel implementation of our model for large-scale applications. We demonstrate the effectiveness of our thLDA model on both a microblogging dataset and standard text collections, including the AP and RCV1 datasets.
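
The role of an informative prior can be illustrated in isolation. The sketch below is not thLDA: it only shows how word counts from labeled source-domain documents might be turned into Dirichlet pseudo-counts for one topic, and how those pseudo-counts shape the posterior mean word distribution when the target domain has very little text. The function names, the `strength` and `base` parameters, and the toy documents are assumptions for illustration.

```python
from collections import Counter

def informative_prior(source_docs, vocab, strength=10.0, base=0.01):
    """Dirichlet pseudo-counts for one topic, built from labeled source documents.

    Each word gets a small base count plus a share of `strength` proportional
    to its frequency in the source documents carrying this label.
    """
    counts = Counter(w for doc in source_docs for w in doc if w in vocab)
    total = sum(counts.values())
    return {w: base + (strength * counts[w] / total if total else 0.0) for w in vocab}

def posterior_word_probs(target_counts, prior):
    """Posterior mean of the topic's word distribution under a Dirichlet prior."""
    denom = sum(target_counts.get(w, 0) + prior[w] for w in prior)
    return {w: (target_counts.get(w, 0) + prior[w]) / denom for w in prior}

vocab = ["goal", "match", "team", "vote"]
sports_docs = [["goal", "match", "goal"], ["match", "team"]]  # toy labeled source data
one_short_post = {"vote": 1}                                  # sparse target-domain evidence

informed = posterior_word_probs(one_short_post, informative_prior(sports_docs, vocab))
flat = posterior_word_probs(one_short_post, informative_prior([], vocab))
```

With the flat prior, a single observed word dominates the topic; with the source-informed prior, the topic stays anchored to the labeled domain until more target evidence accumulates.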