AITopics

Free trial promotions, where users are given a limited time to try the product for free, are a commonly used customer acquisition strategy in the Software as a Service (SaaS) industry. We examine how trial length affects users' responsiveness, and seek to quantify the gains from personalizing the length of the free trial promotions. Our data come from a large-scale field experiment conducted by a leading SaaS firm, where new users were randomly assigned to 7, 14, or 30 days of free trial. First, we show that the 7-day trial to all consumers is the best uniform policy, with a 5.59% increase in subscriptions. Next, we develop a three-pronged framework for personalized policy design and evaluation. Using our framework, we develop seven personalized targeting policies based on linear regression, lasso, CART, random forest, XGBoost, causal tree, and causal forest, and evaluate their performances using the Inverse Propensity Score (IPS) estimator. We find that the personalized policy based on lasso performs the best, followed by the one based on XGBoost. In contrast, policies based on causal tree and causal forest perform poorly. We then link a method's effectiveness in designing policy with its ability to personalize the treatment sufficiently without over-fitting (i.e., capturing spurious heterogeneity). Next, we segment consumers based on their optimal trial length and derive some substantive insights on the drivers of user behavior in this context. Finally, we show that policies designed to maximize short-run conversions also perform well on long-run outcomes such as consumer loyalty and profitability.
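The IPS evaluation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a randomized experiment with known assignment probabilities and a binary subscription outcome; the function name `ips_policy_value` and the toy data are hypothetical and are not taken from the paper.

```python
def ips_policy_value(assigned, recommended, outcomes, propensity):
    """Inverse propensity score (IPS) estimate of a policy's mean outcome.

    assigned    -- treatment each user actually received in the experiment
    recommended -- treatment the policy under evaluation would assign
    outcomes    -- observed outcome per user (e.g. 1 = subscribed)
    propensity  -- dict: treatment -> probability of being assigned to it
    """
    total = 0.0
    for w, pi, y in zip(assigned, recommended, outcomes):
        # Only users whose random assignment matches the policy contribute,
        # reweighted by the inverse of their assignment probability.
        if w == pi:
            total += y / propensity[w]
    return total / len(outcomes)


# Hypothetical toy data: six users, equal assignment probabilities.
assigned = [7, 14, 30, 7, 14, 30]
outcomes = [1, 0, 0, 0, 1, 1]
propensity = {7: 1 / 3, 14: 1 / 3, 30: 1 / 3}

uniform_7 = ips_policy_value(assigned, [7] * 6, outcomes, propensity)
personalized = ips_policy_value(
    assigned, [7, 7, 30, 14, 14, 7], outcomes, propensity
)
```

Because assignment is random, the reweighted average over matching users is an unbiased estimate of what the policy would achieve if deployed to everyone, which is what lets uniform and personalized policies be compared on the same data.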

artificial intelligence, machine learning, treatment effect, (19 more...)

2006.1342

Country:

Europe > Germany (0.04)
Asia > Japan (0.04)
Oceania > New Zealand (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength High (0.86)

Industry: Information Technology > Software (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Scharwächter, Erik, Lennartz, Jonathan, Müller, Emmanuel

Differentiable Segmentation of Sequences

Segmented models are widely used to describe non-stationary sequential data with discrete change points. Their estimation usually requires solving a mixed discrete-continuous optimization problem, where the segmentation is the discrete part and all other model parameters are continuous. A number of estimation algorithms have been developed that are highly specialized for their specific model assumptions. The dependence on non-standard algorithms makes it hard to integrate segmented models in state-of-the-art deep learning architectures that critically depend on gradient-based optimization techniques. In this work, we formulate a relaxed variant of segmented models that enables joint estimation of all model parameters, including the segmentation, with gradient descent. We build on recent advances in learning continuous warping functions and propose a novel family of warping functions based on the two-sided power (TSP) distribution. TSP-based warping functions are differentiable, have simple closed-form expressions, and can represent segmentation functions exactly. Our formulation includes the important class of segmented generalized linear models as a special case, which makes it highly versatile. We use our approach to model the spread of COVID-19 by segmented Poisson regression, perform logistic regression on Fashion-MNIST with artificial concept drift, and demonstrate its capacities for phoneme segmentation.
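To make the "simple closed-form expressions" claim above concrete, here is a minimal sketch of the cumulative distribution function of a two-sided power distribution on an interval. It assumes the standard TSP form with support [a, b], mode m, and power n; the paper's own parameterization and the way it composes such functions into a warping are not given in the abstract and are not reproduced here.

```python
def tsp_cdf(x, a, m, b, n):
    """CDF of a two-sided power (TSP) distribution.

    Assumed parameterization: support [a, b], mode m with a < m < b,
    power n > 0. The function rises smoothly from 0 at a to 1 at b.
    """
    if not (a < m < b) or n <= 0:
        raise ValueError("require a < m < b and n > 0")
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= m:
        # Left branch: power function scaled to reach (m - a) / (b - a) at m.
        return (m - a) / (b - a) * ((x - a) / (m - a)) ** n
    # Right branch: mirrored power function reaching 1 at b.
    return 1.0 - (b - m) / (b - a) * ((b - x) / (b - m)) ** n
```

Both branches are polynomials in x and in the parameters, so gradients are available in closed form, and as the support [a, b] narrows around a point the curve approaches a hard step at that point, which is the behaviour a relaxed change point needs.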

artificial intelligence, change point, machine learning, (17 more...)

2006.13105

Country:

Asia > Singapore (0.14)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre:

Research Report > Experimental Study (0.66)
Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.49)
Health & Medicine > Therapeutic Area > Immunology (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Molinari, Roberto, Bakalli, Gaetan, Guerrier, Stéphane, Miglioli, Cesare, Orso, Samuel, Scaillet, Olivier

SWAG: A Wrapper Method for Sparse Learning

Predictive power has always been the main research focus of learning algorithms. While the general approach for these algorithms is to consider all possible attributes in a dataset to best predict the response of interest, an important branch of research is focused on sparse learning. Indeed, in many practical settings we believe that only an extremely small combination of different attributes affects the response. However, even sparse-learning methods can still preserve a high number of attributes in high-dimensional settings and possibly deliver inconsistent prediction performance. The latter methods can also be hard to interpret for researchers and practitioners, a problem which is even more relevant for the "black-box"-type mechanisms of many learning approaches. Finally, there is often a problem of replicability since not all data-collection procedures measure (or observe) the same attributes and therefore cannot make use of proposed learners for testing purposes. To address all the previous issues, we propose to study a procedure that combines screening and wrapper methods and aims to find a library of extremely low-dimensional attribute combinations (with consequent low data collection and storage costs) in order to (i) match or improve the predictive performance of any particular learning method which uses all attributes as an input (including sparse learners); (ii) provide a low-dimensional network of attributes easily interpretable by researchers and practitioners; and (iii) increase the potential replicability of results due to a diversity of attribute combinations defining strong learners with equivalent predictive power. We call this algorithm "Sparse Wrapper AlGorithm" (SWAG).

artificial intelligence, learner, machine learning, (16 more...)

2006.12837

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Chzhen, Evgenii, Denis, Christophe, Hebiri, Mohamed, Oneto, Luca, Pontil, Massimiliano

Fair Regression with Wasserstein Barycenters

We study the problem of learning a real-valued function that satisfies the Demographic Parity constraint. It requires the distribution of the predicted output to be independent of the sensitive attribute. We consider the case where the sensitive attribute is available for prediction. We establish a connection between fair regression and optimal transport theory, based on which we derive a closed-form expression for the optimal fair predictor. Specifically, we show that the distribution of this optimum is the Wasserstein barycenter of the distributions induced by the standard regression function on the sensitive groups. This result offers an intuitive interpretation of the optimal fair prediction and suggests a simple post-processing algorithm to achieve fairness. We establish risk and distribution-free fairness guarantees for this procedure. Numerical experiments indicate that our method is very effective in learning fair models, with a relative increase in error rate that is smaller than the relative gain in fairness.
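The post-processing idea in the abstract above can be sketched for one-dimensional predictions, where the Wasserstein barycenter is obtained by averaging group quantile functions. The sketch maps each base prediction to its rank within its own group and then to the weighted average of all groups' quantiles at that rank. The function names and the empirical quantile convention are assumptions; the paper's estimator may differ in details such as tie-breaking and sample splitting.

```python
import math
from bisect import bisect_right


def empirical_quantile(sorted_vals, q):
    """Smallest sample value whose empirical CDF is at least q."""
    n = len(sorted_vals)
    idx = min(n - 1, max(0, math.ceil(q * n) - 1))
    return sorted_vals[idx]


def fair_postprocess(preds_by_group):
    """Map base predictions to barycenter-based fair predictions.

    preds_by_group -- dict: sensitive group -> list of base predictions.
    Returns a dict with the same keys and list order.
    """
    total = sum(len(v) for v in preds_by_group.values())
    sorted_by_group = {s: sorted(v) for s, v in preds_by_group.items()}
    weights = {s: len(v) / total for s, v in preds_by_group.items()}

    fair = {}
    for s, preds in preds_by_group.items():
        own = sorted_by_group[s]
        fair[s] = []
        for p in preds:
            # Rank of the prediction within its own group (empirical CDF).
            q = bisect_right(own, p) / len(own)
            # Weighted average of every group's quantile at that rank.
            fair[s].append(
                sum(
                    weights[t] * empirical_quantile(sorted_by_group[t], q)
                    for t in sorted_by_group
                )
            )
    return fair
```

For example, with base predictions `{"a": [1, 2, 3, 4], "b": [11, 12, 13, 14]}` both groups are mapped to the same set of values, so the output distribution no longer depends on the group, while the ordering of individuals inside each group is preserved.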

artificial intelligence, estimator, machine learning, (14 more...)

2006.07286

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Education > Educational Setting (0.68)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Barik, Adarsh, Honorio, Jean

Exact Support Recovery in Federated Regression with One-shot Communication

arXiv.org Machine Learning, Jun-22-2020

Federated learning provides a framework to address the challenges of distributed computing, data ownership and privacy over a large number of distributed clients with low computational and communication capabilities. In this paper, we study the problem of learning the exact support of sparse linear regression in the federated learning setup. We provide a simple communication-efficient algorithm which only needs one-shot communication with the centralized server to compute the exact support. Our method does not require the clients to solve any optimization problem and thus can be run on devices with low computational capabilities. Our method is naturally robust to the problems of client failure, model poisoning and straggling clients. We formally prove that our method requires a number of samples per client that is polynomial with respect to the support size, but independent of the dimension of the problem. We require the number of distributed clients to be logarithmic in the dimension of the problem. If the predictor variables are mutually independent, then the overall sample complexity matches the optimal sample complexity of the non-federated centralized setting. Furthermore, our method is easy to implement and has an overall polynomial time complexity.

federated learning, pp 1, predictor, (13 more...)

2006.12583

Country:

North America > United States > Virginia (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Zhao, Lingxiao, Akoglu, Leman

Connecting Graph Convolutional Networks and Graph-Regularized PCA

arXiv.org Machine Learning, Jun-22-2020

The graph convolution operator of the GCN model was originally motivated by a localized first-order approximation of spectral graph convolutions. This work takes a different view, establishing a connection between graph convolution and graph-regularized PCA. Based on this connection, the GCN architecture, shaped by stacking graph convolution layers, shares a close relationship with stacking graph-regularized PCA (GPCA). We empirically demonstrate that the unsupervised embeddings by GPCA paired with a logistic regression classifier achieve similar performance to GCN on semi-supervised node classification tasks. Further, we capitalize on the discovered relationship to design an effective initialization strategy for GCN based on stacking GPCA.

artificial intelligence, gpca, machine learning, (12 more...)

2006.12294

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

arXiv.org Machine Learning, Jun-21-2020

Electoral David vs Goliath: How does the Spatial Concentration of Electors affect District-based Elections?

Mitra, Adway

Many democratic countries use district-based elections where there is a "seat" for each district in the governing body. In each district, the party whose candidate gets the maximum number of votes wins the corresponding seat. The result of the election is decided based on the number of seats won by the different parties. The electors (voters) can cast their votes only in the district of their residence. Thus, locations of the electors and boundaries of the districts may severely affect the election result even if the proportion of popular support (number of electors) of different parties remains unchanged. This has led to a significant amount of research on whether the districts may be redrawn or electors may be moved to maximize seats for a particular party. In this paper, we frame the spatial distribution of electors in a probabilistic setting, and explore different models to capture the intra-district polarization of electors in favour of a party, or the spatial concentration of supporters of different parties. Our models are inspired by elections in India, where supporters of different parties tend to be concentrated in certain districts. We show with extensive simulations that our model can capture different statistical properties of real elections held in India. We frame parameter estimation problems to fit our models to the observed election results. Since analytical calculation of the likelihood functions is infeasible for our complex models, we use Likelihood-free Inference methods under the Approximate Bayesian Computation framework. Since this approach is highly time-consuming, we explore how supervised regression using Logistic Regression or Deep Neural Networks can be used to speed it up. We also explore how the election results can change by varying the spatial distributions of the voters, even when the proportions of popular support of the parties remain constant.
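The seat-counting rule described in the abstract above, and the effect of spatial concentration under a fixed popular vote, can be shown with a small deterministic example. This is a toy illustration with hypothetical vote counts, not one of the paper's probabilistic models.

```python
def seats_won(district_votes):
    """Count seats per party; each district goes to its top-voted party.

    district_votes -- list of dicts: party -> votes in that district.
    Ties are resolved in favour of the first party listed (an arbitrary
    choice for this sketch; the example below contains no ties).
    """
    seats = {}
    for votes in district_votes:
        winner = max(votes, key=votes.get)
        seats[winner] = seats.get(winner, 0) + 1
    return seats


def popular_vote(district_votes):
    """Total votes per party summed over all districts."""
    totals = {}
    for votes in district_votes:
        for party, count in votes.items():
            totals[party] = totals.get(party, 0) + count
    return totals


# Five districts of 100 electors; party A has 200 supporters in both cases.
spread_evenly = [{"A": 40, "B": 60} for _ in range(5)]
concentrated = [
    {"A": 67, "B": 33},
    {"A": 67, "B": 33},
    {"A": 66, "B": 34},
    {"A": 0, "B": 100},
    {"A": 0, "B": 100},
]

print(popular_vote(spread_evenly), seats_won(spread_evenly))
# {'A': 200, 'B': 300} {'B': 5}
print(popular_vote(concentrated), seats_won(concentrated))
# {'A': 200, 'B': 300} {'A': 3, 'B': 2}
```

With 40% of the popular vote in both scenarios, party A wins no seats when its supporters are spread evenly and a majority of seats when they are concentrated in three districts.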

artificial intelligence, elector, machine learning, (17 more...)

2006.11865

Country:

North America > United States (0.14)
Asia > India > West Bengal > Kharagpur (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Government > Voting & Elections (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Sarkar, Soumajyoti, Alvari, Hamidreza

Mitigating Bias in Online Microfinance Platforms: A Case Study on Kiva.org

arXiv.org Machine Learning, Jun-19-2020

Over the last couple of decades in the lending industry, financial disintermediation has occurred on a global scale. Traditionally, even for a small supply of funds, banks would act as the conduit between the funds and the borrowers. It is now possible to overcome some of the obstacles associated with such supply of funds with the advent of online platforms like Kiva, Prosper, and LendingClub. Kiva, for example, works with Micro Finance Institutions (MFIs) in developing countries to build Internet profiles of borrowers with a brief biography, loan requested, loan term, and purpose. Kiva, in particular, allows lenders to fund projects in different sectors through group or individual funding. Traditional research studies have investigated various factors behind lender preferences purely from the perspective of loan attributes, and only recently have some cross-country cultural preferences been investigated. In this paper, we investigate lender perceptions of economic factors of the borrower countries in relation to their preferences towards loans associated with different sectors. We find that the influence from economic factors and loan attributes can have substantially different roles to play for different sectors in achieving faster funding. We formally investigate and quantify the hidden biases prevalent in different loan sectors using recent tools from causal inference and regression models that rely on Bayesian variable selection methods. We then extend these models to incorporate fairness constraints based on our empirical analysis and find that such models can still achieve nearly comparable results with respect to baseline regression models.

borrower, funding time, regression model, (14 more...)

2006.12995

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > Arizona (0.04)
North America > United States > Indiana > Marion County > Indianapolis (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance > Loans (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Gupta, Kartik, Suggala, Arun Sai, Prasad, Adarsh, Netrapalli, Praneeth, Ravikumar, Pradeep

Learning Minimax Estimators via Online Learning

arXiv.org Machine Learning, Jun-19-2020

We consider the problem of designing minimax estimators for estimating the parameters of a probability distribution. Unlike classical approaches such as the MLE and minimum distance estimators, we consider an algorithmic approach for constructing such estimators. We view the problem of designing minimax estimators as finding a mixed strategy Nash equilibrium of a zero-sum game. By leveraging recent results in online learning with non-convex losses, we provide a general algorithm for finding a mixed-strategy Nash equilibrium of general non-convex non-concave zero-sum games. Our algorithm requires access to two subroutines: (a) one which outputs a Bayes estimator corresponding to a given prior probability distribution, and (b) one which computes the worst-case risk of any given estimator. Given access to these two subroutines, we show that our algorithm outputs both a minimax estimator and a least favorable prior. To demonstrate the power of this approach, we use it to construct provably minimax estimators for classical problems such as estimation in the finite Gaussian sequence model, and linear regression.

artificial intelligence, estimator, machine learning, (20 more...)

2006.1143

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
Asia > India (0.04)

Genre: Research Report (0.82)

Industry:

Education > Educational Setting > Online (0.60)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(2 more...)

Konstantinov, Andrei V., Utkin, Lev V.

Gradient boosting machine with partially randomized decision trees

arXiv.org Machine Learning, Jun-19-2020

The gradient boosting machine is a powerful ensemble-based machine learning method for solving regression problems. However, one difficulty in using it is a possible discontinuity of the regression function, which arises when regions of training data are not densely covered by training points. To overcome this difficulty and to reduce the computational complexity of the gradient boosting machine, we propose applying partially randomized trees, which can be regarded as a special case of extremely randomized trees applied to gradient boosting. The gradient boosting machine with partially randomized trees is illustrated by means of many numerical examples using synthetic and real data.
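A rough sketch of the general idea above: gradient boosting for squared loss on a single feature, where each weak learner is a depth-1 tree whose split threshold is drawn at random and whose leaf values are fitted to the current residuals. The abstract does not say exactly which parts of the tree are randomized in the paper, so "random threshold, fitted leaves" is an assumption of this sketch, as are all function names and hyperparameters.

```python
import random


def fit_random_stump(xs, residuals, rng):
    """Depth-1 tree: random threshold, leaf values = mean residual per side."""
    threshold = rng.uniform(min(xs), max(xs))
    left = [r for x, r in zip(xs, residuals) if x <= threshold]
    right = [r for x, r in zip(xs, residuals) if x > threshold]
    left_value = sum(left) / len(left) if left else 0.0
    right_value = sum(right) / len(right) if right else 0.0
    return threshold, left_value, right_value


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosting(xs, ys, n_rounds=200, learning_rate=0.1, seed=0):
    """Fit an additive model of randomized stumps to squared-loss residuals."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)  # constant initial model
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss the negative gradient is the plain residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_random_stump(xs, residuals, rng)
        stumps.append(stump)
        preds = [
            p + learning_rate * predict_stump(stump, x)
            for p, x in zip(preds, xs)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)
```

Drawing thresholds at random removes the search over split points in each round, and because many stumps with different thresholds are averaged, the fitted function has many small steps spread across the feature range rather than a few large jumps at the locations of training points.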

artificial intelligence, machine learning, randomized tree, (18 more...)

2006.11014

Country:

Asia > Russia (0.14)
North America > United States > California (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)