The support vector machine (SVM) is one of the most widely used classification methods. In this paper, we consider the soft-margin support vector machine applied to data points with independent features, where the sample size $n$ and the feature dimension $p$ grow to $\infty$ in a fixed ratio $p/n\rightarrow \delta$. We propose a set of equations that exactly characterizes the asymptotic behavior of the support vector machine. In particular, we give exact formulas for (1) the variability of the optimal coefficients, (2) the proportion of data points lying on the margin boundary (i.e., the number of support vectors), (3) the final objective function value, and (4) the expected misclassification error on new data points, which in particular implies an exact formula for the optimal tuning parameter given a data-generating mechanism. The global null case is considered first, where the label $y\in\{+1,-1\}$ is independent of the feature $x$. Then the signaled case is considered, where the label $y\in\{+1,-1\}$ is allowed to have a general dependence on the feature $x$ through a linear combination $a_0^Tx$. These results for the non-smooth hinge loss serve as an analogue to the recent results in \citet{sur2018modern} for the smooth logistic loss. Our approach is based on heuristic leave-one-out calculations.
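
For orientation, one standard way to write the soft-margin objective with hinge loss is sketched below. The coefficient vector $w$, intercept $b$, tuning parameter $\lambda$, and the $1/n$ normalization are assumptions made here for illustration; the paper may use a different but equivalent scaling.
\[
\min_{w\in\mathbb{R}^p,\; b\in\mathbb{R}} \; \frac{1}{n}\sum_{i=1}^{n}\bigl(1 - y_i\,(x_i^T w + b)\bigr)_+ \;+\; \frac{\lambda}{2}\,\|w\|_2^2, \qquad (t)_+ = \max(t, 0).
\]
Under this form, the four quantities listed above correspond to properties of the minimizer $(\hat w, \hat b)$ and of the minimized objective as $n, p \rightarrow \infty$ with $p/n\rightarrow \delta$.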

Feng, Guanhao, Polson, Nicholas, Wang, Yuexi, Xu, Jianeng

Sparse alpha-norm regularization has many data-rich applications in marketing and economics. Alpha-norm regularization, in contrast to lasso and ridge regularization, jumps to a sparse solution. This feature is attractive for ultra-high-dimensional problems that occur in demand estimation and forecasting. The alpha-norm objective is nonconvex and requires coordinate descent and proximal operators to find the sparse solution. We study a typical marketing demand forecasting problem, grocery store sales for salty snacks, that has many dummy variables as controls. The key predictors of demand include price, equivalized volume, promotion, flavor, scent, and brand effects. Compared with many commonly used machine learning methods, alpha-norm regularization achieves its goal of providing accurate out-of-sample estimates of the promotion lift effects. Finally, we conclude with directions for future research.
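
As a rough illustration of the nonconvex objective and the proximal operator mentioned above, a common formulation is given below. The symbols $X$, $y$, $\beta$, $\lambda$, and the squared-error loss are assumptions introduced here, not notation taken from the paper.
\[
\hat\beta = \arg\min_{\beta\in\mathbb{R}^p} \; \frac{1}{2}\,\|y - X\beta\|_2^2 + \lambda\sum_{j=1}^{p}|\beta_j|^{\alpha}, \qquad 0<\alpha<1,
\]
\[
\operatorname{prox}_{\lambda,\alpha}(z) = \arg\min_{b\in\mathbb{R}} \; \frac{1}{2}\,(z-b)^2 + \lambda\,|b|^{\alpha}.
\]
For $0<\alpha<1$ the scalar proximal map is exactly zero when $|z|$ is below a threshold and is discontinuous at that threshold, which is one way to read the statement that the estimator jumps to a sparse solution. Coordinate descent would apply this map to one coefficient at a time while holding the others fixed.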

Kolouri, Soheil, Martin, Charles E., Rohde, Gustavo K.

In this paper, we study generative modeling via autoencoders while using the elegant geometric properties of the optimal transport (OT) problem and the Wasserstein distances. We introduce Sliced-Wasserstein Autoencoders (SWAE), which are generative models that enable one to shape the distribution of the latent space into any samplable probability distribution without the need for training an adversarial network or defining a closed form for the distribution. In short, we regularize the autoencoder loss with the sliced-Wasserstein distance between the distribution of the encoded training samples and a predefined samplable distribution. We show that the proposed formulation has an efficient numerical solution that provides similar capabilities to Wasserstein Autoencoders (WAE) and Variational Autoencoders (VAE), while benefiting from an embarrassingly simple implementation.
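
The regularizer described above can be sketched in a few lines of NumPy. This is a minimal Monte Carlo estimate under stated assumptions, not the authors' implementation: the function name `sliced_wasserstein`, the number of projections, the exponent `p`, and the requirement that both sample sets have the same size are all choices made here for illustration.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=50, p=2, rng=None):
    """Monte Carlo estimate of the sliced-Wasserstein distance (to the power p)
    between two sample sets x and y of shape (n_samples, dim).
    Assumes x and y contain the same number of samples."""
    rng = np.random.default_rng(rng)
    dim = x.shape[1]
    # Draw random directions and normalize them onto the unit sphere.
    theta = rng.normal(size=(n_projections, dim))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sample sets onto every direction: shape (n_samples, n_projections).
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)
    # In one dimension, optimal transport between equal-sized samples
    # matches sorted values, so the cost is the mean sorted difference.
    return np.mean(np.abs(x_proj - y_proj) ** p)
```

In an autoencoder setting, `x` would hold the encoded training batch and `y` a batch of equal size drawn from the chosen prior, with the returned value added to the reconstruction loss using a weighting factor.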

Dong, Xiao, Wu, Jiasong, Zhou, Ling

The astonishing success of AlphaGo Zero~\cite{Silver_AlphaGo} has provoked a worldwide discussion of the future of human society, with a mixed mood of hope, anxiety, excitement, and fear. We try to demystify AlphaGo Zero through a qualitative analysis indicating that AlphaGo Zero can be understood as a specially structured GAN system that is expected to possess an inherently good convergence property. Thus we deduce that the success of AlphaGo Zero may not be a sign of a new generation of AI.