Goto

Collaborating Authors

 variance-based regularization


Variance-based Regularization with Convex Objectives

Neural Information Processing Systems

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds off of techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.


Reviews: Variance-based Regularization with Convex Objectives

Neural Information Processing Systems

The article "Variance-based Regularization with Convex Objectives" considers the problem of the risk minimization in a wide class of parametric models. The authors propose the convex surrogate for variance which allows for the near-optimal and computationally efficient estimation. The idea of the paper is to substitute the variance term in the upper bound for the risk of estimation with the convex surrogate based on the certain robust penalty. The authors prove that this surrogate provides good approximation to the variance term and prove certain upper bounds on the overall performance of the method. Moreover, the particular example is provided where the proposed method beats empirical risk minimization. The experimental part of the paper considers the comparison of empirical risk minimization with proposed robust method on 2 classification problems.


Variance-based Regularization with Convex Objectives

Hongseok Namkoong, John C. Duchi

Neural Information Processing Systems

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds off of techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.


Variance-based Regularization with Convex Objectives

Namkoong, Hongseok, Duchi, John C.

Neural Information Processing Systems

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds off of techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems. Papers published at the Neural Information Processing Systems Conference.


Empirical Hypothesis Space Reduction

Yabe, Akihiro, Maehara, Takanori

arXiv.org Machine Learning

Selecting appropriate regularization coefficients is critical to performance with respect to regularized empirical risk minimization problems. Existing theoretical approaches attempt to determine the coefficients in order for regularized empirical objectives to be upper-bounds of true objectives, uniformly over a hypothesis space. Such an approach is, however, known to be over-conservative, especially in high-dimensional settings with large hypothesis space. In fact, an existing generalization error bound in variance-based regularization is $O(\sqrt{d \log n/n})$, where $d$ is the dimension of hypothesis space, and thus the number of samples required for convergence linearly increases with respect to $d$. This paper proposes an algorithm that calculates regularization coefficient, one which results in faster convergence of generalization error $O(\sqrt{\log n/n})$ and whose leading term is independent of the dimension $d$. This faster convergence without dependence on the size of the hypothesis space is achieved by means of empirical hypothesis space reduction, which, with high probability, successfully reduces a hypothesis space without losing the true optimum solution. Calculation of uniform upper bounds over reduced spaces, then, enables acceleration of the convergence of generalization error.


Variance-based Regularization with Convex Objectives

Namkoong, Hongseok, Duchi, John C.

Neural Information Processing Systems

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds off of techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.