Collaborating Authors

Regularized deep learning with a non-convex penalty Machine Learning

Regularization methods are often employed in deep neural networks (DNNs) to prevent overfitting. For penalty-based DNN regularization, typically only convex penalties are considered because of their optimization guarantees. Recent theoretical work has shown that non-convex penalties satisfying certain regularity conditions are also guaranteed to perform well with standard optimization algorithms. In this paper, we examine new and currently existing non-convex penalties for DNN regularization. We provide theoretical justifications for the new penalties and also assess the performance of all penalties on DNN analysis of real datasets.

Corresponding author, address: 350 Community Drive, Manhasset, NY 11030.

Introduction. The success of DNNs in learning complex relationships between inputs and outputs may be attributed mainly to their multiple nonlinear hidden layers [1,2]. Such a large number of parameters gives the method an enormous amount of flexibility. On the downside, however, this may lead to overfitting the data, especially if the training sample is not large enough.
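This excerpt does not name the specific penalties the paper studies, so as a hedged illustration only, here is a sketch of one well-known non-convex penalty of the kind described, the minimax concave penalty (MCP); the function and parameter names (`lam`, `gamma`) are my own, not the paper's:

```python
import numpy as np

def mcp_penalty(w, lam=0.1, gamma=3.0):
    """Minimax concave penalty (MCP), summed over the entries of w.

    Near the origin it behaves like the convex L1 penalty lam*|w|;
    beyond |w| = gamma*lam it saturates at gamma*lam**2/2, so large
    weights are not shrunk further -- the source of the non-convexity.
    """
    a = np.abs(np.asarray(w, dtype=float))
    inner = lam * a - a ** 2 / (2.0 * gamma)
    outer = gamma * lam ** 2 / 2.0
    return float(np.sum(np.where(a <= gamma * lam, inner, outer)))

def penalized_loss(data_loss, weight_arrays, lam=0.1, gamma=3.0):
    # Training objective: empirical loss plus MCP over every weight array.
    return data_loss + sum(mcp_penalty(w, lam, gamma) for w in weight_arrays)
```

Because the penalty flattens out for large weights, it shrinks small (likely noise) weights the way L1 does while leaving large, informative weights nearly unbiased.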
This blog post will help you understand why regularization is important when training machine learning models, and why it is one of the most talked-about topics in the ML domain. So, let's look at this plot. What can we read from it? The x-axis is the training time in iterations and the y-axis is the loss on the training and test data. Can you notice anything wrong here?

Over-fitting and Regularization – Towards Data Science – Medium
In supervised machine learning, models are trained on a subset of the data, known as the training data. The goal is to predict the target of each training example from its inputs. Overfitting happens when the model learns the noise in the training data along with the signal, and therefore does not perform well on new data it wasn't trained on. In the example below, you can see underfitting in the first few steps and overfitting in the last few. There are a few ways to avoid overfitting your model on the training data, such as cross-validation sampling, reducing the number of features, pruning, and regularization. Regularization adds a penalty that grows as model complexity increases.
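To make the "penalty on complexity" idea concrete, here is a minimal NumPy sketch (synthetic data, sizes chosen arbitrarily by me) of the simplest such penalty, the L2 (ridge) penalty, compared against plain least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 30, 10                                  # small sample, many features
X = rng.normal(size=(n, d))
y = X[:, 0] + 0.1 * rng.normal(size=n)         # only feature 0 carries signal

# Unregularized least squares: minimizes the training loss alone.
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Ridge regression: adds the penalty alpha * ||w||^2 to the loss, which
# has a closed-form solution and shrinks the weights toward zero.
alpha = 5.0
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

shrinkage = np.linalg.norm(w_ridge) / np.linalg.norm(w_ols)  # always < 1
```

The ridge weights always have a smaller norm than the unpenalized ones, which is exactly the "penalty as model complexity increases" at work: larger `alpha` means stronger shrinkage and a simpler effective model.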

Hands-on with Feature Selection Techniques: Embedded Methods
Embedded methods complete the feature selection process within the construction of the machine learning algorithm itself. In other words, they perform feature selection during model training, which is why we call them embedded methods. The learning algorithm uses its own variable selection process, performing feature selection and classification/regression at the same time. Embedded methods address the issues we encountered with the filter and wrapper methods by combining their advantages. In this article, we'll explore a few specific methods that use embedded feature selection: regularization and tree-based methods.
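As a sketch of how regularization performs embedded selection, here is a small L1-penalized (lasso) regression solved by proximal gradient descent (ISTA), written from scratch so it is self-contained; the data and parameter choices are illustrative, not from the article:

```python
import numpy as np

def lasso_ista(X, y, lam=0.1, n_iter=500):
    """L1-penalized least squares via proximal gradient descent (ISTA).

    The soft-threshold step sets weak coefficients exactly to zero, so
    feature selection happens inside model fitting -- the embedded idea.
    """
    n, d = X.shape
    step = n / np.linalg.norm(X, ord=2) ** 2   # 1 / Lipschitz constant
    w = np.zeros(d)
    for _ in range(n_iter):
        w = w - step * (X.T @ (X @ w - y)) / n                    # gradient step
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)  # prox step
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.05 * rng.normal(size=100)
w = lasso_ista(X, y)
selected = np.flatnonzero(np.abs(w) > 1e-8)   # features the model kept
```

Training and selection happen in one pass: the six irrelevant features end up with exactly zero coefficients, with no separate filtering or wrapper search step.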

A Unified Framework for Constructing Nonconvex Regularizations Machine Learning

Over the past decades, many individual nonconvex methods have been proposed to achieve better sparse recovery performance in various scenarios. However, how to construct a valid nonconvex regularization function remains an open problem in practice. In this paper, we fill in this gap by presenting a unified framework for constructing nonconvex regularizations based on the probability density function. In addition, a new nonconvex sparse recovery method constructed via the Weibull distribution is studied. Sparse recovery has attracted tremendous research interest in various areas, including statistical learning [1] and compressive sensing [2]. The author is with the Department of Statistics, Zhejiang University City College, 310015, Hangzhou, China.
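The excerpt does not give the paper's actual construction, so the following is only one plausible illustration of turning a Weibull distribution into a penalty, not the paper's method: using the Weibull CDF as the penalty, which for shape `k <= 1` is concave in `|w|`, bounded, and zero at the origin — the profile typical of nonconvex sparsity regularizers.

```python
import numpy as np

def weibull_penalty(w, k=0.5, lam=1.0):
    """Illustrative nonconvex penalty built from the Weibull CDF:

        p(w) = 1 - exp(-(|w|/lam)**k)

    For shape k <= 1 this is concave on |w| >= 0, bounded by 1, and
    satisfies p(0) = 0, so small weights are penalized sharply while
    large weights incur an almost-constant cost.
    """
    a = np.abs(np.asarray(w, dtype=float))
    return 1.0 - np.exp(-((a / lam) ** k))

# Numerical concavity check on a grid: second differences should be <= 0.
t = np.linspace(0.01, 5.0, 200)
second_diff = np.diff(weibull_penalty(t), 2)
is_concave = bool(np.all(second_diff <= 1e-12))
```

This mirrors the general recipe the abstract describes — deriving the regularizer's shape from a probability distribution — with the distribution's shape parameter `k` controlling how aggressively the penalty promotes sparsity.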