correct loss function
Supervised learning with probabilistic morphisms and kernel mean embeddings
In this paper I propose a generative model of supervised learning that unifies two approaches to supervised learning, using a concept of a correct loss function. Addressing two measurability problems, which have been ignored in statistical learning theory, I propose to use convergence in outer probability to characterize the consistency of a learning algorithm. Building upon these results, I extend a result due to Cucker-Smale, which addresses the learnability of a regression model, to the setting of a conditional probability estimation problem. Additionally, I present a variant of Vapnik-Stefanyuk's regularization method for solving stochastic ill-posed problems, and use it to prove the generalizability of overparameterized supervised learning models.
Role of choosing the correct loss function
Readers of this blog already know what loss functions are in AI, but for people just starting in the field, let me define them again. A loss function is a mathematical function that a deep learning algorithm tries to minimize. Deep learning learns iteratively: at every step, the model computes a metric that tells it how far its predictions are from the true labels, and it updates its parameters based on that value. The metrics we minimize in this way are called loss functions. There are many well-known loss functions, such as mean squared error, categorical cross-entropy, Dice loss, and many more.
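To make the idea concrete, here is a minimal sketch in plain Python of the three losses named above, plus a toy training loop that minimizes mean squared error with gradient descent. The function names, the toy data, the learning rate `lr`, and the `smooth` term in the Dice loss are all assumptions chosen for illustration; real frameworks ship their own tuned versions of these losses.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of squared differences between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy for one sample: y_true is one-hot, y_pred is a probability vector."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def dice_loss(y_true, y_pred, smooth=1.0):
    """1 - Dice coefficient for flattened masks; smooth avoids division by zero."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    total = sum(y_true) + sum(y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (total + smooth)

# Toy data (hypothetical): the true relationship is y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.0    # single trainable parameter of the model y_hat = w * x
lr = 0.05  # learning rate, picked by hand for this example
losses = []

for step in range(100):
    preds = [w * x for x in xs]
    # The loss measures how far the predictions are from the labels.
    losses.append(mean_squared_error(ys, preds))
    # Gradient of the MSE with respect to w: mean of 2 * (pred - label) * x.
    grad = sum(2 * (p - t) * x for p, t, x in zip(preds, ys, xs)) / len(xs)
    # Move w in the direction that lowers the loss.
    w -= lr * grad

print(f"learned w = {w:.4f}, first loss = {losses[0]:.4f}, last loss = {losses[-1]:.2e}")
```

Each pass through the loop is one of the iterative steps described above: compute the loss, then adjust the parameter to reduce it. Swapping `mean_squared_error` for another loss changes what "close to the label" means, which is why the choice of loss matters.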