Adaptive Class Emergence Training: Enhancing Neural Network Stability and Generalization through Progressive Target Evolution
Recent advances in artificial intelligence, particularly deep neural networks, have pushed the boundaries of what is achievable in complex tasks. Traditional methods for training neural networks on classification problems often rely on static target outputs, such as one-hot encoded vectors, which can lead to unstable optimization and difficulty handling non-linearities in the data. In this paper, we propose a novel training methodology that progressively evolves the target outputs from a null vector to one-hot encoded vectors over the course of training. This gradual transition allows the network to adapt more smoothly to the increasing complexity of the classification task, maintaining an equilibrium state that reduces the risk of overfitting and enhances generalization. Our approach, inspired by concepts of structural equilibrium in finite element analysis, has been validated through extensive experiments on both synthetic and real-world datasets. The results demonstrate that our method achieves faster convergence, improved accuracy, and better generalization, especially in scenarios with high data complexity and noise. This progressive training framework offers a robust alternative to classical methods, opening new avenues for more efficient and stable neural network training.
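The target-evolution mechanism described in the abstract lends itself to a short illustration. The sketch below is a hypothetical reconstruction, not the authors' code: it ramps the targets linearly from the null vector to full one-hot vectors over training. The linear schedule, the sigmoid output, and the MSE loss are all assumptions, since the abstract does not specify them.

```python
# Hypothetical sketch of progressive target evolution: targets ramp
# linearly from the null vector (all zeros) to one-hot vectors over
# training. Schedule, sigmoid output, and MSE loss are assumptions.
import torch
import torch.nn.functional as F

def progressive_targets(labels, num_classes, epoch, total_epochs):
    """Scale one-hot targets by a factor that grows from 0 to 1."""
    alpha = min(1.0, epoch / total_epochs)            # linear ramp (assumed)
    one_hot = F.one_hot(labels, num_classes).float()
    return alpha * one_hot                            # null vector at epoch 0

def training_step(model, optimizer, x, labels, num_classes, epoch, total_epochs):
    optimizer.zero_grad()
    logits = model(x)
    targets = progressive_targets(labels, num_classes, epoch, total_epochs)
    # Regress sigmoid outputs onto the evolving targets; both live in [0, 1].
    loss = F.mse_loss(torch.sigmoid(logits), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```

With this setup, early epochs pull all outputs toward zero and the class targets "emerge" gradually as the scaling factor grows; MSE is used here rather than cross-entropy because a scaled one-hot vector is not a probability distribution.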
- Europe > France > Hauts-de-France > Oise > Compiègne (0.04)
- Africa > Middle East > Morocco (0.04)
Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem
Hein, Matthias; Andriushchenko, Maksym; Bitterwolf, Julian
Classifiers used in the wild, in particular in safety-critical systems, should not only have good generalization properties but should also know when they don't know; in particular, they should make low-confidence predictions far away from the training data. We show that ReLU-type neural networks, which yield piecewise linear classifier functions, fail in this regard, as they almost always produce high-confidence predictions far away from the training data. For bounded domains such as images, we propose a new robust optimization technique, similar to adversarial training, which enforces low-confidence predictions far away from the training data. We show that this technique is surprisingly effective at reducing the confidence of predictions far away from the training data while maintaining high-confidence predictions and a similar test error on the original classification task compared to standard training.
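The optimization described here pairs the standard classification loss with a penalty on the model's worst-case confidence far from the data. A minimal sketch of that objective follows, assuming uniform-noise images as the far-away inputs and a few PGD-style ascent steps to find the most confidently classified ones; the noise source, step count, step size, and loss weighting are illustrative choices, not taken from the paper.

```python
# Minimal sketch of the robust-optimization idea: cross-entropy on
# training data plus a term that suppresses confidence on worst-case
# points far from the data (here, uniform-noise images refined by a
# few PGD-style steps). Hyperparameters below are assumptions.
import torch
import torch.nn.functional as F

def max_confidence(model, x):
    """Highest softmax probability per sample (the quantity to suppress)."""
    return F.softmax(model(x), dim=1).max(dim=1).values

def pgd_high_confidence_noise(model, shape, device, steps=5, step_size=0.05):
    """Ascend on noise inputs to find points the model is most confident on."""
    x = torch.rand(shape, device=device)              # uniform noise in [0, 1]
    for _ in range(steps):
        x = x.detach().requires_grad_(True)
        conf = max_confidence(model, x).sum()
        grad, = torch.autograd.grad(conf, x)
        x = (x + step_size * grad.sign()).clamp(0.0, 1.0)
    return x.detach()

def training_step(model, optimizer, x_in, y_in, lam=1.0):
    optimizer.zero_grad()
    clean_loss = F.cross_entropy(model(x_in), y_in)
    x_out = pgd_high_confidence_noise(model, x_in.shape, x_in.device)
    # Penalize confident predictions on the adversarially chosen noise points.
    out_loss = max_confidence(model, x_out).mean()
    loss = clean_loss + lam * out_loss
    loss.backward()
    optimizer.step()
    return loss.item()
```

Searching for high-confidence noise points before penalizing them is what makes this adversarial-training-like: penalizing random noise alone would leave nearby high-confidence regions untouched.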
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Europe > Germany > Saarland (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)