Constrained Optimization to Train Neural Networks on Critical and Under-Represented Classes
Deep neural networks (DNNs) are notorious for making more mistakes on classes that have substantially fewer training samples than the others. Such class imbalance is ubiquitous in clinical applications and crucial to handle, because the classes with fewer samples most often correspond to critical cases (e.g., cancer) where misclassifications can have severe consequences. To avoid missing such cases, binary classifiers must be operated at high True Positive Rates (TPRs) by setting a higher threshold, but this comes at the cost of very high False Positive Rates (FPRs) for problems with class imbalance. Existing methods for learning under class imbalance most often do not take this into account. We argue that prediction accuracy should be improved by emphasizing the reduction of FPRs at high TPRs for problems where misclassification of the positive, i.e., critical, class samples is associated with higher cost. To this end, we pose the training of a DNN for binary classification as a constrained optimization problem and introduce a novel constraint that can be used with existing loss functions to enforce maximal area under the ROC curve (AUC) by prioritizing FPR reduction at high TPR. We solve the resulting constrained optimization problem using an Augmented Lagrangian method (ALM). Going beyond the binary case, we also propose two possible extensions of the proposed constraint for multi-class classification problems. We present experimental results for image-based binary and multi-class classification applications using an in-house medical imaging dataset, CIFAR10, and CIFAR100. Our results demonstrate that the proposed method improves on the baselines in the majority of cases, attaining higher accuracy on critical classes while reducing the misclassification rate for non-critical class samples.
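The abstract does not spell out the constraint itself, but the Augmented Lagrangian method it mentions can be illustrated on a toy problem. The sketch below is a generic ALM loop, not the paper's constraint or loss: it minimizes f(x) = x² subject to g(x) = x − 1 = 0 by alternating a gradient-descent inner minimization with a dual-variable update. All function names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def augmented_lagrangian(df, g, dg, x0, mu=10.0, lam=0.0,
                         outer_iters=20, inner_iters=200, lr=0.01):
    """Generic ALM for: min f(x) subject to g(x) = 0.

    Inner loop: gradient descent on the augmented Lagrangian
        L(x, lam) = f(x) + lam * g(x) + (mu / 2) * g(x)**2
    Outer loop: dual update lam <- lam + mu * g(x).
    """
    x = x0
    for _ in range(outer_iters):
        for _ in range(inner_iters):
            grad = df(x) + lam * dg(x) + mu * g(x) * dg(x)
            x -= lr * grad
        lam += mu * g(x)  # penalize remaining constraint violation
    return x, lam

# Toy problem: minimize x^2 subject to x - 1 = 0 (solution: x = 1)
x_opt, lam_opt = augmented_lagrangian(
    df=lambda x: 2 * x,       # gradient of f(x) = x^2
    g=lambda x: x - 1.0,      # equality constraint
    dg=lambda x: 1.0,         # gradient of the constraint
    x0=0.0)
```

In the paper's setting, f would be the classification loss and g the proposed AUC/TPR constraint; this toy version only shows the mechanics of the solver family.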
Reviews: The Effect of Network Width on the Performance of Large-batch Training
How to develop algorithms that allow large batches, so that neural networks can be trained in a distributed environment, has been widely discussed. The paper investigates the effect of network width on the performance of large-batch training, both theoretically and experimentally. The authors claim that, for the same number of parameters, a wide network architecture makes it easier to train neural networks with suitably large batches. Theoretical support is given for two-layer linear/nonlinear networks and multilayer linear networks. The paper is well-written and easy to follow.
Machine learning reveals hidden components of X-ray pulses
Ultrafast pulses from X-ray lasers reveal how atoms move at timescales of a femtosecond. However, measuring the properties of the pulses themselves is challenging. While determining a pulse's maximum strength, or 'amplitude,' is straightforward, the time at which the pulse reaches that maximum, or 'phase,' is often hidden. A new study trains neural networks to analyze the pulse and reveal these hidden sub-components. Physicists also call these sub-components 'real' and 'imaginary.' Starting from low-resolution measurements, the neural networks reveal finer details with each pulse, and they can analyze pulses millions of times faster than previous methods.
Exploring Neural Networks Visually in the Browser
While teaching myself the basics of neural networks, I found it hard to bridge the gap between the foundational theory and a practical "feeling" for how neural networks function at a fundamental level. I learned how pieces like gradient descent and different activation functions worked, and I played with building and training some networks in a Google Colab notebook. Modern toolkits like TensorFlow handle the full pipeline from data preparation to training to testing and everything else you can think of, all behind extremely high-level, well-documented APIs. The power of these tools is obvious. Anyone can load, run, and play with state-of-the-art deep learning architectures in GPU-accelerated Python notebooks instantly in the web browser.
Why Is It So Hard To Train Neural Networks?
Neural networks are hard to train. The deeper they get, the more likely they are to suffer from unstable gradients. Gradients can either explode or vanish, and neither is good for training. The vanishing gradient problem makes the network take far too long to train (learning becomes very slow or dies completely), while exploding gradients produce very large weight updates that destabilize learning. Although these problems are nearly inevitable, the choice of activation function can reduce their effects. Using ReLU activations in the early layers can help avoid vanishing gradients.
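The vanishing effect is easy to see numerically. Backpropagation multiplies one local derivative per layer, so with a saturating activation like the sigmoid (whose derivative never exceeds 0.25) the product shrinks geometrically with depth, while the ReLU derivative is exactly 1 for positive inputs. A minimal sketch, deliberately ignoring the weight matrices and keeping only the activation-derivative factors:

```python
import numpy as np

def sigmoid_deriv(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

def relu_deriv(x):
    return (x > 0).astype(float)  # exactly 1 for positive inputs

# Backprop multiplies one local derivative per layer; watch the product.
depth = 30
pre_activations = np.zeros(depth)  # sigmoid'(0) = 0.25, its maximum

sig_grad = np.prod(sigmoid_deriv(pre_activations))    # 0.25**30, vanishingly small
relu_grad = np.prod(relu_deriv(pre_activations + 1))  # stays exactly 1.0
```

Even in this best case for the sigmoid, thirty layers shrink the gradient by a factor of about 10^18; real networks also multiply by weight terms, which is where exploding gradients come from.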
La veille de la cybersécurité
Neural network subspaces contain diverse solutions that can be ensembled, approaching the ensemble performance of independently trained networks without the training cost. Researchers have consistently demonstrated the positive correlation between the depth of a neural network and its capability to achieve high-accuracy solutions. However, quantifying such an advantage still eludes researchers, as it remains unclear how many layers one needs to make a certain prediction accurately. So there is always a risk of adopting complicated DNN architectures that exhaust computational resources and make the whole training process a costly affair. Hence, removing neural network redundancy in whichever way possible has led researchers to probe the depths of neural network topology: subspaces. Interest in neural network subspaces has been prevalent for over a couple of decades now, but their significance has become more obvious with the increasing size of deep neural networks. Apple's machine learning team, in particular, recently showcased their work on neural network subspaces at this year's ICML conference.
How to train Neural Networks
In this post, I am going to describe a general blueprint that can be followed for any deep learning model. I will not go in-depth into deep learning concepts; rather, this serves as a set of basic steps for developing neural networks. Steps may be added or removed from the list below based on your requirements. The data we get for modeling is most of the time unstructured and raw, and much of it is not needed for our task. One of the first steps in modeling a neural network is weight initialization, and it is extremely important: if the weights are not initialized properly, converging to a minimum can become practically impossible, but if done the right way, optimization is achieved in the least time.
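As a sketch of why initialization scale matters: He initialization draws weights with variance 2/fan_in, which roughly preserves the scale of activations through ReLU layers instead of letting it shrink toward zero or blow up. The layer sizes and depth below are arbitrary assumptions for illustration, not from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

def he_init(fan_in, fan_out):
    # He initialization: zero-mean Gaussian with variance 2 / fan_in,
    # chosen so that ReLU layers keep the activation scale roughly constant.
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

x = rng.normal(size=(1000, 512))  # a batch of unit-scale inputs
for _ in range(10):               # push it through 10 ReLU layers
    W = he_init(512, 512)
    x = np.maximum(0.0, x @ W)

stable_std = x.std()  # stays on the order of 1 instead of collapsing
```

Swapping `np.sqrt(2.0 / fan_in)` for a fixed small constant like 0.01 makes the same loop drive activations toward zero within a few layers, which is exactly the convergence failure the paragraph warns about.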
Train Neural Networks Using a Genetic Algorithm in Python with PyGAD
The genetic algorithm (GA) is a biologically inspired optimization algorithm. It has gained importance in recent years because it is simple while still solving complex problems like travel route optimization, training machine learning algorithms, working with single- and multi-objective problems, game playing, and more. Deep neural networks are inspired by the idea of how the biological brain works. A neural network is a universal function approximator, capable of approximating any function, and is now used to solve the most complex problems in machine learning. What's more, neural networks are able to work with all types of data (images, audio, video, and text).
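PyGAD wraps these ideas behind its own API; the from-scratch sketch below shows only the underlying loop (selection, reproduction, mutation) evolving the two parameters of a single linear neuron, not PyGAD's actual interface. The target function, population size, and mutation scale are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Target: learn y = 2x + 1 with a single linear "neuron" whose genome is [w, b]
X = np.linspace(-1, 1, 50)
y = 2 * X + 1

def fitness(genome):
    w, b = genome
    return -np.mean((w * X + b - y) ** 2)  # negative MSE: higher is better

pop = rng.normal(size=(30, 2))  # population of candidate [w, b] genomes
for generation in range(100):
    scores = np.array([fitness(g) for g in pop])
    parents = pop[np.argsort(scores)[-10:]]           # selection: keep the top 10
    children = parents[rng.integers(0, 10, size=20)]  # clone parents (crossover stand-in)
    children = children + rng.normal(0.0, 0.1, size=children.shape)  # mutation
    pop = np.vstack([parents, children])  # elitism: parents survive unmutated

best = pop[np.argmax([fitness(g) for g in pop])]  # converges toward [2, 1]
```

Training a real network this way replaces the two-element genome with a flattened weight vector and the MSE with the network's loss; a proper GA would also recombine two parents per child instead of cloning one.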
Train neural networks using AMD GPUs and Keras
AMD is developing a new HPC platform called ROCm. Its ambition is to create a common, open-source environment, capable of interfacing with both Nvidia (using CUDA) and AMD GPUs (further information). This tutorial will explain how to set up a neural network environment using AMD GPUs in a single- or multiple-GPU configuration. On the software side, we will run Tensorflow v1.12.0 as a backend to Keras on top of the ROCm kernel, using Docker. Installing and deploying ROCm requires particular hardware/software configurations.