 robust deep learning


Adversarial Distributional Training for Robust Deep Learning

Neural Information Processing Systems

Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples. However, most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks. Besides, a single attack algorithm could be insufficient to explore the space of perturbations. In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models. ADT is formulated as a minimax optimization problem, where the inner maximization aims to learn an adversarial distribution to characterize the potential adversarial examples around a natural one under an entropic regularizer, and the outer minimization aims to train robust models by minimizing the expected loss over the worst-case adversarial distributions. Through a theoretical analysis, we develop a general algorithm for solving ADT, and present three approaches for parameterizing the adversarial distributions, ranging from the typical Gaussian distributions to the flexible implicit ones. Empirical results on several benchmarks validate the effectiveness of ADT compared with the state-of-the-art AT methods.
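The minimax problem described in this abstract can be written down roughly as follows. This is a sketch reconstructed from the wording of the abstract alone, and every symbol is an assumption made for illustration: $f_\theta$ is the model, $\mathcal{L}$ the training loss, $\delta_i$ a perturbation of the natural example $x_i$, $\mathcal{P}$ the set of admissible perturbation distributions, $\mathcal{H}$ the entropy, and $\lambda$ the weight of the entropic regularizer.

```latex
\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n}
  \max_{p(\delta_i) \in \mathcal{P}}
  \Big\{
    \mathbb{E}_{p(\delta_i)}\big[ \mathcal{L}\big(f_\theta(x_i + \delta_i),\, y_i\big) \big]
    + \lambda \, \mathcal{H}\big(p(\delta_i)\big)
  \Big\}
```

Under this reading, setting $\lambda = 0$ lets the inner maximizer collapse onto a single worst-case perturbation, which is the setting of standard adversarial training with one attack.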


Robust Deep Learning for Myocardial Scar Segmentation in Cardiac MRI with Noisy Labels

Moafi, Aida, Moafi, Danial, Mirkes, Evgeny M., McCann, Gerry P., Alatrany, Abbas S., Arnold, Jayanth R., Ghazi, Mostafa Mehdipour

arXiv.org Artificial Intelligence

The accurate segmentation of myocardial scars from cardiac MRI is essential for clinical assessment and treatment planning. In this study, we propose a robust deep-learning pipeline for fully automated myocardial scar detection and segmentation by fine-tuning state-of-the-art models. The method explicitly addresses challenges of label noise from semi-automatic annotations, data heterogeneity, and class imbalance through the use of Kullback-Leibler loss and extensive data augmentation. We evaluate the model's performance on both acute and chronic cases and demonstrate its ability to produce accurate and smooth segmentations despite noisy labels. In particular, our approach outperforms state-of-the-art models like nnU-Net and shows strong generalizability in an out-of-distribution test set, highlighting its robustness across various imaging conditions and clinical tasks. These results establish a reliable foundation for automated myocardial scar quantification and support the broader clinical adoption of deep learning in cardiac imaging.


Review for NeurIPS paper: Adversarial Distributional Training for Robust Deep Learning

Neural Information Processing Systems

Thank you for your submission to NeurIPS. After discussion, the reviewers are all in agreement that the proposed method does present an interesting and significant addition to the literature on adversarial training. The one criticism that the reviewers raised, that the method did not compare to the current state of the art in standard adversarial training, was well-addressed by the author response, and I'd strongly encourage them to include these results in the final version.


Review for NeurIPS paper: Adversarial Distributional Training for Robust Deep Learning

Neural Information Processing Systems

Additional Feedback: I thought the method was very cool! One thing that I thought the paper was doing (which turned out to be a misunderstanding, I think) is relaxing the L2 adversarial constraint a bit. This is more of an intuition (and did not affect my review in any way), but to some extent it seems like if what one cares about is L2-adversarial robustness, then maximizing the inner loss with PGD is in some sense going to be "optimal"/hard-to-beat (some results in the Madry et al. paper corroborate this: few-step PGD is pretty good at finding the best maxima we can find in general). On the other hand, what you have is a weaker adversary (the distributional one with an entropic regularizer), but it has the advantage of being a potentially structured way of enforcing a better constraint than L2 robustness. Again this isn't part of my review, but it would be cool to see if it is possible to define a new robustness constraint that is explicitly tailored to your learned adversary (e.g.


Robust deep learning from weakly dependent data

Kengne, William, Wade, Modou

arXiv.org Machine Learning

Recent developments in deep learning have established some theoretical properties of deep neural network estimators. However, most existing works on this topic are restricted to bounded loss functions or to (sub-)Gaussian or bounded input. This paper considers robust deep learning from weakly dependent observations, with unbounded loss function and unbounded input/output. It is only assumed that the output variable has a finite moment of order $r$, with $r > 1$. Non-asymptotic bounds for the expected excess risk of the deep neural network estimator are established under strong mixing and $\psi$-weak dependence assumptions on the observations. We derive a relationship between these bounds and $r$, and when the data have moments of any order (that is, $r=\infty$), the convergence rate is close to some well-known results. When the target predictor belongs to the class of H\"older smooth functions with sufficiently large smoothness index, the rate of the expected excess risk for exponentially strongly mixing data is close to or the same as that obtained with i.i.d. samples. Applications to robust nonparametric regression and robust nonparametric autoregression are considered. The simulation study for models with heavy-tailed errors shows that robust estimators with absolute loss and Huber loss function outperform the least squares method.


Robust Deep Learning from Crowds with Belief Propagation

Kim, Hoyoung, Cho, Seunghyuk, Kim, Dongwoo, Ok, Jungseul

arXiv.org Artificial Intelligence

Crowdsourcing systems enable us to collect noisy labels from crowd workers. A graphical model representing local dependencies between workers and tasks provides a principled way of reasoning over the true labels from the noisy answers. However, in many cases one needs a predictive model that works on unseen data and is learned directly from crowdsourced datasets instead of the true labels. To infer true labels and learn a predictive model simultaneously, we propose a new data-generating process, where a neural network generates the true labels from task features. We devise an EM framework alternating variational inference and deep learning to infer the true labels and to update the neural network, respectively. Experimental results with synthetic and real datasets show that a belief-propagation-based EM algorithm is robust to i) corruption in task features, ii) multi-modal or mismatched worker prior, and iii) a few spammers submitting noise to many tasks.


DeepMoM: Robust Deep Learning With Median-of-Means

Huang, Shih-Ting, Lederer, Johannes

arXiv.org Machine Learning

Data used in deep learning is notoriously problematic. For example, data are usually combined from diverse sources, rarely cleaned and vetted thoroughly, and sometimes corrupted on purpose. Intentional corruption that targets the weak spots of algorithms has been studied extensively under the label of "adversarial attacks." In contrast, the arguably much more common case of corruption that reflects the limited quality of data has been studied much less. Such "random" corruptions are due to measurement errors, unreliable sources, convenience sampling, and so forth. These kinds of corruption are common in deep learning, because data are rarely collected according to strict protocols -- in strong contrast to the formalized data collection in some parts of classical statistics. This paper concerns such corruption. We introduce an approach motivated by very recent insights into median-of-means and Le Cam's principle, we show that the approach can be readily implemented, and we demonstrate that it performs very well in practice. In conclusion, we believe that our approach is a very promising alternative to standard parameter training based on least-squares and cross-entropy loss.
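For readers unfamiliar with the median-of-means idea that this abstract builds on, a minimal sketch of the basic estimator follows. It is the textbook estimator for a mean, not the DeepMoM training procedure itself; the function name `median_of_means` and its parameters are hypothetical and chosen only for this illustration.

```python
import random
import statistics


def median_of_means(values, num_blocks, seed=0):
    """Robust mean estimate: split the data into blocks, average each block,
    and return the median of the block averages.

    Illustrative helper only (hypothetical name, not from the DeepMoM paper).
    """
    if not 1 <= num_blocks <= len(values):
        raise ValueError("num_blocks must be between 1 and len(values)")
    # Shuffle a copy so that block membership does not depend on input order.
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    # Strided slicing yields num_blocks blocks of nearly equal size.
    blocks = [shuffled[i::num_blocks] for i in range(num_blocks)]
    block_means = [statistics.fmean(block) for block in blocks]
    return statistics.median(block_means)


if __name__ == "__main__":
    # 99 clean observations and one grossly corrupted value.
    data = [1.0] * 99 + [1e6]
    print("plain mean:     ", statistics.fmean(data))
    print("median of means:", median_of_means(data, num_blocks=11))
```

A single corrupted value can spoil at most one block, so the median over blocks is unaffected as long as fewer than half of the blocks contain corrupted data. That is the intuition behind using the estimator when data are "rarely cleaned and vetted thoroughly".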


Risk Bounds for Robust Deep Learning

Lederer, Johannes

arXiv.org Artificial Intelligence

It has been observed that certain loss functions can render deep-learning pipelines robust against flaws in the data. In this paper, we support these empirical findings with statistical theory. We especially show that empirical-risk minimization with unbounded, Lipschitz-continuous loss functions, such as the least-absolute deviation loss, Huber loss, Cauchy loss, and Tukey's biweight loss, can provide efficient prediction under minimal assumptions on the data. More generally speaking, our paper provides theoretical evidence for the benefits of robust loss functions in deep learning.
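The four loss functions named in this abstract can be sketched as follows, using their common textbook definitions as functions of a residual `r`. The parameter names (`delta`, `c`) and default tuning constants are assumptions made for this illustration; the paper may use a different parameterization or scaling.

```python
import math


def absolute_loss(r):
    """Least-absolute deviation loss: |r|."""
    return abs(r)


def huber_loss(r, delta=1.0):
    """Quadratic for |r| <= delta, linear beyond it."""
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


def cauchy_loss(r, c=1.0):
    """Grows only logarithmically in |r| for large residuals."""
    return 0.5 * c * c * math.log(1.0 + (r / c) ** 2)


def tukey_biweight_loss(r, c=4.685):
    """Tukey's biweight: constant (c^2 / 6) once |r| exceeds c."""
    if abs(r) <= c:
        return (c * c / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3)
    return c * c / 6.0


if __name__ == "__main__":
    # Compare how each loss treats a small residual and a gross outlier.
    for r in (0.5, 100.0):
        print(
            f"r={r}: squared={0.5 * r * r:.3f}, abs={absolute_loss(r):.3f}, "
            f"huber={huber_loss(r):.3f}, cauchy={cauchy_loss(r):.3f}, "
            f"tukey={tukey_biweight_loss(r):.3f}"
        )
```

For the small residual the Huber loss coincides with the squared loss, while for the outlier every robust loss is far smaller than the squared loss, which is the mechanism by which such losses limit the influence of flawed data points.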


Meta Transition Adaptation for Robust Deep Learning with Noisy Labels

Shu, Jun, Zhao, Qian, Xu, Zongben, Meng, Deyu

arXiv.org Machine Learning

To discover intrinsic inter-class transition probabilities underlying data, learning with noise transition has become an important approach for robust deep learning on corrupted labels. Prior methods attempt to obtain such transition knowledge either by pre-assuming strongly confident anchor points that belong to a specific class with probability 1, which is generally infeasible in practice, or by jointly estimating the transition matrix and learning the classifier directly from the noisy samples, which always leads to inaccurate estimation misguided by wrong annotations, especially in large-noise cases. To alleviate these issues, this study proposes a new meta-transition-learning strategy for the task. Specifically, through the sound guidance of a small set of meta data with clean labels, the noise transition matrix and the classifier parameters can be mutually ameliorated to avoid being trapped by noisy training samples, without the need for any anchor point assumptions. Besides, we prove that our method has a statistical consistency guarantee on correctly estimating the desired transition matrix. Extensive synthetic and real experiments validate that our method extracts the transition matrix more accurately, which naturally leads to more robust performance than prior art. Its essential relationship with label distribution learning is also discussed, which explains its fine performance even under no-noise scenarios.
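To make the role of a noise transition matrix concrete, the sketch below shows how a given matrix maps a classifier's clean-class probabilities to probabilities over noisy labels, and how a loss on the noisy label can then be computed. This is the generic "forward correction" idea with a known matrix, not the meta-learning strategy proposed in the paper; the function names and the convention `transition[i][j] = P(noisy label j | true label i)` are assumptions made for illustration.

```python
import math


def noisy_posterior(clean_probs, transition):
    """Map clean-class probabilities to noisy-label probabilities.

    Assumed convention: transition[i][j] = P(noisy label j | true label i),
    so every row of `transition` sums to 1.
    """
    num_classes = len(clean_probs)
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(num_classes))
        for j in range(num_classes)
    ]


def forward_corrected_nll(clean_probs, transition, noisy_label):
    """Negative log-likelihood of the observed noisy label under the
    transition-adjusted prediction."""
    return -math.log(noisy_posterior(clean_probs, transition)[noisy_label])


if __name__ == "__main__":
    clean_probs = [0.9, 0.1]      # classifier output for the true class
    transition = [[0.8, 0.2],     # true class 0 is flipped to 1 with prob. 0.2
                  [0.3, 0.7]]     # true class 1 is flipped to 0 with prob. 0.3
    print(noisy_posterior(clean_probs, transition))
    print(forward_corrected_nll(clean_probs, transition, noisy_label=1))
```

With the identity matrix as `transition`, the noisy posterior equals the clean prediction and the loss reduces to ordinary cross-entropy, so the quality of the estimated matrix directly determines how well the classifier is corrected.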