ssim value
Trustworthiness of Stochastic Gradient Descent in Distributed Learning
Li, Hongyang, Wu, Caesar, Chadli, Mohammed, Mammar, Said, Bouvry, Pascal
DL is the method used to accelerate the training of deep learning models by distributing training tasks to multiple computing nodes [1]. However, as data scales continue to grow, the complexity of model gradients increases accordingly, for example, consider the training of deep learning on ImageNet [2], which contains over 14 million labeled images and topics with approximately 22,000 categories, leading to constraints on communication efficiency [3]. Gradient compression aimed at reducing communication overhead during gradient transmission between multiple nodes which enhances system computational efficiency [4, 5, 6], thus this has emerged as an effective optimization technique in distributed learning, especially when training complex models to process large-scale data. Among various gradient compression techniques, PowerSGD [6] and Top-K SGD [7] have emerged as prominent solutions for their ability to substantially reduce communication costs while preserving scalability and model accuracy in large-scale distributed learning. These two algorithms are particularly suitable for our study as they represent fundamental approaches to gradient compression: PowerSGD uses low-rank approximation, while TopKSGD leverages sparsification through threshold quantization. Both techniques are widely recognized for their practical effectiveness, especially when combined, to varying extents, with advanced features such as error feedback, warm start, all-reduce, making them ideal candidates of compressed SGD for assessing privacy risks in distributed deep learning systems.
Visual Episodic Memory-based Exploration
Vice, Jack, Ruiz-Sanchez, Natalie, Douglas, Pamela K., Sukthankar, Gita
In humans, intrinsic motivation is an important mechanism for open-ended cognitive development; in robots, it has been shown to be valuable for exploration. An important aspect of human cognitive development is $\textit{episodic memory}$ which enables both the recollection of events from the past and the projection of subjective future. This paper explores the use of visual episodic memory as a source of intrinsic motivation for robotic exploration problems. Using a convolutional recurrent neural network autoencoder, the agent learns an efficient representation for spatiotemporal features such that accurate sequence prediction can only happen once spatiotemporal features have been learned. Structural similarity between ground truth and autoencoder generated images is used as an intrinsic motivation signal to guide exploration. Our proposed episodic memory model also implicitly accounts for the agent's actions, motivating the robot to seek new interactive experiences rather than just areas that are visually dissimilar. When guiding robotic exploration, our proposed method outperforms the Curiosity-driven Variational Autoencoder (CVAE) at finding dynamic anomalies.
Nonconvex Robust High-Order Tensor Completion Using Randomized Low-Rank Approximation
Qin, Wenjin, Wang, Hailin, Zhang, Feng, Ma, Weijun, Wang, Jianjun, Huang, Tingwen
Within the tensor singular value decomposition (T-SVD) framework, existing robust low-rank tensor completion approaches have made great achievements in various areas of science and engineering. Nevertheless, these methods involve the T-SVD based low-rank approximation, which suffers from high computational costs when dealing with large-scale tensor data. Moreover, most of them are only applicable to third-order tensors. Against these issues, in this article, two efficient low-rank tensor approximation approaches fusing randomized techniques are first devised under the order-d (d >= 3) T-SVD framework. On this basis, we then further investigate the robust high-order tensor completion (RHTC) problem, in which a double nonconvex model along with its corresponding fast optimization algorithms with convergence guarantees are developed. To the best of our knowledge, this is the first study to incorporate the randomized low-rank approximation into the RHTC problem. Empirical studies on large-scale synthetic and real tensor data illustrate that the proposed method outperforms other state-of-the-art approaches in terms of both computational efficiency and estimated precision.
Generative Adversarial Network for Personalized Art Therapy in Melanoma Disease Management
Jรผtte, Lennart, Wang, Ning, Roth, Bernhard
Melanoma is the most lethal type of skin cancer. Patients are vulnerable to mental health illnesses which can reduce the effectiveness of the cancer treatment and the patients adherence to drug plans. It is crucial to preserve the mental health of patients while they are receiving treatment. However, current art therapy approaches are not personal and unique to the patient. We aim to provide a well-trained image style transfer model that can quickly generate unique art from personal dermoscopic melanoma images as an additional tool for art therapy in disease management of melanoma. Visual art appreciation as a common form of art therapy in disease management that measurably reduces the degree of psychological distress. We developed a network based on the cycle-consistent generative adversarial network for style transfer that generates personalized and unique artworks from dermoscopic melanoma images. We developed a model that converts melanoma images into unique flower-themed artworks that relate to the shape of the lesion and are therefore personal to the patient. Further, we altered the initial framework and made comparisons and evaluations of the results. With this, we increased the options in the toolbox for art therapy in disease management of melanoma. The development of an easy-to-use user interface ensures the availability of the approach to stakeholders. The transformation of melanoma into flower-themed artworks is achieved by the proposed model and the graphical user interface. This contribution opens a new field of GANs in art therapy and could lead to more personalized disease management.
Automated SSIM Regression for Detection and Quantification of Motion Artefacts in Brain MR Images
Sciarra, Alessandro, Chatterjee, Soumick, Dรผnnwald, Max, Placidi, Giuseppe, Nรผrnberger, Andreas, Speck, Oliver, Oeltze-Jafra, Steffen
Motion artefacts in magnetic resonance brain images can have a strong impact on diagnostic confidence. The assessment of MR image quality is fundamental before proceeding with the clinical diagnosis. Motion artefacts can alter the delineation of structures such as the brain, lesions or tumours and may require a repeat scan. Otherwise, an inaccurate (e.g. correct pathology but wrong severity) or incorrect diagnosis (e.g. wrong pathology) may occur. "\textit{Image quality assessment}" as a fast, automated step right after scanning can assist in deciding if the acquired images are diagnostically sufficient. An automated image quality assessment based on the structural similarity index (SSIM) regression through a residual neural network is proposed in this work. Additionally, a classification into different groups - by subdividing with SSIM ranges - is evaluated. Importantly, this method predicts SSIM values of an input image in the absence of a reference ground truth image. The networks were able to detect motion artefacts, and the best performance for the regression and classification task has always been achieved with ResNet-18 with contrast augmentation. The mean and standard deviation of residuals' distribution were $\mu=-0.0009$ and $\sigma=0.0139$, respectively. Whilst for the classification task in 3, 5 and 10 classes, the best accuracies were 97, 95 and 89\%, respectively. The results show that the proposed method could be a tool for supporting neuro-radiologists and radiographers in evaluating image quality quickly.
Conditional Progressive Generative Adversarial Network for satellite image generation
Cardoso, Renato, Vallecorsa, Sofia, Nemni, Edoardo
Image generation and image completion are rapidly evolving fields, thanks to machine learning algorithms that are able to realistically replace missing pixels. However, generating large high resolution images, with a large level of details, presents important computational challenges. In this work, we formulate the image generation task as completion of an image where one out of three corners is missing. We then extend this approach to iteratively build larger images with the same level of detail. Our goal is to obtain a scalable methodology to generate high resolution samples typically found in satellite imagery data sets. We introduce a conditional progressive Generative Adversarial Networks (GAN), that generates the missing tile in an image, using as input three initial adjacent tiles encoded in a latent vector by a Wasserstein auto-encoder. We focus on a set of images used by the United Nations Satellite Centre (UNOSAT) to train flood detection tools, and validate the quality of synthetic images in a realistic setup.
SSIM-Variation-Based Complexity Optimization for Versatile Video Coding
Lin, Jielian, Lin, Hongbin, Zhang, Zhichen, Xu, Yiwen, Zhao, Tiesong
To date, Versatile Video Coding (VVC) has a more magnificent overall performance than High Efficiency Video Coding (HEVC). The Quadtree with Nested Multi-Type Tree (QTMT) coding block structure can substantially enhance video coding quality in VVC. However, the coding gain also leads to a greater coding complexity. Therefore, this letter proposes a Fast Decision Scheme Based on Structural Similarity Index Metric Variation (FDS-SSIMV) to solve this problem. Firstly, the Structural Similarity Index Metric Variation (SSIMV) characteristic among the sub coding units of the spit mode is illustrated. Next, to evaluate the SSIMV value, SSIMV measure strategies are designed for different split modes in this letter. Then, the desired split modes are selected by the SSIMV values. Experimental results show that the proposed method achieves 64.74\% average encoding Time Saving (TS) with a 2.79\% Bj$\varnothing$ntegaard Delta Bit Rate (BDBR), outperforming the benchmarks.
Perceptually Constrained Adversarial Attacks
Hameed, Muhammad Zaid, Gyorgy, Andras
Motivated by previous observations that the usually applied $L_p$ norms ($p=1,2,\infty$) do not capture the perceptual quality of adversarial examples in image classification, we propose to replace these norms with the structural similarity index (SSIM) measure, which was developed originally to measure the perceptual similarity of images. Through extensive experiments with adversarially trained classifiers for MNIST and CIFAR-10, we demonstrate that our SSIM-constrained adversarial attacks can break state-of-the-art adversarially trained classifiers and achieve similar or larger success rate than the elastic net attack, while consistently providing adversarial images of better perceptual quality. Utilizing SSIM to automatically identify and disallow adversarial images of low quality, we evaluate the performance of several defense schemes in a perceptually much more meaningful way than was done previously in the literature.