AITopics | degradation problem

degradation problem

Masked Image Residual Learning for Scaling Deeper Vision Transformers

Neural Information Processing Systems | Dec-26-2025, 14:14:09 GMT

Deeper Vision Transformers (ViTs) are more challenging to train. We expose a degradation problem in deeper layers of ViT when using masked image modeling (MIM) for pre-training. To ease the training of deeper ViTs, we introduce a self-supervised learning framework called Masked Image Residual Learning (MIRL), which significantly alleviates the degradation problem, making scaling ViT along depth a promising direction for performance upgrade. We reformulate the pre-training objective for deeper layers of ViT as learning to recover the residual of the masked image. We provide extensive empirical evidence showing that deeper ViTs can be effectively optimized using MIRL and easily gain accuracy from increased depth. With the same level of computational complexity as ViT-Base and ViT-Large, we instantiate 4.5× and 2× deeper ViTs, dubbed ViT-S-54 and ViT-B-48. The deeper ViT-S-54, costing 3× less than ViT-Large, achieves performance on par with ViT-Large. ViT-B-48 achieves 86.2% top-1 accuracy on ImageNet. On one hand, deeper ViTs pre-trained with MIRL exhibit excellent generalization capabilities on downstream tasks, such as object detection and semantic segmentation. On the other hand, MIRL demonstrates high pre-training efficiency. With less pre-training time, MIRL yields competitive performance compared to other approaches.
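As a rough illustration of the residual objective mentioned in the abstract, the toy sketch below assumes (the abstract does not spell this out) that shallower layers produce a main reconstruction of the masked pixels and that deeper layers are trained to predict what is left over, i.e. the target minus the main reconstruction. All names and numbers are hypothetical, and plain Python lists stand in for image patches.

```python
# Toy sketch of a residual reconstruction target (assumed formulation,
# not the paper's actual implementation).

def residual_target(target, main_recon):
    """What the deeper segment is asked to predict: target minus main reconstruction."""
    return [t - m for t, m in zip(target, main_recon)]

def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# Hypothetical pixel values for one masked patch.
target = [0.8, 0.2, 0.5, 0.9]
# Hypothetical output of the shallower segment.
main_recon = [0.7, 0.25, 0.5, 0.6]
# Hypothetical output of the deeper segment.
residual_pred = [0.1, -0.05, 0.0, 0.25]

res_target = residual_target(target, main_recon)
final_recon = [m + r for m, r in zip(main_recon, residual_pred)]

loss_main = mse(main_recon, target)         # error of the shallow segment alone
loss_residual = mse(residual_pred, res_target)  # error of the deep segment on its own target
loss_final = mse(final_recon, target)       # error of the summed reconstruction
```

Under this assumption the deeper segment's error against the residual target equals the error of the summed reconstruction against the original pixels, so a good residual prediction directly improves the final output.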

deeper vision transformer, masked image residual learning, scaling deeper vision transformer, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Masked Image Residual Learning for Scaling Deeper Vision Transformers

Neural Information Processing Systems | Oct-9-2025, 05:16:32 GMT

Deeper Vision Transformers (ViTs) are more challenging to train.

arxiv preprint arxiv, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Masked Image Residual Learning for Scaling Deeper Vision Transformers

Neural Information Processing Systems | Jan-19-2025, 19:57:55 GMT

Deeper Vision Transformers (ViTs) are more challenging to train. We expose a degradation problem in deeper layers of ViT when using masked image modeling (MIM) for pre-training. To ease the training of deeper ViTs, we introduce a self-supervised learning framework called Masked Image Residual Learning (MIRL), which significantly alleviates the degradation problem, making scaling ViT along depth a promising direction for performance upgrade. We reformulate the pre-training objective for deeper layers of ViT as learning to recover the residual of the masked image. We provide extensive empirical evidence showing that deeper ViTs can be effectively optimized using MIRL and easily gain accuracy from increased depth. With the same level of computational complexity as ViT-Base and ViT-Large, we instantiate 4.5× and 2× deeper ViTs, dubbed ViT-S-54 and ViT-B-48. The deeper ViT-S-54, costing 3× less than ViT-Large, achieves performance on par with ViT-Large. ViT-B-48 achieves 86.2% top-1 accuracy on ImageNet. On one hand, deeper ViTs pre-trained with MIRL exhibit excellent generalization capabilities on downstream tasks, such as object detection and semantic segmentation.

deeper vision transformer, masked image residual learning, scaling deeper vision transformer, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.98)

SSHPool: The Separated Subgraph-based Hierarchical Pooling

Xu, Zhuo, Cui, Lixin, Wang, Yue, Du, Hangyuan, Bai, Lu, Hancock, Edwin R.

arXiv.org Artificial Intelligence | Mar-24-2024

In this paper, we develop a novel local graph pooling method, namely the Separated Subgraph-based Hierarchical Pooling (SSHPool), for graph classification. To this end, we commence by assigning the nodes of a sample graph into different clusters, resulting in a family of separated subgraphs. We individually employ a local graph convolution unit as the local structure to further compress each subgraph into a coarsened node, transforming the original graph into a coarsened graph. Since these subgraphs are separated by different clusters and the structural information cannot be propagated between them, the local convolution operation can significantly avoid the over-smoothing problem arising in most existing Graph Neural Networks (GNNs). By hierarchically performing the proposed procedures on the resulting coarsened graph, the proposed SSHPool can effectively extract the hierarchical global feature of the original graph structure, encapsulating rich intrinsic structural characteristics. Furthermore, we develop an end-to-end GNN framework associated with the proposed SSHPool module for graph classification. Experimental results demonstrate the superior performance of the proposed model on real-world datasets, significantly outperforming state-of-the-art GNN methods in terms of classification accuracy.
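The cluster, separate, and compress steps described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' method: mean pooling stands in for the local graph convolution unit, coarsened nodes are linked whenever any original edge crossed the two clusters, and the function name, cluster assignment, and scalar features are all hypothetical.

```python
from collections import defaultdict

def separate_and_coarsen(edges, features, cluster_of):
    """One pooling level: split the graph by cluster, then compress each cluster.

    edges      -- list of (u, v) node pairs
    features   -- dict mapping node -> scalar feature
    cluster_of -- dict mapping node -> cluster id (assumed to be given)
    """
    # Keep only edges inside a cluster, so no information flows between subgraphs.
    intra_edges = [(u, v) for u, v in edges if cluster_of[u] == cluster_of[v]]

    # Group nodes by cluster.
    members = defaultdict(list)
    for node, cluster in cluster_of.items():
        members[cluster].append(node)

    # Compress each subgraph into one coarsened node.
    # Assumption: a plain mean replaces the paper's local graph convolution unit.
    coarse_features = {
        cluster: sum(features[n] for n in nodes) / len(nodes)
        for cluster, nodes in members.items()
    }

    # Assumption: coarsened nodes are connected if any original edge crossed clusters.
    coarse_edges = {
        tuple(sorted((cluster_of[u], cluster_of[v])))
        for u, v in edges
        if cluster_of[u] != cluster_of[v]
    }
    return intra_edges, coarse_features, coarse_edges

# A five-node path graph split into two hypothetical clusters.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
features = {0: 1.0, 1: 2.0, 2: 3.0, 3: 10.0, 4: 20.0}
cluster_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1}

intra_edges, coarse_features, coarse_edges = separate_and_coarsen(
    edges, features, cluster_of
)
```

Applying the same function again to the coarsened graph, with a new cluster assignment, would give the hierarchical behaviour the abstract describes.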

graph, node, sshpool, (17 more...)

arXiv.org Artificial Intelligence

2403.16133

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > United Kingdom > England > North Yorkshire > York (0.04)
Asia > China > Shanxi Province > Taiyuan (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Multi-Level Firing with Spiking DS-ResNet: Enabling Better and Deeper Directly-Trained Spiking Neural Networks

Feng, Lang, Liu, Qianhui, Tang, Huajin, Ma, De, Pan, Gang

arXiv.org Artificial Intelligence | Apr-19-2023

Spiking neural networks (SNNs) are bio-inspired neural networks with asynchronous, discrete, and sparse characteristics, which have increasingly manifested their superiority in low energy consumption. Recent research is devoted to utilizing spatio-temporal information to directly train SNNs by backpropagation. However, the binary and non-differentiable properties of spike activities force directly trained SNNs to suffer from serious gradient vanishing and network degradation, which greatly limits the performance of directly trained SNNs and prevents them from going deeper. In this paper, we propose a multi-level firing (MLF) method based on the existing spatio-temporal back propagation (STBP) method, and a spiking dormant-suppressed residual network (spiking DS-ResNet). MLF enables more efficient gradient propagation and the incremental expression ability of the neurons. Spiking DS-ResNet can efficiently perform identity mapping of discrete spikes, as well as provide a more suitable connection for gradient propagation in deep SNNs. With the proposed method, our model achieves superior performance on a non-neuromorphic dataset and two neuromorphic datasets with much fewer trainable parameters and demonstrates the great ability to combat the gradient vanishing and degradation problems in deep SNNs.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2210.06386

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.82)

Industry: Energy (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Advancing Spiking Neural Networks towards Deep Residual Learning

Hu, Yifan, Deng, Lei, Wu, Yujie, Yao, Man, Li, Guoqi

arXiv.org Artificial Intelligence | Mar-10-2023

Despite the rapid progress of neuromorphic computing, inadequate capacity and insufficient representation power of spiking neural networks (SNNs) severely restrict their application scope in practice. Residual learning and shortcuts have been evidenced as an important approach for training deep neural networks, but rarely did previous work assess their applicability to the characteristics of spike-based communication and spatiotemporal dynamics. In this paper, we first identify that this negligence leads to impeded information flow and the accompanying degradation problem in previous residual SNNs. To address this issue, we propose a novel SNN-oriented residual architecture termed MS-ResNet, which establishes membrane-based shortcut pathways, and further prove that the gradient norm equality can be achieved in MS-ResNet by introducing block dynamical isometry theory, which ensures the network can be well-behaved in a depth-insensitive way. Thus we are able to significantly extend the depth of directly trained SNNs, e.g., up to 482 layers on CIFAR-10 and 104 layers on ImageNet, without observing even a slight degradation problem. To validate the effectiveness of MS-ResNet, experiments on both frame-based and neuromorphic datasets are conducted. MS-ResNet104 achieves a superior result of 76.02% accuracy on ImageNet, which is, to the best of our knowledge, the highest in the domain of directly trained SNNs. Great energy efficiency is also observed, with an average of only one spike per neuron needed to classify an input sample. We believe our powerful and scalable models will provide strong support for further exploration of SNNs.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

2112.08954

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > Austria > Styria > Graz (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Computer Vision With ResNet

#artificialintelligence | Feb-10-2023, 21:13:31 GMT

In December of 2015, a paper was published that rocked the deep learning world. This paper is widely regarded as one of the most influential papers in modern deep learning and has been cited over 110,000 times. The prevailing wisdom of the time suggested adding more layers to neural networks would lead to better results. But researchers observed that the accuracy of deep networks would increase up to a saturation point before levelling off. In addition to that, an unusual phenomenon was observed: when layers were added to an already deep network, the training error would actually increase.
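The standard response to this phenomenon in ResNet is the residual block, which computes y = x + F(x) so that a block can fall back to the identity when its branch F outputs zero. The sketch below illustrates only that property, using plain Python lists in place of tensors; the function names and the all-zero branch are hypothetical stand-ins, not code from the article.

```python
def residual_block(x, branch):
    """Return x + F(x), where `branch` plays the role of F."""
    return [xi + fi for xi, fi in zip(x, branch(x))]

def zero_branch(x):
    """A branch that has learned nothing: F(x) = 0 for every input."""
    return [0.0 for _ in x]

def stack(x, branches):
    """Apply a sequence of residual blocks one after another."""
    for branch in branches:
        x = residual_block(x, branch)
    return x

x = [1.0, -2.0, 0.5]

# Ten extra blocks whose branches output zero leave the signal untouched,
# so in principle the deeper stack can always match the shallower one.
deep_output = stack(x, [zero_branch] * 10)
```

Because the extra blocks reduce to the identity here, added depth does not have to raise the training error, which is the intuition behind residual learning.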

computer vision, gradient, neural network, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Probabilistic Diagnostic Tests for Degradation Problems in Supervised Learning

Valencia-Zapata, Gustavo A., Ersoy, Okan, Gonzalez-Canas, Carolina, Zentner, Michael G., Klimeck, Gerhard

arXiv.org Artificial Intelligence | Apr-6-2020

Several studies point out different causes of performance degradation in supervised machine learning. Problems such as class imbalance, overlapping, small-disjuncts, noisy labels, and sparseness limit accuracy in classification algorithms. Even though a number of approaches either in the form of a methodology or an algorithm try to minimize performance degradation, they have been isolated efforts with limited scope. Most of these approaches focus on remediation of one among many problems, with experimental results coming from few datasets and classification algorithms, insufficient measures of prediction power, and lack of statistical validation for testing the real benefit of the proposed approach. This paper consists of two main parts: In the first part, a novel probabilistic diagnostic model based on identifying signs and symptoms of each problem is presented. Thereby, early and correct diagnosis of these problems is to be achieved in order to select not only the most convenient remediation treatment but also unbiased performance metrics. Secondly, the behavior and performance of several supervised algorithms are studied when training sets have such problems. Therefore, prediction of success for treatments can be estimated across classifiers.

dataset, degradation problem, subclass, (13 more...)

arXiv.org Artificial Intelligence

2004.02988

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
(7 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Health & Medicine > Health Care Technology (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
(4 more...)

Common architectures in convolutional neural networks.

#artificialintelligence | Apr-24-2018, 02:27:09 GMT

In this post, I'll discuss commonly used architectures for convolutional networks. As you'll see, almost all CNN architectures follow the same general design principles of successively applying convolutional layers to the input, periodically downsampling the spatial dimensions while increasing the number of feature maps. While the classic network architectures consisted simply of stacked convolutional layers, modern architectures explore new ways of constructing convolutional layers that allow for more efficient learning. Almost all of these architectures are based on a repeatable unit which is used throughout the network. These architectures serve as general design guidelines which machine learning practitioners will then adapt to solve various computer vision tasks.
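The design principle in this excerpt, shrinking the spatial dimensions while growing the number of feature maps, can be traced with simple shape arithmetic. The sketch below assumes each stage halves height and width and doubles the channel count, a common choice but not one stated in the excerpt; the function name and the starting shape are hypothetical.

```python
def stage_shapes(height, width, channels, num_stages):
    """Track (height, width, channels) through successive downsampling stages.

    Assumption: every stage halves the spatial size and doubles the channels.
    """
    shapes = [(height, width, channels)]
    for _ in range(num_stages):
        height, width, channels = height // 2, width // 2, channels * 2
        shapes.append((height, width, channels))
    return shapes

# Hypothetical 224x224 feature map with 64 channels, followed by four stages.
shapes = stage_shapes(224, 224, 64, 4)
for h, w, c in shapes:
    print(f"{h:>3} x {w:>3} x {c}")
```

Under these assumptions the number of values per feature map falls by a factor of four at each stage while the channel count only doubles, so the total activation volume halves from one stage to the next.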

artificial intelligence, convolution, machine learning, (19 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)