deep convolutional neural network
Global information is essential for dense prediction problems, whose goal is to compute a discrete or continuous label for each pixel in the image. Traditional convolutional layers in neural networks, initially designed for image classification, are restrictive in these problems since the filter size limits their receptive fields. In this work, we propose to replace any traditional convolutional layer with an autoregressive moving-average (ARMA) layer, a novel module with an adjustable receptive field controlled by the learnable autoregressive coefficients.
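The abstract above contrasts an ordinary (moving-average) convolution with feedback through autoregressive coefficients. A minimal 1-D sketch of that idea is below; the function name, the scalar 1-D setting, and the specific coefficients are my illustrative assumptions, not the paper's implementation:

```python
# Hypothetical 1-D ARMA layer sketch: the moving-average (MA) term is an
# ordinary convolution over the input, while the autoregressive (AR) term
# feeds previous *outputs* back in, so the receptive field is governed by
# the learnable AR coefficients rather than the kernel size alone.

def arma_layer_1d(x, ma_weights, ar_weights):
    """y[t] = sum_j ma_weights[j]*x[t-j] + sum_k ar_weights[k]*y[t-1-k]."""
    y = []
    for t in range(len(x)):
        # Moving-average (plain convolution) term over the input.
        ma = sum(w * x[t - j] for j, w in enumerate(ma_weights) if t - j >= 0)
        # Autoregressive term over previously computed outputs.
        ar = sum(a * y[t - 1 - k] for k, a in enumerate(ar_weights) if t - 1 - k >= 0)
        y.append(ma + ar)
    return y

signal = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   # unit impulse
out = arma_layer_1d(signal, ma_weights=[1.0], ar_weights=[0.5])
# The impulse's influence decays geometrically but never truncates:
# AR feedback gives an effectively unbounded receptive field.
```

With an AR coefficient of 0.5 the impulse response is 1, 0.5, 0.25, 0.125, …, whereas a pure convolution with a length-1 kernel would stop after a single tap.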
Recurrence along Depth: Deep Convolutional Neural Networks with Recurrent Layer Aggregation
This paper introduces a concept of layer aggregation to describe how information from previous layers can be reused to better extract features at the current layer. While DenseNet is a typical example of the layer aggregation mechanism, its redundancy has been commonly criticized in the literature. This motivates us to propose a very lightweight module, called recurrent layer aggregation (RLA), which makes use of the sequential structure of layers in a deep CNN. Our RLA module is compatible with many mainstream deep CNNs, including ResNets, Xception and MobileNetV2, and its effectiveness is verified by our extensive experiments on image classification, object detection and instance segmentation tasks. Specifically, improvements can be uniformly observed on the CIFAR, ImageNet and MS COCO datasets, and the corresponding RLA-Nets can surprisingly boost performance by 2-3% on the object detection task. This demonstrates the power of our RLA module in helping main CNNs better learn structural information in images.
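The core contrast in the abstract above is between DenseNet's dense cross-layer connections and a single recurrent state shared across layers. A toy sketch of the recurrent-aggregation idea follows; the scalar features, function name, and weights are illustrative assumptions, not the RLA module's actual design:

```python
# Hedged sketch of recurrent layer aggregation: one hidden state is
# updated as each layer's output arrives, reusing a single small set of
# recurrent weights, instead of wiring every layer to every later layer
# as DenseNet does. Layers play the role of time steps in an RNN.

def rla_aggregate(layer_outputs, w_h=0.5, w_x=0.5):
    """Fold a sequence of per-layer features into one recurrent state."""
    h = 0.0
    history = []
    for x in layer_outputs:          # iterate along network depth
        h = w_h * h + w_x * x        # shared (recurrent) parameters
        history.append(h)
    return history

# Features from four successive layers (scalars for illustration):
hs = rla_aggregate([1.0, 2.0, 3.0, 4.0])
```

The parameter count is constant in depth (one `w_h`, one `w_x`), which is the sense in which such a module is lightweight compared to dense aggregation.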
Understanding the Effective Receptive Field in Deep Convolutional Neural Networks
We study characteristics of receptive fields of units in deep convolutional networks. The receptive field size is a crucial issue in many visual tasks, as the output must respond to large enough areas in the image to capture information about large objects. We introduce the notion of an effective receptive field size, and show that it has a Gaussian distribution and occupies only a fraction of the full theoretical receptive field. We analyze the effective receptive field in several architecture designs, and the effect of sub-sampling, skip connections, dropout and nonlinear activations on it. This leads to suggestions for ways to address its tendency to be too small.
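The Gaussian shape of the effective receptive field claimed above can be seen numerically: the influence of each input on the centre output of a stack of convolutions is the repeated self-convolution of the kernel, which the central limit theorem drives toward a Gaussian. The uniform 3-tap kernel and layer count below are my illustrative choices:

```python
# Stacking n uniform 3-tap convolutions gives a theoretical receptive
# field of 2n+1 inputs, but the weight each input contributes to the
# centre output is the n-fold self-convolution of the kernel -- which
# approaches a Gaussian, so most of the theoretical field carries
# almost no weight (the "effective" receptive field is much smaller).

def convolve(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

kernel = [1 / 3, 1 / 3, 1 / 3]
weights = [1.0]
for _ in range(10):                  # 10 stacked conv layers
    weights = convolve(weights, kernel)

center = len(weights) // 2
# Weight mass inside the central 7 taps of the 21-tap theoretical field:
central_mass = sum(weights[center - 3:center + 4])
```

Even though the theoretical receptive field spans 21 inputs, well over half of the total influence is concentrated in the central 7 taps, illustrating why the effective field is "only a fraction" of the theoretical one.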
Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks
In most existing deep convolutional neural networks (CNNs) for classification, global average (first-order) pooling (GAP) has become a standard module that summarizes activations of the last convolution layer into the final representation for prediction. Recent research shows that integrating higher-order pooling (HOP) methods clearly improves the performance of deep CNNs. However, both GAP and existing HOP methods assume unimodal distributions, which cannot fully capture the statistics of convolutional activations, limiting the representation ability of deep CNNs, especially for samples with complex contents. To overcome this limitation, this paper proposes a global Gated Mixture of Second-order Pooling (GM-SOP) method to further improve the representation ability of deep CNNs. To this end, we introduce a sparsity-constrained gating mechanism and propose a novel parametric SOP as a component of the mixture model.
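The distinction the abstract above draws between first-order (GAP) and second-order pooling can be made concrete with a tiny example. The sketch below shows only the two pooling operators themselves (the paper's gated mixture and parametric SOP are not reproduced), and the toy feature values are my own:

```python
# First-order vs second-order pooling: GAP keeps only the per-channel
# mean over spatial positions, while second-order pooling keeps the
# average outer product x x^T (a C x C matrix), capturing cross-channel
# statistics that the mean discards.

def gap(features):
    """Global average pooling: per-channel mean over spatial positions."""
    n, c = len(features), len(features[0])
    return [sum(f[j] for f in features) / n for j in range(c)]

def second_order_pool(features):
    """Average outer product over spatial positions -> C x C matrix."""
    n, c = len(features), len(features[0])
    m = [[0.0] * c for _ in range(c)]
    for f in features:
        for i in range(c):
            for j in range(c):
                m[i][j] += f[i] * f[j] / n
    return m

# Two feature maps with identical per-channel means but different
# cross-channel structure: GAP cannot tell them apart, SOP can.
fa = [[1.0, 1.0], [-1.0, -1.0]]     # channels move together
fb = [[1.0, -1.0], [-1.0, 1.0]]     # channels move oppositely
```

Here `gap(fa)` and `gap(fb)` are both the zero vector, while the second-order statistics differ in the off-diagonal (cross-channel) entries, which is exactly the extra information HOP methods exploit.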
ResNet: Enabling Deep Convolutional Neural Networks through Residual Learning
Abstract--Convolutional Neural Networks (CNNs) have revolutionised computer vision, but training very deep networks has been challenging due to the vanishing gradient problem. This paper explores Residual Networks (ResNet), introduced by He et al. (2015), which overcome this limitation by using skip connections. ResNet enables the training of networks with hundreds of layers by allowing gradients to flow directly through shortcut connections that bypass intermediate layers. In our implementation on the CIFAR-10 dataset, ResNet-18 achieves 89.9% accuracy compared to 84.1% for a traditional deep CNN of similar depth, while also converging faster and training more stably. Deep Convolutional Neural Networks (CNNs) have become the foundation of modern computer vision, powering applications from image classification to object detection.
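The skip-connection mechanism described above reduces to a one-line computation, y = F(x) + x. A minimal sketch follows; the toy residual branch stands in for the conv-BN-ReLU stack of a real block and is purely illustrative:

```python
# Minimal sketch of the residual idea behind ResNet: a block computes
# y = F(x) + x, so the layers only have to learn the residual F, and
# the identity shortcut lets gradients pass through unchanged.

def residual_block(x, f):
    """Skip connection: output = residual branch + identity shortcut."""
    return [fi + xi for fi, xi in zip(f(x), x)]

# If the residual branch learns to output zeros, the block is an exact
# identity map -- the property that makes very deep stacks trainable,
# since extra blocks can at worst do nothing rather than degrade the net.
zero_branch = lambda x: [0.0] * len(x)
x = [1.0, 2.0, 3.0]
y = residual_block(x, zero_branch)
```

Because the shortcut contributes the identity term to the gradient, the vanishing-gradient problem mentioned in the abstract is mitigated: even if the residual branch's gradient shrinks, the gradient through the shortcut does not.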
Sarcasm Detection Using Deep Convolutional Neural Networks: A Modular Deep Learning Framework
Abstract--Sarcasm is a nuanced and often misinterpreted form of communication, especially in text, where tone and body language are absent. This paper presents a proposed modular deep learning framework for sarcasm detection, leveraging Deep Convolutional Neural Networks (DCNNs) and contextual models like BERT to analyze linguistic, emotional, and contextual cues [1][2]. The system is conceptually designed to integrate sentiment analysis, contextual embeddings, linguistic feature extraction, and emotion detection through a multi-layer architecture. Although the model is not yet implemented, the design demonstrates feasibility for real-world applications like chatbots and social media monitoring [9][11]. Sarcasm detection is vital for enhancing the interpretability of automated systems like sentiment analyzers, chatbots, and recommendation engines [9][13].
Deep Convolutional Neural Network for Image Deconvolution
Many fundamental image-related problems involve deconvolution operators. Real blur degradation seldom complies with an ideal linear convolution model due to camera noise, saturation, and image compression, to name a few. Instead of perfectly modeling outliers, which is rather challenging from a generative model perspective, we develop a deep convolutional neural network to capture the characteristics of degradation. We note that directly applying existing deep neural networks does not produce reasonable results. Our solution is to establish the connection between traditional optimization-based schemes and a neural network architecture where a novel, separable structure is introduced as a reliable support for robust deconvolution against artifacts. Our network contains two submodules, both trained in a supervised manner with proper initialization. They yield decent performance on non-blind image deconvolution compared to previous generative-model based methods.
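The abstract's point that real blur violates the ideal linear convolution model can be illustrated with saturation, one of the outliers it lists: once the sensor clips bright pixels, information is destroyed that no linear inverse filter can recover. The 1-D toy signal and kernel below are my own illustrative choices:

```python
# Toy illustration of saturation breaking the linear blur model: the
# camera records clip(k * x), not k * x, so a deconvolution method that
# only inverts the linear kernel k operates on corrupted observations.

def blur(signal, kernel):
    """Valid-mode 1-D linear convolution (the ideal degradation model)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def clip(signal, limit=1.0):
    """Sensor saturation: intensities above the limit are lost."""
    return [min(v, limit) for v in signal]

sharp = [0.0, 0.0, 3.0, 0.0, 0.0]        # a bright, overexposed spike
kernel = [0.25, 0.5, 0.25]

ideal = blur(sharp, kernel)               # what linear models assume
observed = clip(ideal)                    # what the camera records
```

The clipped centre sample makes `observed` inconsistent with every possible sharp input under the linear model, which is why the paper learns the degradation characteristics with a network instead of assuming an exactly invertible convolution.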