MorphPool: Efficient Non-linear Pooling & Unpooling in CNNs

Groenendijk, Rick, Dorst, Leo, Gevers, Theo

Nov-25-2022–arXiv.org Artificial Intelligence

Contemporary deep learning architectures use pooling operations for two reasons: to filter impactful activation values from feature maps, and to reduce the spatial size of the features [28]. The most widely used pooling operation is max pooling, which appears in nearly all common network architectures, such as ResNet [14], VGGNet [32], and DenseNet [16]. These architectures can also be applied to pixel-level prediction tasks, such as semantic segmentation. To do so, inputs are first down-sampled to a set of latent features of small spatial size and then up-sampled back to full resolution. Up-sampling from pooled feature sets is most often performed with a combination of unpooling and deconvolution [41, 42], as in seminal works such as [3, 22, 26]. As will be shown in this paper, down-sampling using max pooling can be formalised and improved using mathematical morphology, the mathematics of contact. Since the work of Serra [29], the underlying algebraic structure of data acquired by probing contact (e.g. LiDAR and radar) has been known to the computer vision community [5, 11, 25, 33]. This structure differs from the algebra of linear diffusion on which convolutional neural networks (CNNs) are built.
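The link between max pooling and morphology mentioned above can be illustrated with a minimal NumPy sketch. It is not the MorphPool operator from the paper. It only shows the standard observation that non-overlapping max pooling gives the same result as a grayscale dilation with a flat structuring element followed by subsampling, together with index-based unpooling. The function names (`max_pool2d`, `max_unpool2d`, `flat_dilation`) and the single-channel 2D input are assumptions made for illustration.

```python
import numpy as np

def max_pool2d(x, k=2):
    """Non-overlapping k x k max pooling on a 2D array.
    Returns the pooled values and the argmax position inside each window."""
    h, w = x.shape
    assert h % k == 0 and w % k == 0, "input size must be divisible by k"
    # Rearrange so that every k x k window is flattened into the last axis.
    windows = (x.reshape(h // k, k, w // k, k)
                .transpose(0, 2, 1, 3)
                .reshape(h // k, w // k, k * k))
    idx = windows.argmax(axis=-1)
    pooled = windows.max(axis=-1)
    return pooled, idx

def max_unpool2d(pooled, idx, k=2):
    """Place each pooled value back at its recorded argmax position;
    all other positions are filled with zeros."""
    ph, pw = pooled.shape
    windows = np.zeros((ph, pw, k * k), dtype=pooled.dtype)
    np.put_along_axis(windows, idx[..., None], pooled[..., None], axis=-1)
    # Invert the window rearrangement used in max_pool2d.
    return (windows.reshape(ph, pw, k, k)
                   .transpose(0, 2, 1, 3)
                   .reshape(ph * k, pw * k))

def flat_dilation(x, k=2):
    """Grayscale dilation with a flat k x k structuring element:
    out[i, j] = max of x[i:i+k, j:j+k], with -inf beyond the border."""
    x = np.asarray(x, dtype=float)
    h, w = x.shape
    padded = np.pad(x, ((0, k - 1), (0, k - 1)), constant_values=-np.inf)
    out = np.full((h, w), -np.inf)
    for di in range(k):
        for dj in range(k):
            out = np.maximum(out, padded[di:di + h, dj:dj + w])
    return out

x = np.array([[1., 2., 5., 0.],
              [3., 4., 1., 1.],
              [0., 0., 2., 9.],
              [7., 1., 3., 4.]])
pooled, idx = max_pool2d(x, k=2)
dilated_then_subsampled = flat_dilation(x, k=2)[::2, ::2]
restored = max_unpool2d(pooled, idx, k=2)
```

Here `pooled` and `dilated_then_subsampled` hold the same values, while `restored` has the input's shape with only the window maxima kept in place. A deep learning framework would normally provide these operations; the explicit loops are kept only to make the max-over-window structure visible.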



