Convolutions Through the Lens of Tensor Networks

Dangel, Felix

arXiv.org Artificial Intelligence 

Despite their simple intuition, convolutions are more tedious to analyze than dense layers, which complicates generalizing theoretical and algorithmic ideas. We provide a new perspective on convolutions through tensor networks (TNs), which allow reasoning about the underlying tensor multiplications by drawing diagrams and manipulating them to perform function transformations, sub-tensor access, and fusion. We demonstrate this expressive power by deriving the diagrams of various autodiff operations and popular approximations of second-order information, with full hyper-parameter support, batching, channel groups, and generalization to arbitrary convolution dimensions. Further, we provide convolution-specific transformations based on the connectivity pattern which allow diagrams to be re-wired and simplified before evaluation. Finally, we probe computational performance, relying on established machinery for efficient TN contraction. Our TN implementation speeds up a recently-proposed KFAC variant by up to 4.5× and enables new hardware-efficient tensor dropout for approximate backpropagation.

Despite the success of transformers [68], CNNs continue to be widely used and show competitive performance when incorporating architecture modernizations [41; 40] and attention [30; 70; 11; 39]. While the intuition behind convolution is easy to grasp from graphical illustrations such as those in Dumoulin & Visin [20], convolutions are more challenging to analyze than fully-connected layers in multi-layer perceptrons (MLPs). One reason is that the operation is hard to express in matrix notation, and, even when switching to index notation, compact expressions that are convenient to work with only exist for special hyper-parameter choices [e.g. The many hyper-parameters of convolution and additional features like channel groups [35] introduce further complexity, and related objects like (higher-order) derivatives and the associated autodiff routines inherit it. TNs express tensor multiplications as diagrams (Figure 1).
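To make the TN view concrete, the following is a minimal sketch of 2D convolution written as a single tensor contraction, assuming unit stride, no padding, a single channel group, and batch size one. Binary "index pattern" tensors encode the connectivity between input positions, output positions, and kernel offsets along each spatial dimension; the convolution then reduces to one einsum over input, patterns, and kernel. The helper `make_index_pattern` and the names `Pi1`/`Pi2` are illustrative choices for this sketch, not identifiers from the paper's implementation.

```python
import torch


def make_index_pattern(input_size: int, kernel_size: int) -> torch.Tensor:
    """Binary tensor Pi of shape (input_size, output_size, kernel_size)
    with Pi[i, o, k] = 1 iff input position i = o + k (stride 1, no padding).
    Illustrative helper; not from the paper's code."""
    output_size = input_size - kernel_size + 1
    Pi = torch.zeros(input_size, output_size, kernel_size)
    for o in range(output_size):
        for k in range(kernel_size):
            Pi[o + k, o, k] = 1.0
    return Pi


C_in, C_out, I1, I2, K1, K2 = 3, 4, 8, 8, 3, 3
X = torch.randn(1, C_in, I1, I2)      # one input image
W = torch.randn(C_out, C_in, K1, K2)  # convolution kernel

Pi1 = make_index_pattern(I1, K1)      # connectivity along height
Pi2 = make_index_pattern(I2, K2)      # connectivity along width

# Contract the TN: sum over input channels, spatial positions, and kernel offsets.
# Y[n, o, a, b] = sum_{c,i,j,k,l} X[n,c,i,j] Pi1[i,a,k] Pi2[j,b,l] W[o,c,k,l]
Y = torch.einsum("ncij,iak,jbl,ockl->noab", X, Pi1, Pi2, W)

# Check against PyTorch's built-in convolution (a cross-correlation, like the
# einsum above).
Y_ref = torch.nn.functional.conv2d(X, W)
assert torch.allclose(Y, Y_ref, atol=1e-5)
```

In practice, the contraction order matters for efficiency; libraries such as opt_einsum can search for a good order automatically, which is the kind of established TN-contraction machinery the abstract refers to.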
