MLP-Mixer: An all-MLP Architecture for Vision

May-27-2025, 04:13:52 GMT–Neural Information Processing Systems

Convolutional Neural Networks (CNNs) are the go-to model for computer vision. Recently, attention-based networks, such as the Vision Transformer, have also become popular. In this paper we show that while convolutions and attention are both sufficient for good performance, neither of them is necessary. We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs). MLP-Mixer contains two types of layers: one with MLPs applied independently to image patches (i.e.
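
The first layer type named in the abstract, an MLP applied independently to each image patch, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper's code: the helper names (`make_mlp`, `per_patch_mlp`), the two-layer shape without biases, the GELU nonlinearity, the random initialisation, and the toy sizes. The point it shows is only that the same weights are reused for every patch and that no information flows between patches inside this layer.

```python
import math
import random


def gelu(x):
    # Exact GELU via the Gaussian error function (assumed nonlinearity).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matvec(weights, vec):
    # weights is a list of rows; each row has the same length as vec.
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def make_mlp(in_dim, hidden_dim, seed=0):
    # Hypothetical helper: two weight matrices, no biases, small Gaussian init.
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0.0, 0.02) for _ in range(hidden_dim)] for _ in range(in_dim)]
    return w1, w2


def per_patch_mlp(patches, params):
    # Apply the same two-layer MLP to every patch vector separately.
    # patches: list of feature vectors, one per image patch.
    w1, w2 = params
    outputs = []
    for patch in patches:
        hidden = [gelu(h) for h in matvec(w1, patch)]
        outputs.append(matvec(w2, hidden))
    return outputs


# Toy input: 4 patches with 8 features each.
params = make_mlp(in_dim=8, hidden_dim=16)
patches = [[float(i + j) for j in range(8)] for i in range(4)]
mixed = per_patch_mlp(patches, params)
```

Because each patch is processed on its own, changing one patch leaves the outputs for all other patches unchanged.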

all-mlp architecture, image understanding, machine learning

Technology:
- Information Technology > Artificial Intelligence
  - Vision > Image Understanding (0.91)
  - Machine Learning > Neural Networks
    - Perceptrons (0.65)