Conditional computation in neural networks: principles and research trends
Scardapane, Simone; Baiocchi, Alessandro; Devoto, Alessio; Marsocci, Valerio; Minervini, Pasquale; Pomponi, Jary
arXiv.org Artificial Intelligence
We focus on neural networks that can dynamically activate or deactivate parts of their computational graph conditionally on their input. Examples include the dynamic selection of input tokens, layers (or sets of layers), and sub-modules inside each layer (e.g., channels in a convolutional filter). We first provide a general formalism to describe these techniques in a uniform way. Then, we introduce three notable implementations of these principles: mixture-of-experts (MoE) networks, token selection mechanisms, and early-exit neural networks. The paper aims to provide a tutorial-like introduction to this growing field. To this end, we analyze the benefits of these modular designs in terms of efficiency, explainability, and transfer learning, with a focus on emerging application areas ranging from automated scientific discovery to semantic communication.
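As a rough illustration of input-conditional computation, the sketch below shows one of the three families named in the abstract, an early-exit network, in plain Python. It is not taken from the paper: the function names `softmax` and `early_exit_forward`, the confidence threshold of 0.9, and the toy blocks and heads are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_forward(x, blocks, heads, threshold=0.9):
    """Run blocks in order and stop at the first confident auxiliary head.

    Hypothetical helper (not from the paper). `blocks` and `heads` are
    equal-length lists of callables; each head maps the hidden state to
    class logits. Returns (predicted_class, blocks_executed).
    """
    if not blocks or len(blocks) != len(heads):
        raise ValueError("need at least one block and one head per block")
    h = x
    for depth, (block, head) in enumerate(zip(blocks, heads), start=1):
        h = block(h)
        probs = softmax(head(h))
        # Input-dependent decision: skip all remaining blocks if confident.
        if max(probs) >= threshold:
            break
    return probs.index(max(probs)), depth


# Toy setup (assumed): each block doubles the features, each head uses
# the hidden state directly as two-class logits.
blocks = [lambda h: [2.0 * v for v in h]] * 3
heads = [lambda h: list(h)] * 3

easy = early_exit_forward([3.0, 0.0], blocks, heads)   # exits after block 1
hard = early_exit_forward([0.1, 0.0], blocks, heads)   # runs all 3 blocks
```

In this toy setting, an input that is already well separated leaves the network after a single block, while an ambiguous one uses the full depth, so the amount of computation depends on the input.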
Jul-8-2024
- Country:
- South America > Colombia
- Meta Department > Villavicencio (0.04)
- North America > United States
- Washington > King County > Seattle (0.04)
- Europe
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Belgium > Flanders
- Flemish Brabant > Leuven (0.04)
- Asia
- Indonesia > Bali (0.04)
- Singapore (0.04)
- Middle East
- Jordan (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Genre:
- Overview (0.93)
- Instructional Material > Course Syllabus & Notes (0.48)
- Technology: