Convolutional Neural Fabrics
Despite the success of CNNs, selecting the optimal architecture for a given task remains an open problem. Instead of aiming to select a single optimal architecture, we propose a "fabric" that embeds an exponentially large number of architectures. The fabric consists of a 3D trellis that connects response maps at different layers, scales, and channels with a sparse homogeneous local connectivity pattern.
Reviews: Convolutional Neural Fabrics
The idea is quite interesting and timely: eliminating some of the many hyperparameters that must be tuned in CNN design would be a welcome development with potentially high impact. I like how Figure 5 demonstrates that the learning process is indeed capable of configuring the trellis as needed. It is somewhat unfortunate that all experiments in the paper use fabrics with a constant number of channels per scale: this goes against common practice in CNN design and wastes capacity, since the size of the representation shrinks as the level of abstraction increases, which is typically counteracted by using more channels at higher abstraction levels. The paper states that experiments with channel doubling are ongoing, but these should really be part of the paper, as they are much more relevant.
Convolutional Neural Fabrics
Shreyas Saxena, Jakob Verbeek
Despite the success of CNNs, selecting the optimal architecture for a given task remains an open problem. Instead of aiming to select a single optimal architecture, we propose a "fabric" that embeds an exponentially large number of architectures. The fabric consists of a 3D trellis that connects response maps at different layers, scales, and channels with a sparse homogeneous local connectivity pattern. While individual architectures can be recovered as paths, the fabric can in addition ensemble all embedded architectures together, sharing their weights where their paths overlap. Parameters can be learned using standard methods based on back-propagation, at a cost that scales linearly in the fabric size.
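To make the trellis idea concrete, the following is a minimal numpy sketch, not the paper's implementation: it uses 1x1 channel mixing instead of learned 3x3 convolutions, and nearest-neighbor resampling between scales. Each node at (layer, scale) sums contributions from the finer, same, and coarser scales of the previous layer, so every path through the grid corresponds to one embedded architecture and overlapping paths share the edge weights. The class name `Fabric` and all shapes here are illustrative assumptions.

```python
import numpy as np

def downsample(x):
    # halve spatial dims by subsampling; x has shape (C, H, W)
    return x[:, ::2, ::2]

def upsample(x):
    # double spatial dims by nearest-neighbor repetition
    return x.repeat(2, axis=1).repeat(2, axis=2)

class Fabric:
    """Toy trellis: layers x scales grid with sparse local connectivity."""
    def __init__(self, layers, scales, channels, rng):
        self.layers, self.scales = layers, scales
        # one channel-mixing weight per edge (layer, scale, scale-offset);
        # the paper uses 3x3 convolutions here, we use 1x1 mixing for brevity
        self.W = {(l, s, d): 0.1 * rng.standard_normal((channels, channels))
                  for l in range(1, layers)
                  for s in range(scales)
                  for d in (-1, 0, 1)}

    def forward(self, x):
        # layer 0: build the scale pyramid by repeated downsampling
        acts = [x]
        for _ in range(1, self.scales):
            acts.append(downsample(acts[-1]))
        # subsequent layers: each node aggregates its three scale neighbors
        for l in range(1, self.layers):
            new = []
            for s in range(self.scales):
                total = np.zeros_like(acts[s])
                for d in (-1, 0, 1):
                    sn = s + d
                    if not (0 <= sn < self.scales):
                        continue
                    h = acts[sn]
                    if d == 1:   # coarser neighbor: bring up to this scale
                        h = upsample(h)
                    if d == -1:  # finer neighbor: bring down to this scale
                        h = downsample(h)
                    total += np.einsum('oc,chw->ohw', self.W[(l, s, d)], h)
                new.append(np.maximum(total, 0.0))  # ReLU
            acts = new
        return acts  # one response map per scale at the final layer
```

Because every node only talks to its immediate scale neighbors, the cost of a forward/backward pass is linear in the number of trellis nodes, matching the abstract's claim about back-propagation scaling linearly in fabric size.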
DeepMind's PathNet: A Modular Deep Learning Architecture for AGI – Intuition Machine
Unlike more traditional monolithic DL networks, PathNet reuses a network that consists of many neural networks and trains them to perform multiple tasks. In the authors' experiments, a network trained on a second task learns faster than a network trained from scratch. This indicates that transfer learning (or knowledge reuse) can be leveraged in this kind of network. PathNet combines aspects of transfer learning, continual learning, and multitask learning. These aspects are essential for a more continuously adaptive network, and thus the approach may (speculatively) be a step toward AGI.
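The reuse mechanism described above can be sketched in a few lines. This is a simplified illustration, not DeepMind's implementation: a grid of linear modules stands in for the network-of-networks, a "path" is the set of active modules per layer, and modules on the winning path of task A are frozen before task B trains, so overlapping modules are reused but not overwritten. All names, paths, and sizes here are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
L, M, D = 3, 4, 8  # layers, modules per layer, feature width

# grid of linear modules; PathNet evolves which subset is active per layer
modules = 0.1 * rng.standard_normal((L, M, D, D))
frozen = np.zeros((L, M), dtype=bool)  # winners of earlier tasks get frozen

def forward(x, path):
    # path[l] lists the active module indices in layer l;
    # active outputs are summed, then passed through a ReLU
    for l in range(L):
        x = np.maximum(sum(x @ modules[l, m] for m in path[l]), 0.0)
    return x

path_a = [[0, 1], [2], [1, 3]]    # hypothetical winning path on task A
for l in range(L):
    frozen[l, path_a[l]] = True    # freeze task-A winners

path_b = [[1, 2], [0, 2], [0, 1]]  # hypothetical candidate path for task B
# modules shared with task A are reused (forward pass) but never updated:
trainable = [(l, m) for l in range(L) for m in path_b[l] if not frozen[l, m]]
```

The `trainable` list is the key point: task B's gradient updates touch only the unfrozen modules on its path, which is what lets knowledge from task A transfer without being destroyed.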