Ray-Tracing for Conditionally Activated Neural Networks

Claudio Gallicchio, Giuseppe Nuti

arXiv.org Artificial Intelligence 

ABSTRACT

In this paper, we introduce a novel architecture for conditionally activated neural networks that combines a hierarchical construction of multiple Mixture of Experts (MoE) layers with a sampling mechanism that progressively converges to an optimized configuration of expert activation. This methodology enables the dynamic unfolding of the network's architecture, facilitating efficient path-specific training. Experimental results demonstrate that this approach achieves competitive accuracy compared to conventional baselines while significantly reducing the parameter count required for inference. The approach we propose implements a neural network in which blocks (experts) are stacked over multiple layers. By expressing each block's output as the expected firing rate of a stochastic computation path, we can simultaneously solve the inference and the selective activation problems. Importantly, since we model every block's output as its expected activation rate, initiating a computational path from the input nodes or from a block in the middle of the network yields comparable results, allowing for a variety of new computational approaches that balance width-first versus depth-first computation.
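To make the idea of treating each block's output as an expected firing rate concrete, the following is a minimal sketch of one conditionally activated MoE layer. It assumes a softmax gate whose probabilities stand in for the experts' expected activation rates and a simple threshold-based rule for skipping experts at inference; the class, parameter names, and pruning rule are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ProbabilisticMoELayer(nn.Module):
    """Sketch of a single MoE layer whose output is the expectation over
    stochastically activated experts. Gating probabilities play the role
    of the experts' expected firing rates (illustrative, not the paper's code)."""

    def __init__(self, dim, num_experts, hidden=64, threshold=0.05):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(dim, num_experts)
        self.threshold = threshold  # assumed pruning rule, not from the paper

    def forward(self, x):
        # Expected firing rate of each expert given the input.
        rates = torch.softmax(self.gate(x), dim=-1)  # (batch, num_experts)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            r = rates[:, i:i + 1]
            # Conditional activation: skip experts whose expected firing
            # rate is negligible for the whole batch (inference only).
            if self.training or r.max() > self.threshold:
                out = out + r * expert(x)
        # The layer output is the expectation of the stochastic computation path.
        return out


# Stacking such layers gives a hierarchical construction: each layer consumes
# the expected output of the one below, so starting the computation from a
# mid-network block uses the same interface as starting from the input.
model = nn.Sequential(*(ProbabilisticMoELayer(dim=16, num_experts=4) for _ in range(3)))
y = model(torch.randn(8, 16))
```

Because every layer emits an expectation rather than a sampled path, inference can prune low-rate experts without retraining, which is one way to read the parameter-count reduction claimed above; the specific thresholding shown here is only one possible realization of that idea.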