Encoding architecture algebra
Stephane Bersier, Xinyi Chen-Lin
arXiv.org Artificial Intelligence
There is growing awareness of the importance of designing model architectures that capture and respect the distinct structure of input data. Many successful deep learning architectures, such as transformers [1], convolutional neural networks (CNNs) [2], graph neural networks (GNNs) [3], and recurrent neural networks (RNNs) [4], inherently incorporate aspects of data structure. Ongoing research focuses on refining existing architectures as well as designing new ones for other types of structured data. For instance, DeepSets [5] are tailored to process sets, group- and gauge-equivariant CNNs [6][7] respect both global and local symmetries in the data, and strongly-typed RNNs [8] incorporate explicit types within recurrent networks. By accounting for the structure of the input data, these architectures achieve improved performance, better generalization with fewer parameters, and enhanced interpretability.
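To make the structural claim concrete, the following is a minimal sketch (assuming PyTorch; the layer sizes and the names phi and rho are illustrative, not taken from the paper) of the DeepSets pattern: summing per-element embeddings before decoding makes the encoder invariant to the ordering of set elements.

```python
import torch
import torch.nn as nn

class DeepSet(nn.Module):
    """Permutation-invariant set encoder in the spirit of DeepSets [5]:
    f(X) = rho(sum_i phi(x_i)). Summing over elements guarantees that
    reordering the input set does not change the output."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.phi = nn.Sequential(          # per-element encoder
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        self.rho = nn.Sequential(          # decoder on the pooled representation
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):                  # x: (batch, set_size, in_dim)
        pooled = self.phi(x).sum(dim=1)    # permutation-invariant pooling
        return self.rho(pooled)

# Shuffling the set elements leaves the output unchanged.
model = DeepSet(in_dim=3, hidden_dim=64, out_dim=1)
x = torch.randn(2, 5, 3)
perm = torch.randperm(5)
assert torch.allclose(model(x), model(x[:, perm]), atol=1e-5)
```

The invariance here is architectural rather than learned: any permutation-sensitive pooling (e.g. concatenation) in place of the sum would break the guarantee, which is the sense in which such models "respect" the structure of their input.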
Oct-15-2024