PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks

Ta, Hoang-Thang, Thai, Duy-Quy, Tran, Anh, Sidorov, Grigori, Gelbukh, Alexander

arXiv.org Artificial Intelligence 

MLPs have been one of the key components of modern neural network architectures for years. Their simplicity makes them widely used for capturing complex relationships through multiple layers of non-linear transformations. However, their role has recently been reconsidered with the revival of Kolmogorov-Arnold Networks (KANs) [1, 2]. In these papers, the fixed activation functions of MLPs, which sit on the "nodes" of the network, are replaced with learnable activation functions such as B-splines placed on the "edges," improving performance on mathematical and physical examples. KANs are inspired by the Kolmogorov-Arnold Representation Theorem (KART) [4], which was introduced to address Hilbert's 13th problem [3] and posits that any continuous function of multiple variables can be represented as a finite sum of compositions of continuous functions of single variables. The work of Liu et al. [1] on KANs has inspired numerous studies that explore various basis and polynomial functions as replacements for B-splines [5, 6, 7, 8, 9, 10, 11, 12, 13] and investigate the models' performance relative to MLPs. Several studies have shown that KANs do not always outperform MLPs when using the same training parameters [14, 15]. Moreover, while KANs can achieve better performance than MLPs with the same network structure, they often require a significantly larger number of parameters [7, 16, 17, 18, 19].
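For reference, the KART decomposition alluded to above can be written, in its classical form, as

f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q \left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)

where the outer functions \Phi_q and inner functions \phi_{q,p} are continuous functions of a single variable.

The parameter-count gap noted in the last sentence can be illustrated with a back-of-the-envelope sketch. The accounting below follows what common open-source B-spline KAN implementations use (roughly grid_size + spline_order + 2 learnable coefficients per input-output edge); the exact count varies by implementation, and the default values here are illustrative, not taken from the PRKAN paper.

def mlp_layer_params(d_in: int, d_out: int) -> int:
    # dense layer: weight matrix plus bias vector
    return d_in * d_out + d_out

def kan_layer_params(d_in: int, d_out: int,
                     grid_size: int = 5, spline_order: int = 3) -> int:
    # each of the d_in * d_out edges carries a learnable spline with
    # (grid_size + spline_order) basis coefficients, plus a base weight
    # and a spline scaler; counts differ slightly across implementations
    return d_in * d_out * (grid_size + spline_order + 2)

if __name__ == "__main__":
    d_in = d_out = 64
    print(f"MLP layer parameters: {mlp_layer_params(d_in, d_out)}")  # 4,160
    print(f"KAN layer parameters: {kan_layer_params(d_in, d_out)}")  # 40,960

Under these assumptions, a B-spline KAN layer of the same width costs roughly an order of magnitude more parameters than the corresponding dense layer, which is the gap that motivates parameter-reduced variants such as PRKAN.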
