Why are hyperbolic neural networks effective? A study on hierarchical representation capability
Shicheng Tan, Huanjing Zhao, Shu Zhao, Yanping Zhang
–arXiv.org Artificial Intelligence
Hyperbolic Neural Networks (HNNs), operating in hyperbolic space, have been widely applied in recent years, motivated by the existence of an optimal embedding in hyperbolic space that preserves hierarchical relationships in data (termed Hierarchical Representation Capability, HRC) more accurately than Euclidean space. However, there is no evidence that HNNs actually achieve this theoretically optimal embedding, so much research has been built on a flawed motivation. In this paper, we propose a benchmark for evaluating HRC and conduct a comprehensive analysis of why HNNs are effective through large-scale experiments. Inspired by the analysis, we propose several pre-training strategies that enhance HRC and improve performance on downstream tasks, further validating the reliability of the analysis. Experiments show that HNNs cannot achieve the theoretically optimal embedding; HRC is significantly affected by the optimization objectives and hierarchical structures, and enhancing HRC through pre-training strategies significantly improves the performance of HNNs.

Figure 1: In theory, there exists an optimal embedding for hierarchical data in hyperbolic space, but HNNs can be affected by various factors and may not achieve it.

However, existing research on the performance of hyperbolic space only proves, in theory, that embeddings in hyperbolic space can attain minimum distortion (Sala et al., 2018; Tabaghi & Dokmanić, 2020); it does not prove that any particular method used in hyperbolic space has the best HRC. Later work (2021) theoretically demonstrated that the effectiveness of hyperbolic space is limited to ideal noiseless settings, and that scarce or imbalanced data falls outside this guarantee. Therefore, the effectiveness of HNNs cannot simply be attributed to the HRC of hyperbolic spaces. In particular, the performance of specific HNN methods is clearly affected by their optimization objectives and data. Agibetov et al. (2019) observed that classifiers in hyperbolic spaces can be inferior to their Euclidean counterparts.
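The minimum-distortion property that the cited works formalize can be illustrated numerically. The sketch below is not from the paper; `poincare_distance` is an illustrative helper assuming the standard Poincaré ball model. It shows that for a toy tree embedded with the root at the origin and children near the boundary, the direct hyperbolic distance between two siblings nearly equals the path through their parent, i.e. the metric behaves like a tree metric:

```python
import math

def poincare_distance(u, v):
    """Geodesic distance in the Poincare ball model:
    d(u, v) = arcosh(1 + 2*||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    diff2 = sum((a - b) ** 2 for a, b in zip(u, v))
    nu2 = sum(a * a for a in u)
    nv2 = sum(b * b for b in v)
    return math.acosh(1 + 2 * diff2 / ((1 - nu2) * (1 - nv2)))

# A toy 3-node tree: root at the origin, two children near the boundary.
root = (0.0, 0.0)
child_a = (0.9, 0.0)
child_b = (-0.9, 0.0)

direct = poincare_distance(child_a, child_b)
via_root = poincare_distance(child_a, root) + poincare_distance(root, child_b)

# Near the boundary, the detour through the root adds almost nothing:
# the embedding reproduces the tree distance with low distortion.
print(f"direct:   {direct:.4f}")
print(f"via root: {via_root:.4f}")
```

The point is only that the geometry *permits* such low-distortion embeddings; whether a trained HNN actually reaches one is precisely what the paper calls into question.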
Feb-4-2024