Position: Don't Be Afraid of Over-Smoothing and Over-Squashing
Kormann, Niklas, Doerr, Benjamin, Lutzeyer, Johannes F.
Over-smoothing and over-squashing have been extensively studied in the Graph Neural Network (GNN) literature in recent years. We challenge this prevailing focus in GNN research, arguing that these phenomena are less critical for practical applications than assumed. We suggest that performance decreases often stem from uninformative receptive fields rather than from over-smoothing. We support this position with extensive experiments on several standard benchmark datasets, demonstrating that accuracy and over-smoothing are mostly uncorrelated and that optimal model depths remain small even with mitigation techniques, thus highlighting the negligible role of over-smoothing. Similarly, we challenge the view that over-squashing is always detrimental in practical applications. Instead, we posit that the distribution of relevant information over the graph frequently factorises and is often localised within a small k-hop neighbourhood, questioning the necessity of jointly observing entire receptive fields or engaging in an extensive search for long-range interactions. The results of our experiments show that architectural interventions designed to mitigate over-squashing fail to yield significant performance gains. This position paper advocates for a paradigm shift in theoretical research, urging a diligent analysis of learning tasks and datasets using statistics that measure the underlying distribution of label-relevant information, to better understand its localisation and factorisation.
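The abstract's claim that accuracy and over-smoothing are largely uncorrelated presupposes a way to measure smoothing; a common proxy in this literature is the Dirichlet energy of node embeddings, which shrinks toward zero as features become constant across the graph. A minimal sketch (not the paper's code; the path graph, binary features, and mean-aggregation step are all illustrative):

```python
import numpy as np

def dirichlet_energy(X, edges):
    """Dirichlet energy of node features X over an edge list: the sum of
    squared feature differences across edges. Values near zero indicate
    heavily smoothed (near-constant) embeddings."""
    return sum(float(np.sum((X[u] - X[v]) ** 2)) for u, v in edges)

def propagate(X, edges, n):
    """One round of mean-neighbour aggregation (a linear GCN-style step)."""
    out = np.zeros_like(X)
    deg = np.zeros(n)
    for u, v in edges:
        out[u] += X[v]; out[v] += X[u]
        deg[u] += 1; deg[v] += 1
    return out / np.maximum(deg, 1)[:, None]

# Toy example: a 4-node path graph with one scalar feature per node.
edges = [(0, 1), (1, 2), (2, 3)]
X = np.array([[1.0], [0.0], [0.0], [1.0]])
energies = []
for _ in range(5):
    energies.append(dirichlet_energy(X, edges))
    X = propagate(X, edges, 4)
```

Tracking this quantity per layer, alongside test accuracy, is one concrete way to check whether the two are actually correlated on a given dataset.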
1cb524b5a3f3f82be4a7d954063c07e2-AuthorFeedback.pdf
We thank all reviewers for their careful reading and useful comments. In particular, we notice that Assumption 3.1 and Assumption 3.3 drew the most attention. SGD converges (and recovers the true noise) in correlated settings; see Section 4.3.1 in Rasmussen, C. E. (2003), "Gaussian processes in machine learning", which is commonly … We will include these results in the camera-ready version if accepted. First, we explain the role of "correlation": here we mean the correlation of … In order to have a lower bound on the approximate curvature (see Lemma 4.2) that is independent of … The proof of Lemma 4.1 is also non-trivial, since the error bounds hold uniformly for all … We apply Taylor's expansion and a novel truncation technique to avoid the difficulty caused …
Probabilistic and nonlinear compressive sensing
Barth, Lukas Silvester, von Petersenn, Paulo
We present a smooth probabilistic reformulation of $\ell_0$ regularized regression that does not require Monte Carlo sampling and allows for the computation of exact gradients, facilitating rapid convergence to local optima of the best subset selection problem. The method drastically improves convergence speed compared to similar Monte Carlo based approaches. Furthermore, we empirically demonstrate that it outperforms compressive sensing algorithms such as IHT and (Relaxed-) Lasso across a wide range of settings and signal-to-noise ratios. The implementation runs efficiently on both CPUs and GPUs and is freely available at https://github.com/L0-and-behold/probabilistic-nonlinear-cs. We also contribute to research on nonlinear generalizations of compressive sensing by investigating when parameter recovery of a nonlinear teacher network is possible through compression of a student network. Building upon theorems of Fefferman and Markel, we show theoretically that the global optimum in the infinite-data limit enforces recovery up to certain symmetries. For empirical validation, we implement a normal-form algorithm that selects a canonical representative within each symmetry class. However, while compression can help to improve test loss, we find that exact parameter recovery is not even possible up to symmetries. In particular, we observe a surprising rebound effect where teacher and student configurations initially converge but subsequently diverge despite continuous decrease in test loss. These findings indicate fundamental differences between linear and nonlinear compressive sensing.
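IHT, one of the baselines named in the abstract, is simple enough to sketch. Below is a textbook NumPy version of iterative hard thresholding (not the authors' implementation; the sensing matrix, sparsity level, and step-size rule are illustrative choices):

```python
import numpy as np

def iht(A, y, k, step=None, iters=300):
    """Iterative Hard Thresholding: a gradient step on ||Ax - y||^2,
    then keep only the k largest-magnitude entries of x."""
    n = A.shape[1]
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2  # conservative step from the spectral norm
    x = np.zeros(n)
    for _ in range(iters):
        x = x + step * (A.T @ (y - A @ x))       # gradient step
        keep = np.argsort(np.abs(x))[-k:]        # indices of the k largest entries
        mask = np.zeros(n, dtype=bool)
        mask[keep] = True
        x[~mask] = 0.0                           # hard thresholding
    return x

# Noiseless recovery of a 3-sparse signal from 40 random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[3, 17, 60]] = [1.0, -2.0, 1.5]
y = A @ x_true
x_hat = iht(A, y, k=3)
```

In this well-conditioned Gaussian setting IHT recovers the support exactly; the abstract's comparisons concern harder regimes (varied signal-to-noise ratios) where such baselines degrade.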
Error-Guided Pose Augmentation: Enhancing Rehabilitation Exercise Assessment through Targeted Data Generation
Effective rehabilitation assessment is essential for monitoring patient progress, particularly in home-based settings. Existing systems often face challenges such as data imbalance and difficulty detecting subtle movement errors. This paper introduces Error-Guided Pose Augmentation (EGPA), a method that generates synthetic skeleton data by simulating clinically relevant movement mistakes. Unlike standard augmentation techniques, EGPA targets biomechanical errors observed in rehabilitation. Combined with an attention-based graph convolutional network, EGPA improves performance across multiple evaluation metrics. Experiments demonstrate reductions in mean absolute error of up to 27.6 percent and gains in error classification accuracy of 45.8 percent. Attention visualizations show that the model learns to focus on clinically significant joints and movement phases, enhancing both accuracy and interpretability. EGPA offers a promising approach for improving automated movement quality assessment in both clinical and home-based rehabilitation contexts.
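The core idea of generating synthetic skeletons with clinically motivated errors can be illustrated with a toy transform. This is a hypothetical sketch, not the paper's error models: the joint index, shrink factor, and 17-joint layout are stand-ins, and a real implementation would encode specific biomechanical mistakes (e.g., reduced knee flexion during a squat).

```python
import numpy as np

def simulate_error(skeleton_seq, joint_idx, scale=0.7, rng=None):
    """Toy error-guided augmentation: shrink one joint's trajectory toward
    its mean position to mimic a reduced range of motion, then add mild
    sensor-like jitter. skeleton_seq has shape (frames, joints, 3)."""
    if rng is None:
        rng = np.random.default_rng(0)
    aug = skeleton_seq.copy()
    mean_pos = aug[:, joint_idx].mean(axis=0)    # joint's average 3D position
    aug[:, joint_idx] = mean_pos + scale * (aug[:, joint_idx] - mean_pos)
    aug += rng.normal(0.0, 0.002, aug.shape)     # small measurement noise
    return aug

# 50 frames, 17 joints, 3D coordinates; joint 13 plays the role of a knee.
seq = np.random.default_rng(1).normal(size=(50, 17, 3))
aug = simulate_error(seq, joint_idx=13)
```

Pairs like (seq, aug) with the corresponding error label would then be added to the training set for the assessment model.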
Review for NeurIPS paper: Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes
Additional Feedback: I'd like to see main paper Figure 1 / supplementary Figure 4.1 expanded. The two questions I have that I don't think the figure currently answers are: (1) how does the variance in the final \sigma^{2}_{f} across trials compare to a full-batch GP, and (2) if full-batch GPs have smaller variance, do much larger batch sizes (e.g., m = 1000) decrease this variance further? In Figure 4.1, the variance does not seem to decrease much from m = 16 to m = 64 -- it would be nice to know whether the batch size is the source of the variance. If it is, then running with very large batch sizes, even up to m = 10000, may not be too challenging. On the point of running large batch sizes: while the ability to use SGD will clearly outperform full-batch training at some size N (at a guess, probably somewhere in the N = 100k-500k range), I don't think the results in Table 1 are necessarily representative of the settings in which you might actually want to run sgGP or EGP.
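The diagnostic the review asks for, comparing the across-trial spread of the final estimate at several batch sizes, can be sketched on a toy problem. This uses minibatch SGD for a simple mean rather than sgGP itself, and every parameter (learning rate, step count, batch sizes) is illustrative:

```python
import numpy as np

def sgd_mean(data, batch, lr=0.05, steps=500, rng=None):
    """Minibatch SGD on the squared loss for the data mean; a stand-in
    for hyperparameter estimation whose stationary variance shrinks as
    the batch size m grows."""
    if rng is None:
        rng = np.random.default_rng()
    theta = 0.0
    for _ in range(steps):
        x = rng.choice(data, size=batch)         # sample a minibatch
        theta -= lr * 2 * (theta - x.mean())     # gradient step
    return theta

rng = np.random.default_rng(0)
data = rng.normal(1.0, 1.0, size=10_000)

# Across-trial standard deviation of the final estimate per batch size.
spread = {m: np.std([sgd_mean(data, m, rng=np.random.default_rng(t))
                     for t in range(20)])
          for m in (16, 64, 1000)}
```

If the spread keeps shrinking at m = 1000 (as it does here for the constant-step-size quadratic, where stationary variance scales roughly as lr/m), the batch size is plausibly the dominant source of the variance the review points to.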