HyNet: Learning Local Descriptor with Hybrid Similarity Measure and Triplet Loss

Neural Information Processing Systems

In this paper, we investigate how L2 normalisation affects the back-propagated descriptor gradients during training. Based on our observations, we propose HyNet, a new local descriptor that leads to state-of-the-art results in matching. HyNet introduces a hybrid similarity measure for triplet margin loss, a regularisation term constraining the descriptor norm, and a new network architecture that performs L2 normalisation of all intermediate feature maps and the output descriptors. HyNet surpasses previous methods by a significant margin on standard benchmarks that include patch matching, verification, and retrieval, as well as outperforming full end-to-end methods on 3D reconstruction tasks.
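The ingredients named in the abstract can be sketched in a few lines of numpy: L2 normalisation projecting descriptors onto the unit hypersphere, a hybrid similarity, and a triplet margin loss built on it. This is an illustrative sketch, not HyNet's implementation; in particular, the convex mix of cosine similarity and negative L2 distance in `hybrid_similarity` and the weight `alpha` are assumptions, as is the descriptor dimension of 128.

```python
import numpy as np

def l2_normalise(x, eps=1e-8):
    # Project each descriptor onto the unit hypersphere.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def hybrid_similarity(a, b, alpha=0.5):
    # Toy convex mix of cosine similarity and negative L2 distance;
    # HyNet's exact weighting scheme may differ from this sketch.
    cos = np.sum(a * b, axis=-1)
    dist = np.linalg.norm(a - b, axis=-1)
    return alpha * cos - (1.0 - alpha) * dist

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    # Encourage s(anchor, positive) to exceed s(anchor, negative) by a margin.
    s_pos = hybrid_similarity(anchor, positive)
    s_neg = hybrid_similarity(anchor, negative)
    return np.maximum(0.0, margin + s_neg - s_pos).mean()

rng = np.random.default_rng(0)
a = l2_normalise(rng.standard_normal((4, 128)))          # anchors
p = l2_normalise(a + 0.05 * rng.standard_normal((4, 128)))  # positives
n = l2_normalise(rng.standard_normal((4, 128)))          # negatives
loss = triplet_margin_loss(a, p, n)
```

On unit vectors the two similarity terms are bounded, which is one motivation for normalising before computing the loss.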


Physics-Informed Neural ODEs with Scale-Aware Residuals for Learning Stiff Biophysical Dynamics

Kainth, Kamalpreet Singh, Joshi, Prathamesh Dinesh, Dandekar, Raj Abhijit, Dandekar, Rajat, Panat, Sreedat

arXiv.org Artificial Intelligence

Neural differential equations offer a powerful framework for modeling continuous-time dynamics, but forecasting stiff biophysical systems remains unreliable. Standard Neural ODEs and physics-informed variants often require orders of magnitude more iterations, and even then may converge to suboptimal solutions that fail to preserve oscillatory frequency or amplitude. We introduce Physics-Informed Neural ODEs with Scale-Aware Residuals (PI-NODE-SR), a framework that combines a low-order explicit solver (Heun's method) with residual normalisation to balance contributions between state variables evolving on disparate timescales. This combination stabilises training under realistic iteration budgets and avoids reliance on computationally expensive implicit solvers. On the Hodgkin-Huxley equations, PI-NODE-SR learns from a single oscillation simulated with a stiff solver (Rodas5P) and extrapolates beyond 100 ms, capturing both the oscillation frequency and near-correct amplitudes. Remarkably, end-to-end learning of the vector field enables PI-NODE-SR to recover morphological features, such as sharp subthreshold curvature in gating variables, that are typically reserved for higher-order solvers, suggesting that neural correction can offset numerical diffusion. While performance remains sensitive to initialisation, PI-NODE-SR consistently reduces long-horizon errors relative to baseline Neural ODEs and PINNs, offering a principled route to stable and efficient learning of stiff biological dynamics.
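The two numerical ingredients, a Heun step and a scale-normalised residual, can be sketched as below. This is a toy illustration on a linear ODE, not the paper's PI-NODE-SR pipeline; the choice of characteristic scales and the squared-error weighting are assumptions.

```python
import numpy as np

def heun_step(f, t, y, h):
    # One step of Heun's method (explicit trapezoidal rule, order 2).
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    return y + 0.5 * h * (k1 + k2)

def scale_aware_residual(pred, target, scales, eps=1e-8):
    # Divide each state's residual by a characteristic scale so that a
    # large-amplitude variable (e.g. membrane voltage in Hodgkin-Huxley)
    # does not drown out small gating variables; weighting is illustrative.
    return float(np.mean(((pred - target) / (scales + eps)) ** 2))

# Toy example: dy/dt = -y with two states on disparate scales.
f = lambda t, y: -y
y0 = np.array([1.0, 100.0])
y1 = heun_step(f, 0.0, y0, 0.1)                 # one explicit step, h = 0.1
loss = scale_aware_residual(y1, y0 * np.exp(-0.1), np.array([1.0, 100.0]))
```

Without the division by `scales`, the second state's raw residual would dominate the loss by four orders of magnitude even when both states have the same relative error.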




Paper 1932

Neural Information Processing Systems

We thank the reviewers for their work and feedback. We first address points A-D, related to the main contributions raised by R1, R2, and R4, and then the specific comments. We evaluate on HPatches and find an increase of 0.57 and 0.21 for illumination and viewpoint, respectively. New results will be added to Table 3 to further expose the improvements, namely HardNet+FRN: 51.89 (+1.33). We will clarify in Sec.


Limitations of Normalization in Attention Mechanism

Mudarisov, Timur, Burtsev, Mikhail, Petrova, Tatiana, State, Radu

arXiv.org Artificial Intelligence

This paper investigates the limitations of normalization in attention mechanisms. We begin with a theoretical framework that enables the identification of the model's selective ability and the geometric separation involved in token selection. Our analysis includes explicit bounds on distances and separation criteria for token vectors under softmax scaling. Through experiments with a pre-trained GPT-2 model, we empirically validate our theoretical results and analyze key behaviors of the attention mechanism. Notably, we demonstrate that as the number of selected tokens increases, the model's ability to distinguish informative tokens declines, often converging toward a uniform selection pattern. We also show that gradient sensitivity under softmax normalization presents challenges during training, especially at low temperature settings. These findings advance the current understanding of softmax-based attention mechanisms and motivate the need for more robust normalization and selection strategies in future attention architectures.
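The dilution effect described above is easy to reproduce numerically: give one informative token a fixed score margin over n distractors and watch its softmax mass decay as n grows. The numbers below are toy values chosen for illustration, not results from the paper.

```python
import numpy as np

def softmax(scores, temperature=1.0):
    z = scores / temperature
    z = z - z.max()                  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# One informative token with a fixed margin of 2 over n - 1 distractors:
# its softmax mass is e^2 / (e^2 + n - 1), which decays toward the
# uniform value 1/n as n grows.
for n in (4, 64, 1024):
    scores = np.zeros(n)
    scores[0] = 2.0
    p = softmax(scores)

# Lowering the temperature sharpens selection, at the cost of the
# gradient concentration the abstract warns about.
p_sharp = softmax(np.array([2.0, 0.0]), temperature=0.1)
```

At n = 1024 the informative token retains under 1% of the probability mass despite its score advantage, matching the "convergence toward uniform selection" behavior discussed above.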


FRACCO: A gold-standard annotated corpus of oncological entities with ICD-O-3.1 normalisation

Pignat, Johann, Vucetic, Milena, Gaudet-Blavignac, Christophe, Zaghir, Jamil, Stettler, Amandine, Amrein, Fanny, Bonjour, Jonatan, Goldman, Jean-Philippe, Michielin, Olivier, Lovis, Christian, Bjelogrlic, Mina

arXiv.org Artificial Intelligence

Developing natural language processing tools for clinical text requires annotated datasets, yet French oncology resources remain scarce. We present FRACCO (FRench Annotated Corpus for Clinical Oncology), an expert-annotated corpus of 1301 synthetic French clinical cases, initially translated from the Spanish CANTEMIST corpus as part of the FRASIMED initiative. Each document is annotated with terms related to morphology, topography, and histologic differentiation, using the International Classification of Diseases for Oncology (ICD-O) as reference. An additional annotation layer captures composite expression-level normalisations that combine multiple ICD-O elements into unified clinical concepts. Annotation quality was ensured through expert review: 1301 texts were manually annotated for entity spans by two domain experts, and a total of 71127 ICD-O normalisations were produced through a combination of automated matching and manual validation by a team of five annotators. The final dataset represents 399 unique morphology codes (from 2549 different expressions), 272 topography codes (from 3143 different expressions), and 2043 unique composite expressions (from 11144 different expressions). This dataset provides a reference standard for named entity recognition and concept normalisation in French oncology texts.
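Concept normalisation of the kind this corpus supports, mapping a surface expression to an ICD-O code, can be sketched as a lexicon lookup. The entries below are illustrative standard ICD-O examples, not taken from the FRACCO release, and a real system would of course go beyond exact matching.

```python
# Illustrative lexicon entries (8140/3 is ICD-O morphology for
# adenocarcinoma NOS; C50.9 is topography for breast NOS).
MORPHOLOGY = {"adénocarcinome": "8140/3"}
TOPOGRAPHY = {"sein": "C50.9"}

def normalise(expression):
    # Exact-match lookup over the two ICD-O axes; the corpus pairs each
    # annotated span with its code, and a composite-expression layer
    # combines several ICD-O elements into one clinical concept.
    expression = expression.lower().strip()
    if expression in MORPHOLOGY:
        return ("morphology", MORPHOLOGY[expression])
    if expression in TOPOGRAPHY:
        return ("topography", TOPOGRAPHY[expression])
    return (None, None)
```

The gap between 399 unique morphology codes and 2549 distinct expressions in the dataset is precisely why such normalisation is non-trivial: many surface forms collapse onto one code.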





Guided Uncertainty Learning Using a Post-Hoc Evidential Meta-Model

Barker, Charmaine, Bethell, Daniel, Gerasimou, Simos

arXiv.org Artificial Intelligence

Reliable uncertainty quantification remains a major obstacle to the deployment of deep learning models under distributional shift. Existing post-hoc approaches that retrofit pretrained models either inherit misplaced confidence or merely reshape predictions, without teaching the model when to be uncertain. We introduce GUIDE, a lightweight evidential learning meta-model approach that attaches to a frozen deep learning model and explicitly learns how and when to be uncertain. GUIDE identifies salient internal features via a calibration stage, and then employs these features to construct a noise-driven curriculum that teaches the model how and when to express uncertainty. GUIDE requires no retraining, no architectural modifications, and no manual intermediate-layer selection for the base deep learning model, thus ensuring broad applicability and minimal user intervention. The resulting model avoids distilling overconfidence from the base model and improves out-of-distribution detection by ~77% and adversarial attack detection by ~80%, while preserving in-distribution performance. Across diverse benchmarks, GUIDE consistently outperforms state-of-the-art approaches, evidencing the need for actively guiding uncertainty to close the gap between predictive confidence and reliability.
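A generic post-hoc evidential head of the kind GUIDE builds on can be sketched as follows: a small map from frozen-base features to non-negative class evidence, with Dirichlet vacuity as the uncertainty signal. This is a standard evidential-deep-learning sketch, not GUIDE's actual architecture; the weight matrix `W`, the feature dimension, and the class count are all hypothetical.

```python
import numpy as np

def evidential_head(feats, W):
    # Map frozen-base features to non-negative per-class evidence;
    # alpha = evidence + 1 parameterises a Dirichlet over class probabilities.
    evidence = np.maximum(feats @ W, 0.0)     # ReLU keeps evidence >= 0
    return evidence + 1.0

def dirichlet_uncertainty(alpha):
    # Vacuity K / sum(alpha): exactly 1.0 when there is no evidence,
    # and it shrinks toward 0 as class evidence accumulates.
    K = alpha.shape[-1]
    return K / alpha.sum(axis=-1)

# No evidence (e.g. a far out-of-distribution input) -> maximal uncertainty.
alpha_ood = evidential_head(np.zeros((1, 8)), np.zeros((8, 3)))
u_ood = dirichlet_uncertainty(alpha_ood)      # 1.0

# Strong evidence for one class -> low uncertainty.
W = np.zeros((8, 3)); W[:, 0] = 2.0
alpha_id = evidential_head(np.ones((1, 8)), W)
u_id = dirichlet_uncertainty(alpha_id)
```

Because the base model stays frozen, only the head's parameters are trained, which is what makes such approaches attractive as retrofits; GUIDE's contribution is in how that head is taught, via calibration-selected features and a noise-driven curriculum, rather than in the Dirichlet parameterisation itself.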