Goto

Collaborating Authors

 loss term


Embedding Space Interpolation Beyond Mini-Batch, Beyond Pairs and Beyond Examples

Neural Information Processing Systems

Mixup refers to interpolation-based data augmentation, originally motivated as a way to go beyond empirical risk minimization (ERM). Its extensions mostly focus on the definition of interpolation and the space (input or embedding) where it takes place, while the augmentation process itself is less studied. In most methods, the number of generated examples is limited to the mini-batch size and the number of examples being interpolated is limited to two (pairs), in the input space. We make progress in this direction by introducing MultiMix, which generates an arbitrarily large number of interpolated examples beyond the mini-batch size, and interpolates the entire mini-batch in the embedding space.



OptimizingRelevanceMapsofVision TransformersImprovesRobustness

Neural Information Processing Systems

It has been observed that visual classification models often rely mostly on spurious cues such as the image background, which hurts their robustness to distribution changes. To alleviate this shortcoming, we propose to monitor the model's relevancy signal and direct the model to base its prediction on the foregroundobject.






A Unified Hard-Constraint Framework for Solving Geometrically Complex PDEs

Neural Information Processing Systems

We present a unified hard-constraint framework for solving geometrically complex PDEs with neural networks, where the most commonly used Dirichlet, Neumann, and Robin boundary conditions (BCs) are considered. Specifically, we first introduce the extra fields'' from the mixed finite element method to reformulate the PDEs so as to equivalently transform the three types of BCs into linear forms. Based on the reformulation, we derive the general solutions of the BCs analytically, which are employed to construct an ansatz that automatically satisfies the BCs. With such a framework, we can train the neural networks without adding extra loss terms and thus efficiently handle geometrically complex PDEs, alleviating the unbalanced competition between the loss terms corresponding to the BCs and PDEs. We theoretically demonstrate that the extra fields'' can stabilize the training process. Experimental results on real-world geometrically complex PDEs showcase the effectiveness of our method compared with state-of-the-art baselines.


Inferring Cosmological Parameters with Evidential Physics-Informed Neural Networks

arXiv.org Artificial Intelligence

We examine the use of a novel variant of Physics-Informed Neural Networks to predict cosmological parameters from recent supernovae and baryon acoustic oscillations (BAO) datasets. Our machine learning framework generates uncertainty estimates for target variables and the inferred unknown parameters of the underlying PDE descriptions. Built upon a hybrid of the principles of Evidential Deep Learning, Physics-Informed Neural Networks, Bayesian Neural Networks and Gaussian Processes, our model enables learning of the posterior distribution of the unknown PDE parameters through standard gradient-descent based training. We apply our model to an up-to-date BAO dataset (Bousis et al. 2024) calibrated with the CMB-inferred sound horizon, and the Pantheon$+$ Sne Ia distances (Scolnic et al. 2018), examining the relative effectiveness and mutual consistency among the standard $Λ$CDM, $w$CDM and $Λ_s$CDM models. Unlike previous results arising from the standard approach of minimizing an appropriate $χ^2$ function, the posterior distributions for parameters in various models trained purely on Pantheon$+$ data were found to be largely contained within the $2σ$ contours of their counterparts trained on BAO data. Their posterior medians for $h_0$ were within about $2σ$ of one another, indicating that our machine learning-guided approach provides a different measure of the Hubble tension.


Evaluating Low-Light Image Enhancement Across Multiple Intensity Levels

arXiv.org Artificial Intelligence

Imaging in low-light environments is challenging due to reduced scene radiance, which leads to elevated sensor noise and reduced color saturation. Most learning-based low-light enhancement methods rely on paired training data captured under a single low-light condition and a well-lit reference. The lack of radiance diversity limits our understanding of how enhancement techniques perform across varying illumination intensities. W e introduce the Multi-Illumination Low-Light (MILL) dataset, containing images captured at diverse light intensities under controlled conditions with fixed camera settings and precise illuminance measurements. MILL enables comprehensive evaluation of enhancement algorithms across variable lighting conditions. W e benchmark several state-of-the-art methods and reveal significant performance variations across intensity levels. Leveraging the unique multi-illumination structure of our dataset, we propose improvements that enhance robustness across diverse illumination scenarios. Our modifications achieve up to 10 dB PSNR improvement for DSLR and 2 dB for the smartphone on Full HD images.