A Training and

Neural Information Processing Systems

All models were trained on single GPUs, except for SchNet when trained on OC20-2M, which required 3 GPUs. Tables 9-12 present the extended results on OC20 across the 4 separate S2EF validation sets. Table 9: Evaluation results on the OC20 S2EF in-distribution validation set. In Table 13, we present the performance and inference throughput of the baseline models on COLL. Table 13: Evaluation of the performance of the four baseline models on the COLL dataset (columns: Model, inference throughput in samples per GPU-second, and Energy MAE, Force MAE, Force cos, and EFwT on the COLL test set).
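The excerpt above reports force-direction and threshold metrics alongside the MAEs. A minimal sketch of two such metrics, force cosine similarity and EFwT (energy and forces within threshold), is given below; the function names, array shapes, and the 0.02 eV / 0.03 eV/A tolerances are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def force_cosine(f_pred, f_true):
    """Mean cosine similarity between predicted and reference force vectors.

    f_pred, f_true: arrays of shape (n_atoms, 3).
    """
    num = np.sum(f_pred * f_true, axis=1)
    den = np.linalg.norm(f_pred, axis=1) * np.linalg.norm(f_true, axis=1) + 1e-12
    return float(np.mean(num / den))

def efwt(e_pred, e_true, f_pred, f_true, e_tol=0.02, f_tol=0.03):
    """Fraction of structures whose energy AND per-atom forces are both
    within the given tolerances (illustrative thresholds).

    e_*: shape (n_structures,); f_*: shape (n_structures, n_atoms, 3).
    """
    e_ok = np.abs(e_pred - e_true) < e_tol
    f_ok = np.max(np.abs(f_pred - f_true), axis=(1, 2)) < f_tol
    return float(np.mean(e_ok & f_ok))
```

A perfect prediction yields a force cosine of 1.0 and an EFwT of 1.0; EFwT is the strictest of the four metrics because a single out-of-tolerance atom fails the whole structure.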


An Empirical Survey of Model Merging Algorithms for Social Bias Mitigation

Shirafuji, Daiki, Saito, Tatsuhiko, Kimura, Yasutomo

arXiv.org Artificial Intelligence

Large language models (LLMs) are known to inherit and even amplify societal biases present in their pre-training corpora, threatening fairness and social trust. To address this issue, recent work has explored ``editing'' LLM parameters to mitigate social bias with model merging approaches; however, these approaches have not been empirically compared. In this work, we empirically survey seven algorithms: Linear, Karcher Mean, SLERP, NuSLERP, TIES, DELLA, and Nearswap, applied to 13 open-weight models in the GPT, LLaMA, and Qwen families. We perform a comprehensive evaluation using three bias datasets (BBQ, BOLD, and HONEST) and measure the impact of these techniques on LLM performance in downstream tasks of the SuperGLUE benchmark. We find a trade-off between bias reduction and downstream performance: methods achieving greater bias mitigation degrade accuracy, particularly on tasks requiring reading comprehension and commonsense and causal reasoning. Among the merging algorithms, Linear, SLERP, and Nearswap consistently reduce bias while maintaining overall performance, with SLERP at moderate interpolation weights emerging as the most balanced choice. These results highlight the potential of model merging algorithms for bias mitigation, while indicating that excessive debiasing or inappropriate merging methods may lead to the degradation of important linguistic abilities.
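SLERP, which the abstract identifies as the most balanced choice, interpolates along the great circle between two weight vectors rather than along the straight line used by Linear merging. A minimal sketch of SLERP applied to a pair of weight tensors follows; this is the standard textbook formula, not the paper's exact implementation, and real merges apply it tensor-by-tensor across two checkpoints.

```python
import numpy as np

def slerp(w0, w1, t, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t in [0, 1] is the interpolation weight (t=0 returns w0, t=1 returns w1).
    Falls back to linear interpolation when the vectors are nearly collinear,
    where the spherical formula becomes numerically unstable.
    """
    v0, v1 = w0.ravel(), w1.ravel()
    cos = np.dot(v0, v1) / (np.linalg.norm(v0) * np.linalg.norm(v1) + eps)
    theta = np.arccos(np.clip(cos, -1.0, 1.0))  # angle between the tensors
    if theta < 1e-4:
        return (1 - t) * w0 + t * w1            # degenerate case: LERP
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * w0 + (np.sin(t * theta) / s) * w1
```

Unlike Linear merging, SLERP preserves the angular geometry between checkpoints, which is one intuition for why moderate interpolation weights can trade off debiasing against downstream accuracy more gracefully.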




Generalized Inequality-based Approach for Probabilistic WCET Estimation

Toba, Hayate, Yano, Atsushi, Azumi, Takuya

arXiv.org Machine Learning

Estimating the probabilistic Worst-Case Execution Time (pWCET) is essential for ensuring the timing correctness of real-time applications, such as in robot IoT systems and autonomous driving systems. While methods based on Extreme Value Theory (EVT) can provide tight bounds, they suffer from model uncertainty due to the need to decide where the upper tail of the distribution begins. Conversely, inequality-based approaches avoid this issue but can yield pessimistic results for heavy-tailed distributions. This paper proposes a method to reduce such pessimism by incorporating saturating functions (arctangent and hyperbolic tangent) into Chebyshev's inequality, which mitigates the influence of large outliers while preserving mathematical soundness. Evaluations on synthetic and real-world data from the Autoware autonomous driving stack demonstrate that the proposed method achieves safe and tighter bounds for such distributions.
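The core idea in the abstract, bounding a tail probability via Chebyshev's inequality after passing the samples through a saturating function, can be sketched as follows. Because arctan is strictly increasing, P(X >= t) = P(arctan(X/s) >= arctan(t/s)), so a Chebyshev bound on the transformed samples is still a valid bound on the original tail while large outliers contribute far less to the variance. The function names and the `scale` parameter are illustrative assumptions; see the paper for the exact construction.

```python
import math

def chebyshev_tail_bound(samples, threshold):
    """Chebyshev bound on P(X >= threshold) from empirical moments:
    P(|X - mu| >= k) <= var / k^2, applied to the upper tail."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    if threshold <= mu:
        return 1.0  # the bound is vacuous at or below the mean
    return min(1.0, var / (threshold - mu) ** 2)

def saturated_bound(samples, threshold, scale):
    """Bound the same tail after a saturating arctan transform, which damps
    the influence of heavy-tail outliers on the empirical variance."""
    g = lambda x: math.atan(x / scale)  # strictly increasing, so the tail
    return chebyshev_tail_bound([g(x) for x in samples], g(threshold))
```

For heavy-tailed execution-time samples, the transformed variance is much smaller relative to the transformed margin, so the saturated bound is typically tighter than plain Chebyshev while remaining mathematically sound.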


Appendix A Model and training procedure: details

Neural Information Processing Systems

All experiments used the same model and training procedure, unless stated otherwise. The model was a ResNet with two blocks per group and (16, 32, 32, 64) channels per group, and it was not pre-trained. The integer labels were embedded using a standard embedding layer. In all figures, (shaded) error bars indicate the standard deviation around the mean. As a future extension, the model could be adapted to handle novel labels as well.
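The "standard embedding layer" mentioned above is simply a learnable lookup table from integer labels to dense vectors. A minimal numpy sketch is below; the function name, dimensions, and random initialization are illustrative, and in practice the table would be a trainable parameter of the network.

```python
import numpy as np

def make_embedding(num_labels, dim, rng=None):
    """Standard embedding layer: a (num_labels, dim) lookup table mapping
    each integer label to a dense vector (random init for illustration)."""
    if rng is None:
        rng = np.random.default_rng(0)
    table = rng.normal(0.0, 1.0, size=(num_labels, dim))
    def embed(labels):
        return table[np.asarray(labels)]  # row lookup per label
    return embed, table

embed, table = make_embedding(num_labels=10, dim=64)
vecs = embed([3, 7, 3])  # shape (3, 64); repeated labels share one vector
```

Because the layer is just indexed rows of a matrix, extending it to novel labels would amount to appending (and training) new rows, which is one way the future extension mentioned above could be realized.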


Adversarially Robust 3D Point Cloud Recognition Using Self-Supervisions: Supplementary Materials

Sun, Jiachen

Neural Information Processing Systems

Figure A. DGCNN leverages EdgeConv as its basic operation to extract features. Please refer to our codebase for detailed parameters such as batch normalization and activation functions. A.2 Self-Supervised Learning Task. We follow exactly the same setting as Poursaeed et al. [8] and Sauder et al. [9] for 3D rotation and [...]. We illustrate the FoldingNet architecture in Figure C. B.1 Attack Method. In this section, we introduce the detailed formulations of the attack methods used in our study.
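The supplementary section above enumerates attack formulations without reproducing them here. As an illustrative example only, a generic L-infinity PGD step on point-cloud coordinates, one common attack in this literature, can be sketched as follows; the function name, hyperparameters, and the assumption that `grad_fn` returns the attack-loss gradient are all ours, not the paper's.

```python
import numpy as np

def pgd_linf(points, grad_fn, eps=0.05, alpha=0.01, steps=10):
    """Generic L-infinity PGD on point coordinates (illustrative sketch).

    points:  (n_points, 3) array of xyz coordinates.
    grad_fn: callable returning d(loss)/d(points) at the current iterate.
    """
    x0 = points.copy()
    x = x0.copy()
    for _ in range(steps):
        x = x + alpha * np.sign(grad_fn(x))   # ascend the attack loss
        x = np.clip(x, x0 - eps, x0 + eps)    # project into the eps-ball
    return x
```

Each iterate stays within `eps` of the clean point cloud, so the perturbation budget bounds the per-coordinate displacement regardless of the number of steps.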