A Training and

Neural Information Processing Systems

All models were trained on single GPUs, except for SchNet when trained on OC20-2M, which required 3 GPUs. Tables 9-12 present the extended results on OC20 across the 4 separate S2EF validation sets. Table 9: Evaluation results on the OC20 S2EF in-distribution validation set. In Table 13, we present the performance and inference throughput of the baseline models on COLL. Table 13: Evaluation of the four baseline models on the COLL test set, reporting inference throughput (samples / GPU sec.), energy MAE, force MAE, force cosine, and EFwT per model.



Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

Neural Information Processing Systems

It is known that policy evaluation has the interpretation of solving a generalized Bellman equation. In this paper, we derive finite-sample bounds for any general off-policy TD-like stochastic approximation algorithm that solves for the fixed-point of this generalized Bellman operator.
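As a brief illustrative sketch (standard TD(λ)/Retrace-style notation is assumed here, not taken from the paper itself), a generalized Bellman operator with trace coefficients $c_i$ can be written as:

```latex
% Sketch of a trace-based generalized Bellman operator (standard notation,
% assumed for illustration; not the paper's exact definitions).
(\mathcal{T} V)(s) \;=\; V(s) \;+\;
\mathbb{E}\!\left[\,\sum_{t \ge 0} \gamma^{t}
\Big(\prod_{i=1}^{t} c_i\Big)\,\delta_t \;\Big|\; s_0 = s\right],
\qquad
\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t).
```

Choosing, e.g., $c_i = \lambda \rho_i$ with importance-sampling ratios $\rho_i$ recovers an off-policy TD(λ)-style operator; the fixed point $V = \mathcal{T} V$ is the generalized Bellman equation that such TD-like algorithms solve for.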






Table R1

Neural Information Processing Systems

ISO can perform model adaptation with a batch of instances, if available. A comparison based on a faster version of ISO will be added in the revision. During inference, the model is updated using Eqn. We will add more details in the revision. ISO makes only a marginal improvement over Joint (from 42.6 to 41.8 in MPJPE) since the training and testing distributions are similar. However, even though there is no significant distribution shift, ISO still has a positive effect.


Synthetic Eggs in Many Baskets: The Impact of Synthetic Data Diversity on LLM Fine-Tuning

Schaffelder, Max, Gatt, Albert

arXiv.org Artificial Intelligence

As synthetic data becomes widely used in language model development, understanding its impact on model behavior is crucial. This paper investigates the impact of the diversity of sources of synthetic data on fine-tuned large language models. We focus on three key dimensions: distribution collapse, adversarial robustness, and self-preference bias. Our findings reveal that fine-tuning models on synthetic data from diverse sources can mitigate distribution collapse, preserving the breadth of the output distribution and the diversity of the output text. Furthermore, while both human and synthetic fine-tuning data can remove safeguards, the latter preserves higher output quality, thus making outputs potentially both more usable and more dangerous. Finally, fine-tuning reduces self-preference bias, with human data being the most effective, followed by multi-source synthetic data.


High-Power Training Data Identification with Provable Statistical Guarantees

Liu, Zhenlong, Zeng, Hao, Huang, Weiran, Wei, Hongxin

arXiv.org Artificial Intelligence

Conventional approaches treat training data identification as a simple binary classification task without statistical guarantees. A recent approach is designed to control the false discovery rate (FDR), but its guarantees rely on strong, easily violated assumptions. In this paper, we introduce Provable Training Data Identification (PTDI), a rigorous method that identifies a set of training data with strict FDR control. Specifically, our method computes p-values for each data point using a set of known unseen data, and then constructs a conservative estimator for the data-usage proportion of the test set, which allows us to scale these p-values. Our approach then selects the final set of training data by identifying all points whose scaled p-values fall below a data-dependent threshold. This procedure discovers training data with provable, strict FDR control and significantly boosted power. Extensive experiments across a wide range of models (LLMs and VLMs) and datasets demonstrate that PTDI strictly controls the FDR and achieves higher power. Such disputes raise the importance of identifying a specific, well-defined set of data allegedly used in training; to resolve these high-stakes claims, evidence must strictly control the risk of false positives, underscoring the need for methods with rigorous statistical guarantees.
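The selection step described above (p-values from known unseen data, scaled, then compared to a data-dependent threshold) can be sketched as follows. This is a minimal illustration, not PTDI's actual implementation: the helper names are hypothetical, and a Benjamini-Hochberg-style threshold stands in for the paper's data-dependent rule.

```python
import numpy as np

def conformal_p_values(test_scores, unseen_scores):
    """P-value for each test point from scores on known *unseen* (non-member)
    data: the fraction of unseen scores at least as extreme, where higher
    scores are taken to be more member-like. (Hypothetical helper.)"""
    unseen = np.asarray(unseen_scores)
    m = len(unseen)
    return np.array([(1 + np.sum(unseen >= s)) / (1 + m) for s in test_scores])

def select_with_fdr(p_values, alpha=0.05, pi_hat=1.0):
    """Scale p-values by pi_hat, a conservative estimate of the proportion of
    non-training (null) points in the test set (pi_hat=1 recovers plain BH),
    then select all points below a Benjamini-Hochberg-style data-dependent
    threshold. A sketch of the selection step, not PTDI itself."""
    p = np.asarray(p_values) * pi_hat
    n = len(p)
    order = np.argsort(p)
    # Largest k such that p_(k) <= alpha * k / n; select the k smallest.
    below = p[order] <= alpha * np.arange(1, n + 1) / n
    if not below.any():
        return np.array([], dtype=int)
    k = np.where(below)[0].max()
    return np.sort(order[: k + 1])
```

Scaling the p-values down by a conservative null-proportion estimate is what boosts power relative to plain BH while keeping the FDR guarantee conservative.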