Elo Uncovered: Robustness and Best Practices in Language Model Evaluation

Neural Information Processing Systems

However, while popular, the system's suitability for assessing entities with constant skill levels, such as LLMs, remains relatively unexplored. We study two fundamental axioms that evaluation methods should adhere to: reliability and transitivity.


Latent SDEs on Homogeneous Spaces

Neural Information Processing Systems

We consider the problem of variational Bayesian inference in a latent variable model where a (possibly complex) observed stochastic process is governed by the solution of a latent stochastic differential equation (SDE).