 logratio


Discovering Preference Optimization Algorithms with and for Large Language Models

Chris Lu

Neural Information Processing Systems

Typically, preference optimization is approached as an offline supervised learning task using manually crafted convex loss functions. While these methods are based on theoretical insights, they are inherently constrained by human creativity, so the large search space of possible loss functions remains under-explored.



Discovering Preference Optimization Algorithms with and for Large Language Models

arXiv.org Artificial Intelligence

Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs. Typically, preference optimization is approached as an offline supervised learning task using manually crafted convex loss functions. While these methods are based on theoretical insights, they are inherently constrained by human creativity, so the large search space of possible loss functions remains underexplored. We address this by performing LLM-driven objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention. Specifically, we iteratively prompt an LLM to propose and implement new preference optimization loss functions based on previously evaluated performance metrics. This process leads to the discovery of previously unknown and performant preference optimization algorithms. We call the best-performing of these Discovered Preference Optimization (DiscoPOP), a novel algorithm that adaptively blends logistic and exponential losses. Experiments demonstrate the state-of-the-art performance of DiscoPOP and its successful transfer to held-out tasks.
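
To illustrate what adaptively blending logistic and exponential losses can look like, the Python sketch below mixes the two per example with a sigmoid gate on the difference of log-ratios. The gate, the temperature `tau`, the scale `beta`, their default values, and all function names are assumptions made for this sketch; the abstract does not specify them, so this should not be read as the paper's exact objective.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def log_sigmoid(x: float) -> float:
    """Stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def blended_preference_loss(
    chosen_logratio: float,
    rejected_logratio: float,
    beta: float = 0.1,   # assumed scale on the log-ratio difference
    tau: float = 0.05,   # assumed temperature of the mixing gate
) -> float:
    """Blend a logistic and an exponential loss for one preference pair.

    Each *_logratio is log pi(y|x) - log pi_ref(y|x) for the chosen or
    rejected response (hypothetical inputs for this sketch).
    """
    rho = chosen_logratio - rejected_logratio
    gate = sigmoid(rho / tau)             # mixing weight in (0, 1)
    logistic = -log_sigmoid(beta * rho)   # logistic (DPO-style) term
    exponential = math.exp(-beta * rho)   # exponential term
    # Small or negative rho favors the logistic term, large rho the exponential.
    return (1.0 - gate) * logistic + gate * exponential
```

With these assumed defaults, a pair the policy ranks wrongly (negative `rho`) is scored almost entirely by the logistic term, while a pair it already ranks correctly is scored almost entirely by the exponential term.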


Assessing Keyness using Permutation Tests

arXiv.org Artificial Intelligence

We propose a resampling-based approach for assessing keyness in corpus linguistics based on suggestions by Gries (2006, 2022). Traditional approaches based on hypothesis tests (e.g., the likelihood ratio test) model the corpora as independent, identically distributed samples of tokens. This model does not account for the often-observed uneven distribution of occurrences of a word across a corpus. When occurrences of a word are concentrated in a few documents, large values of LLR and similar scores are in fact much more likely than the token-by-token sampling model accounts for, leading to false positives. We replace the token-by-token sampling model with a model in which corpora are samples of documents rather than tokens, which is much closer to the way corpora are actually assembled. We then use a permutation approach to approximate the distribution of a given keyness score under the null hypothesis of equal frequencies and obtain p-values for assessing significance. We do not need any assumption about how the tokens are organized within or across documents, and the approach works with basically *any* keyness score. Hence, apart from obtaining more accurate p-values for scores like LLR, we can also assess significance for, e.g., the logratio, which has been proposed as a measure of effect size. An efficient implementation of the proposed approach is provided in the `R` package `keyperm`, available from GitHub.
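
The document-level permutation idea can be sketched in a few lines of Python. This is not the `keyperm` implementation or its API: the function names, the `(term_count, doc_length)` input format, the additive smoothing of 0.5, and the two-sided comparison are all assumptions chosen to keep the example small.

```python
import math
import random


def log_ratio(count_a, size_a, count_b, size_b, smoothing=0.5):
    """log2 ratio of relative frequencies; smoothing (assumed) avoids log(0)."""
    rel_a = (count_a + smoothing) / (size_a + smoothing)
    rel_b = (count_b + smoothing) / (size_b + smoothing)
    return math.log2(rel_a / rel_b)


def corpus_score(docs_a, docs_b):
    """Keyness score from per-document (term_count, doc_length) pairs."""
    return log_ratio(
        sum(c for c, _ in docs_a), sum(n for _, n in docs_a),
        sum(c for c, _ in docs_b), sum(n for _, n in docs_b),
    )


def keyness_permutation_p(docs_a, docs_b, n_perm=2000, seed=0):
    """Approximate a two-sided p-value by shuffling whole documents."""
    observed = corpus_score(docs_a, docs_b)
    pooled = list(docs_a) + list(docs_b)
    n_a = len(docs_a)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign documents, not tokens, to the corpora
        permuted = corpus_score(pooled[:n_a], pooled[n_a:])
        if abs(permuted) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


# Same total frequency in corpus A, distributed differently across documents.
concentrated = [(50, 1000)] + [(0, 1000)] * 9   # all hits in one document
spread = [(5, 1000)] * 10                       # hits in every document
background = [(0, 1000)] * 10

score_c, p_concentrated = keyness_permutation_p(concentrated, background)
score_s, p_spread = keyness_permutation_p(spread, background)
```

Both toy corpora yield the same logratio, but the permutation p-value is large when all occurrences sit in one document and small when they are spread across documents, which is the behavior the abstract describes.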


Three approaches to supervised learning for compositional data with pairwise logratios

arXiv.org Machine Learning

The common approach to compositional data analysis is to transform the data by means of logratios. Logratios between pairs of compositional parts (pairwise logratios) are the easiest to interpret in many research problems. When the number of parts is large, some form of logratio selection is a must, for instance by means of an unsupervised learning method based on a stepwise selection of the pairwise logratios that explain the largest percentage of the logratio variance in the compositional dataset. In this article we present three alternative stepwise supervised learning methods to select the pairwise logratios that best explain a dependent variable in a generalized linear model, each geared for a specific problem. The first method features unrestricted search, where any pairwise logratio can be selected. This method has a complex interpretation if some pairs of parts in the logratios overlap, but it leads to the most accurate predictions. The second method restricts parts to occur only once, which makes the corresponding logratios intuitively interpretable. The third method uses additive logratios, so that $K-1$ selected logratios involve exactly $K$ parts. This method in fact searches for the subcomposition with the highest explanatory power. Once the subcomposition is identified, the researcher's favourite logratio representation may be used in subsequent analyses, not only pairwise logratios. Our methodology allows logratios or non-compositional covariates to be forced into the models based on theoretical knowledge, and various stopping criteria are available based on information measures or statistical significance with the Bonferroni correction. We present an illustration of the three approaches on a dataset from a study predicting Crohn's disease. The first method excels in terms of predictive power, and the other two in interpretability.
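
As a rough illustration of the unrestricted search, the Python sketch below performs only its first step: it forms every pairwise logratio and keeps the one that best explains the response. For brevity it scores candidates by the squared Pearson correlation with a continuous response, a stand-in for the generalized linear model fit and stopping criteria described in the abstract; the function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation, i.e. R^2 of a simple linear regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def best_pairwise_logratio(samples, response):
    """Return the pair (i, j) whose log(x_i / x_j) best explains the response.

    samples: list of compositions with strictly positive parts.
    """
    n_parts = len(samples[0])
    best_pair, best_score = None, -1.0
    for i, j in combinations(range(n_parts), 2):
        column = [math.log(s[i] / s[j]) for s in samples]
        score = r_squared(column, response)
        if score > best_score:
            best_pair, best_score = (i, j), score
    return best_pair, best_score


# Toy data (hypothetical): the response is driven by parts 0 and 2.
samples = [
    (0.2, 0.3, 0.5),
    (0.5, 0.1, 0.4),
    (0.1, 0.6, 0.3),
    (0.3, 0.3, 0.4),
]
response = [math.log(s[0] / s[2]) for s in samples]
pair, score = best_pairwise_logratio(samples, response)
```

A full stepwise procedure would repeat this selection conditional on the logratios already in the model. Under the second method's restriction, it would also skip any candidate that reuses a part.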