Stealing the Decoding Algorithms of Language Models

Ali Naseh, Kalpesh Krishna, Mohit Iyyer, Amir Houmansadr

arXiv.org Artificial Intelligence 

A key component of generating text from modern language models (LM) is the selection and tuning of decoding algorithms. These algorithms determine how to generate text from the internal probability distribution generated by the LM. The process of choosing a decoding algorithm and tuning its hyperparameters takes significant time, manual effort, and computation, and it also requires extensive human evaluation. Therefore, the identity and hyperparameters of such decoding algorithms are considered to be extremely valuable.

GPT-2 [40], GPT-3 [4], and GPT-Neo [3] have been shown to generate high-quality texts for these tasks. To generate a sequence of tokens, LMs produce a probability distribution over the vocabulary at each time step, from which the predicted token is drawn. Enumerating all possible output sequences for a given input and choosing the one with the highest probability is intractable; furthermore, relatively low-probability sequences may even be desirable for certain tasks (e.g., creative writing). Therefore, LMs rely on decoding algorithms to decide which output tokens to produce based on their probabilities, i.e., to decode the text.
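As a rough illustration of the decoding step described above (not the paper's attack or any specific model's implementation), the sketch below shows how a sampling-based decoding algorithm with temperature and top-k hyperparameters turns a per-step probability distribution into generated tokens. The `next_token_logits` function is a hypothetical stand-in for a real LM forward pass, and the hyperparameter values are illustrative.

```python
# Minimal sketch of sampling-based decoding; `next_token_logits` is a stand-in
# for a real language model, not an actual LM.
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE = 50_000  # illustrative vocabulary size


def next_token_logits(prefix: list[int]) -> np.ndarray:
    """Stand-in for an LM: returns unnormalized scores over the vocabulary."""
    return rng.normal(size=VOCAB_SIZE)


def sample_next_token(prefix: list[int], temperature: float = 0.8, top_k: int = 50) -> int:
    """One decoding step: temperature + top-k sampling over the LM's distribution."""
    logits = next_token_logits(prefix) / temperature      # temperature rescales confidence
    top = np.argpartition(logits, -top_k)[-top_k:]        # keep the k highest-scoring tokens
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                                  # renormalize over the top-k set
    return int(rng.choice(top, p=probs))


def decode(prompt: list[int], max_new_tokens: int = 20) -> list[int]:
    """Autoregressive loop: append one sampled token per time step."""
    seq = list(prompt)
    for _ in range(max_new_tokens):
        seq.append(sample_next_token(seq))
    return seq


print(decode([1, 2, 3]))
```

Different choices of decoding algorithm (greedy, beam search, top-k, nucleus sampling) and of hyperparameters such as `temperature` and `top_k` would change which tokens this loop emits, which is why those choices carry value.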