Fourier Head: Helping Large Language Models Learn Complex Probability Distributions
