Hallucinate or Memorize? The Two Sides of Probabilistic Learning in Large Language Models