Pre-trained Large Language Models Learn to Predict Hidden Markov Models In-context

Jun-11-2026, 08:22:00 GMT–Neural Information Processing Systems

Hidden Markov Models (HMMs) are fundamental tools for modeling sequential data with latent states that follow Markovian dynamics. However, they present significant challenges in model fitting and computational efficiency on real-world datasets. In this work, we demonstrate that pre-trained large language models (LLMs) can effectively model data generated by HMMs through in-context learning (ICL), their ability to learn patterns from examples within the input context. We evaluate LLMs' performance on diverse synthetic HMMs, showing that their prediction accuracy converges to the theoretical optimum. We discover novel scaling trends influenced by HMM properties and provide theoretical conjectures for these empirical observations.
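One natural reading of the "theoretical optimum" mentioned above is the accuracy of a predictor that knows the true HMM parameters and computes the next-observation distribution with the forward algorithm. The sketch below illustrates that baseline on a small synthetic HMM. The parameters `A`, `B`, `pi` and the function names are illustrative assumptions, not taken from the paper, and the abstract does not specify how the authors compute their optimum.

```python
import random


def sample_hmm(A, B, pi, length, rng):
    """Sample `length` observations from an HMM.

    A[i][j]: P(next state = j | state = i)
    B[i][k]: P(observation = k | state = i)
    pi[i]:   P(initial state = i)
    """
    states = list(range(len(pi)))
    symbols = list(range(len(B[0])))
    state = rng.choices(states, weights=pi)[0]
    obs = []
    for _ in range(length):
        obs.append(rng.choices(symbols, weights=B[state])[0])
        state = rng.choices(states, weights=A[state])[0]
    return obs


def next_obs_distribution(A, B, pi, obs):
    """Forward algorithm: P(o_{t+1} | o_1..o_t) with known parameters.

    Assumes every observation in `obs` has nonzero probability
    under the model (otherwise the normaliser would be zero).
    """
    n = len(pi)
    belief = list(pi)  # predictive distribution over the current state
    for o in obs:
        # Condition the state belief on the observed symbol.
        post = [belief[i] * B[i][o] for i in range(n)]
        z = sum(post)
        post = [p / z for p in post]
        # Push the belief one step forward through the transitions.
        belief = [sum(post[i] * A[i][j] for i in range(n)) for j in range(n)]
    m = len(B[0])
    return [sum(belief[i] * B[i][k] for i in range(n)) for k in range(m)]


# Hypothetical two-state, two-symbol HMM used only for illustration.
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.8, 0.2], [0.3, 0.7]]
pi = [0.5, 0.5]

rng = random.Random(0)
sequence = sample_hmm(A, B, pi, 200, rng)

# Accuracy of the argmax prediction at each position given the prefix.
correct = 0
for t in range(len(sequence)):
    dist = next_obs_distribution(A, B, pi, sequence[:t])
    predicted = max(range(len(dist)), key=lambda k: dist[k])
    correct += int(predicted == sequence[t])
print("parameter-aware accuracy:", correct / len(sequence))
```

An in-context predictor would see only the observation prefix, not `A`, `B`, or `pi`, so comparing its per-position accuracy against this parameter-aware baseline is one way to make "converges to the theoretical optimum" concrete.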

large language model, machine learning, natural language
