$\texttt{SEM-CTRL}$: Semantically Controlled Decoding

Albinhassan, Mohammad, Madhyastha, Pranava, Russo, Alessandra

Mar-6-2025–arXiv.org Artificial Intelligence

Ensuring both syntactic and semantic correctness in Large Language Model (LLM) outputs remains a significant challenge, despite being critical for real-world deployment. In this paper, we introduce $\texttt{SEM-CTRL}$, a unified approach that enforces rich context-sensitive constraints, together with task- and instance-specific semantics, directly on an LLM decoder. Our approach integrates token-level Monte Carlo Tree Search (MCTS) guided by these syntactic and semantic constraints. The constraints over the desired outputs are expressed using Answer Set Grammars, a logic-based formalism that generalizes context-sensitive grammars while incorporating background knowledge to represent task-specific semantics. We show that our approach guarantees correct completions for any off-the-shelf LLM without the need for fine-tuning. We evaluate $\texttt{SEM-CTRL}$ on a range of tasks, including synthetic grammar synthesis, combinatorial reasoning, and planning. Our results demonstrate that $\texttt{SEM-CTRL}$ allows small pre-trained LLMs to efficiently outperform larger variants and state-of-the-art reasoning models (e.g., o1-preview) while simultaneously guaranteeing solution correctness.
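To make the idea of enforcing a context-sensitive constraint at the decoder concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy context-sensitive language $a^n b^n c^n$, a hand-written prefix checker in place of an Answer Set Grammar, a hypothetical `toy_scores` function standing in for an LLM's next-token scores, and plain greedy selection in place of token-level MCTS. All names in the block are illustrative.

```python
from typing import Callable, Dict, List, Optional, Tuple

VOCAB = ["a", "b", "c", "<eos>"]


def block_counts(tokens: List[str]) -> Optional[Tuple[int, int, int]]:
    """Return (na, nb, nc) if tokens have the shape a*b*c*, else None."""
    i, out = 0, []
    for sym in "abc":
        k = 0
        while i < len(tokens) and tokens[i] == sym:
            i += 1
            k += 1
        out.append(k)
    return (out[0], out[1], out[2]) if i == len(tokens) else None


def is_valid_prefix(tokens: List[str], max_n: int) -> bool:
    """True if tokens can still be extended to a^n b^n c^n with 1 <= n <= max_n."""
    counts = block_counts(tokens)
    if counts is None:
        return False
    na, nb, nc = counts
    if na > max_n or nb > na or nc > nb:
        return False
    # 'c' may only start once the 'b' block matches the 'a' block.
    return not (nc > 0 and nb != na)


def is_complete(tokens: List[str]) -> bool:
    """True if tokens spell a^n b^n c^n for some n >= 1."""
    counts = block_counts(tokens)
    return counts is not None and counts[0] >= 1 and counts[0] == counts[1] == counts[2]


def allowed_tokens(prefix: List[str], max_n: int) -> List[str]:
    """Tokens that keep the output inside the constrained language."""
    ok = [t for t in "abc" if is_valid_prefix(prefix + [t], max_n)]
    if is_complete(prefix):
        ok.append("<eos>")
    return ok


def toy_scores(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for LLM scores; unconstrained, it would emit 'aaaccc...'."""
    if len(prefix) < 3:
        return {"a": 3.0, "b": 1.0, "c": 2.0, "<eos>": 0.0}
    return {"a": 0.0, "b": 1.0, "c": 3.0, "<eos>": 2.0}


def constrained_greedy_decode(
    score_fn: Callable[[List[str]], Dict[str, float]],
    max_n: int = 3,
    max_steps: int = 32,
) -> List[str]:
    """Greedy decoding restricted to constraint-satisfying tokens at every step."""
    prefix: List[str] = []
    for _ in range(max_steps):
        candidates = allowed_tokens(prefix, max_n)
        if not candidates:
            raise RuntimeError("dead end: no token satisfies the constraint")
        scores = score_fn(prefix)
        best = max(candidates, key=lambda t: scores[t])
        if best == "<eos>":
            return prefix
        prefix.append(best)
    raise RuntimeError("step budget exhausted before a complete output")


print("".join(constrained_greedy_decode(toy_scores)))
```

In this toy language every valid prefix can be completed, so greedy selection never reaches a dead end. Richer constraints generally do not have that property, which is one reason a search procedure such as the MCTS described in the abstract can be preferable to greedy masking.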

large language model, machine learning, natural language
