Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control

Julian Skifstad, Xinyue Annie Yang, Glen Chou

Apr-22-2026–arXiv.org Machine Learning

Inference-time LLM alignment methods, particularly activation steering, offer an alternative to fine-tuning by directly modifying activations during generation. Existing methods, however, often rely on non-anticipative interventions that ignore how perturbations propagate through transformer layers and lack online error feedback, resulting in suboptimal, open-loop control. To address this, we show empirically that, despite the nonlinear structure of transformer blocks, layer-wise dynamics across multiple LLM architectures and scales are well approximated by locally linear models. Exploiting this property, we model LLM inference as a linear time-varying dynamical system and adapt the classical linear quadratic regulator to compute feedback controllers using layer-wise Jacobians, steering activations toward desired semantic setpoints in closed loop with minimal computational overhead and no offline training. We also derive theoretical bounds on setpoint tracking error, enabling formal guarantees on steering performance. Using a novel adaptive semantic feature setpoint signal, our method yields robust, fine-grained behavior control across models, scales, and tasks, including state-of-the-art modulation of toxicity, truthfulness, refusal, and arbitrary concepts, surpassing baseline steering methods. Our code is available at: https://github.com/trustworthyrobotics/lqr-activation-steering
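To make the control-theoretic idea in the abstract concrete, the following is a minimal sketch of a finite-horizon linear quadratic regulator for a linear time-varying system, written with NumPy. It is not the authors' implementation. Several things here are assumptions made only for illustration: small random matrices stand in for layer-wise Jacobians, the state is taken to be the activation error relative to the setpoint (so the target is the origin), the input matrix is the identity (a purely additive steering vector), and the cost weights `Q`, `R`, `Qf` as well as the helper names `lqr_gains` and `rollout` are hypothetical.

```python
import numpy as np


def lqr_gains(A_list, B_list, Q, R, Qf):
    """Backward Riccati recursion for x_{k+1} = A_k x_k + B_k u_k.

    Returns the feedback gains K_k (with u_k = -K_k x_k) and the
    cost-to-go matrix P_0 at the first layer.
    """
    P = Qf
    gains = [None] * len(A_list)
    for k in reversed(range(len(A_list))):
        A, B = A_list[k], B_list[k]
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)  # K = S^{-1} B^T P A
        gains[k] = K
        P = Q + A.T @ P @ (A - B @ K)        # Riccati update
    return gains, P


def rollout(A_list, B_list, x0, Q, R, Qf, gains=None):
    """Simulate the system; gains=None means no steering (open loop)."""
    x, cost = x0, 0.0
    for k in range(len(A_list)):
        if gains is None:
            u = np.zeros(B_list[k].shape[1])
        else:
            u = -gains[k] @ x                # feedback on the current error
        cost += x @ Q @ x + u @ R @ u
        x = A_list[k] @ x + B_list[k] @ u
    cost += x @ Qf @ x
    return x, cost


# Toy setup (all values are illustrative assumptions, not from the paper).
rng = np.random.default_rng(0)
dim, n_layers = 4, 6
# Stand-ins for layer-wise Jacobians: near-identity random maps.
A_list = [np.eye(dim) + 0.3 * rng.standard_normal((dim, dim))
          for _ in range(n_layers)]
# Additive steering: the control enters the activation directly.
B_list = [np.eye(dim) for _ in range(n_layers)]
Q, R, Qf = np.eye(dim), 0.1 * np.eye(dim), 10.0 * np.eye(dim)
x0 = rng.standard_normal(dim)                # initial error from the setpoint

gains, P0 = lqr_gains(A_list, B_list, Q, R, Qf)
x_open, cost_open = rollout(A_list, B_list, x0, Q, R, Qf)
x_closed, cost_closed = rollout(A_list, B_list, x0, Q, R, Qf, gains)
```

Under these assumptions, the closed-loop cost equals `x0 @ P0 @ x0` up to floating-point error and cannot exceed the unsteered cost, which is the standard optimality property of the regulator. In a real model the `A_k` would have to be obtained by differentiating each transformer block at the current activation, which this sketch does not attempt.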

large language model, machine learning, natural language

Genre:
- Research Report (1.00)

Industry:
- Energy (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.49)
