Aggregate and conquer: detecting and steering LLM concepts by combining nonlinear predictors over multiple layers
