AtP*: An efficient and scalable method for localizing LLM behaviour to components
