ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs

Jun-18-2026, 01:52:50 GMT–Neural Information Processing Systems

Large Language Models (LLMs) have achieved remarkable performance by capturing complex interactions between input features. To identify these interactions, most existing approaches require enumerating all possible combinations of features up to a given order, so they scale poorly with the number of inputs n. Recently, Kang et al. (2025) proposed SPEX, an information-theoretic approach that exploits interaction sparsity to scale to n ≈ 10^3 features. SPEX greatly improves upon prior methods but requires tens of thousands of model inferences, which can be prohibitive for large models. In this paper, we observe that LLM feature interactions are often hierarchical, meaning that higher-order interactions are accompanied by their lower-order subsets, which enables more efficient discovery.
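The two ideas in the abstract, the combinatorial cost of enumeration and the hierarchy property, can be illustrated with a minimal Python sketch. It is not the ProxySPEX or SPEX algorithm. The function names (`num_candidates`, `missing_subsets`, `is_hierarchical`) are hypothetical, and the sketch assumes the strictest reading of "hierarchical": every non-empty proper subset of a detected interaction is itself detected.

```python
from itertools import combinations
from math import comb


def num_candidates(n, max_order):
    """Count feature subsets of size 1..max_order among n features.

    This is the number of candidates a brute-force enumeration would visit.
    """
    return sum(comb(n, k) for k in range(1, max_order + 1))


def missing_subsets(interactions):
    """Return the non-empty proper subsets absent from `interactions`.

    Assumption: 'hierarchical' is read here as full downward closure,
    i.e. every lower-order subset of an interaction must also be present.
    """
    present = {frozenset(s) for s in interactions}
    missing = set()
    for s in present:
        for k in range(1, len(s)):
            for sub in combinations(sorted(s), k):
                if frozenset(sub) not in present:
                    missing.add(frozenset(sub))
    return missing


def is_hierarchical(interactions):
    """True when no lower-order subset is missing."""
    return not missing_subsets(interactions)
```

For example, `num_candidates(1000, 3)` shows how quickly enumeration grows at n = 1000, while `is_hierarchical([{0}, {1}, {0, 1}])` holds and `is_hierarchical([{0}, {0, 1}])` does not, because the singleton `{1}` is missing.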

justification, large language model, natural language, (13 more...)

Country:
- North America > United States (0.46)

Industry:
- Health & Medicine (0.68)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
