Explanations that reveal all through the definition of encoding

Feb-17-2026, 15:20:51 GMT–Neural Information Processing Systems

Feature attributions attempt to highlight which inputs drive predictive power. Good attributions or explanations are thus those that produce inputs that retain this predictive power; accordingly, evaluations of explanations score their quality of prediction. However, for a class of explanations called encoding explanations, evaluations produce scores better than what appears possible from the values in the explanation. Probing for encoding remains a challenge because there is no general characterization of what gives the extra predictive power. We develop a definition of encoding that identifies this extra predictive power via conditional dependence and show that the definition fits existing examples of encoding. This definition implies that non-encoding explanations, in contrast to encoding explanations, contain all the informative inputs used to produce the explanation, giving them a "what you see is what you get" property, which makes them transparent and simple to use.
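To make the idea of "extra predictive power" concrete, here is a minimal toy sketch in Python. It is not the paper's formal definition or method; the three-bit data, the label rule, and the names `encoding_explanation`, `plain_explanation`, `best_accuracy`, and `evaluate` are all assumptions made up for illustration. The hypothetical encoding explanation shows a value that is pure noise, but which position it shows depends on the label, so a predictor that sees the selection scores far better than the shown value alone would allow.

```python
from collections import Counter, defaultdict
from itertools import product


def encoding_explanation(x):
    # Hypothetical encoding explanation: the *position* shown depends on the
    # label-carrying bit x[2], while the shown value (x[0] or x[1]) is noise.
    idx = 0 if x[2] == 1 else 1
    return idx, x[idx]


def plain_explanation(x):
    # Hypothetical non-encoding explanation: always shows the informative bit.
    return 2, x[2]


def best_accuracy(pairs):
    # Accuracy of the best possible predictor of the label from the key,
    # computed exactly by majority vote within each key group.
    groups = defaultdict(Counter)
    for key, label in pairs:
        groups[key][label] += 1
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(pairs)


def evaluate(explain):
    # Assumed toy data: three independent fair bits, label y = x[2].
    inputs = list(product([0, 1], repeat=3))
    labels = [x[2] for x in inputs]
    explanations = [explain(x) for x in inputs]
    # Predict from the shown value only.
    values_only = best_accuracy(
        [(value, y) for (_, value), y in zip(explanations, labels)]
    )
    # Predict from the shown value together with which position was selected.
    with_selection = best_accuracy(
        [((idx, value), y) for (idx, value), y in zip(explanations, labels)]
    )
    return values_only, with_selection


print(evaluate(encoding_explanation))  # (0.5, 1.0)
print(evaluate(plain_explanation))     # (1.0, 1.0)
```

In this toy setting, the gap between the two numbers for the encoding explanation is predictive power that comes from the selection rather than from the displayed values. The plain explanation shows no such gap, which loosely mirrors the "what you see is what you get" property described above.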

explanation, large language model, machine learning

Country:
- Asia > Vietnam (0.04)

Genre:
- Research Report > Experimental Study (0.92)

Industry:
- Health & Medicine
  - Therapeutic Area > Cardiology/Vascular Diseases (1.00)
  - Diagnostic Medicine (1.00)

Technology:
- Information Technology
  - Data Science > Data Mining (0.67)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language > Large Language Model (0.68)
    - Vision (0.67)
    - Machine Learning > Neural Networks
      - Deep Learning (0.93)
