Reviews: Fooling Neural Network Interpretations via Adversarial Model Manipulation
Neural Information Processing Systems
Originality: As far as I am aware, the idea of adversarial *model* manipulation is a new one, and their citation of related work, e.g.

Quality: Although I am confident that the submission is technically sound, I think the experiments are insufficient and omit important categories of models and explanation methods. I elaborate on this below.

Clarity: The paper is fairly clearly written, and I am confident that expert readers could reproduce its results.

Significance: I think the strongest selling point of the work is its core idea: adversarial model manipulation might have significant practical implications.
Jan-25-2025, 02:20:20 GMT