Eigen-Distortions of Hierarchical Representations
We develop a method for comparing hierarchical image representations in terms of their ability to explain perceptual sensitivity in humans. Specifically, we utilize Fisher information to establish a model-derived prediction of sensitivity to local perturbations of an image. For a given image, we compute the eigenvectors of the Fisher information matrix with largest and smallest eigenvalues, corresponding to the model-predicted most- and least-noticeable image distortions, respectively. For human subjects, we then measure the amount of each distortion that can be reliably detected when added to the image. We use this method to test the ability of a variety of representations to mimic human perceptual sensitivity. We find that the early layers of VGG16, a deep neural network optimized for object recognition, provide a better match to human perception than later layers, and a better match than a 4-stage convolutional neural network (CNN) trained on a database of human ratings of distorted image quality. On the other hand, we find that simple models of early visual processing, incorporating one or more stages of local gain control, trained on the same database of distortion ratings, provide substantially better predictions of human sensitivity than either the CNN or any combination of layers of VGG16.
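The eigenvector computation described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the model response is taken to be deterministic with additive, unit-variance Gaussian noise, so that the Fisher information matrix reduces to J^T J, where J is the Jacobian of the response with respect to the image. The names `toy_model`, `toy_jacobian`, and `eigen_distortions` are hypothetical, and the one-stage tanh model is a stand-in, not any of the representations tested in the paper.

```python
import numpy as np

def toy_model(x, W):
    """Hypothetical one-stage model: linear filters followed by tanh."""
    return np.tanh(W @ x)

def toy_jacobian(x, W):
    """Analytic Jacobian of toy_model at x, shape (n_responses, n_pixels)."""
    pre = W @ x
    return (1.0 - np.tanh(pre) ** 2)[:, None] * W

def eigen_distortions(jacobian):
    """Return (most_noticeable, least_noticeable, eigenvalues).

    Assumes additive unit-variance Gaussian response noise, under which
    the Fisher information matrix is F = J^T J.
    """
    fisher = jacobian.T @ jacobian
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors
    eigvals, eigvecs = np.linalg.eigh(fisher)
    return eigvecs[:, -1], eigvecs[:, 0], eigvals

rng = np.random.default_rng(0)
n_pixels, n_responses = 16, 32          # tiny sizes, for illustration only
W = rng.normal(size=(n_responses, n_pixels)) / np.sqrt(n_pixels)
x = rng.normal(size=n_pixels)           # stand-in for a vectorized image

J = toy_jacobian(x, W)
e_max, e_min, eigvals = eigen_distortions(J)

# The model's response changes most along e_max and least along e_min.
print(np.linalg.norm(J @ e_max), np.linalg.norm(J @ e_min))
```

Forming the full matrix is only practical for small inputs like this one. For real images the matrix has one row and column per pixel, so an iterative scheme such as power iteration on Jacobian-vector products would be a more plausible choice.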
Reviews: Eigen-Distortions of Hierarchical Representations
The submission presents a method to generate image distortions that are maximally/minimally discriminable in a certain image representation. The maximally/minimally discriminable distortion directions are defined as the eigenvectors of the Fisher Information Matrix with largest/smallest eigenvalue. Distortions are generated for image representations in VGG-16 as well as for representations in models that were trained to predict human sensitivity to image distortions. Human discrimination thresholds for those distortions are measured. It is found that the difference in human discrimination threshold between the max and min distortions of a model is largest for a biologically inspired 'early vision' model that was trained to predict human sensitivity, compared to a CNN trained to predict human sensitivity or the VGG-16 representations. For the VGG representations, it is found that the difference in human detection threshold between min and max distortions is larger for earlier layers than for later layers.
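The comparison described in this summary can be reduced to a single number per model. The sketch below makes one assumption: the review speaks only of a "difference" in thresholds, and expressing it as a log ratio is a choice made here for illustration. The function name `log_threshold_ratio` and the threshold values are hypothetical, not measured data.

```python
import math

def log_threshold_ratio(threshold_least, threshold_most):
    """Log ratio of human detection thresholds for a model's two eigen-distortions.

    threshold_least: amount of the model-predicted least-noticeable distortion
                     that observers need in order to detect it.
    threshold_most:  the same for the model-predicted most-noticeable distortion.
    A larger value means human thresholds differ in the direction the model predicts.
    """
    if threshold_least <= 0 or threshold_most <= 0:
        raise ValueError("thresholds must be positive")
    return math.log(threshold_least / threshold_most)

# Hypothetical thresholds, for illustration only (not measured data).
example = log_threshold_ratio(threshold_least=8.0, threshold_most=2.0)
print(example)
```

Under this convention, a value near zero means observers are about equally sensitive to both distortions, so the model's predicted ordering carries little perceptual meaning.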
Eigen-Distortions of Hierarchical Representations
Berardino, Alexander, Laparra, Valero, Ballé, Johannes, Simoncelli, Eero