activation map
FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors
LiDAR-based 3D object detection has made impressive progress recently, yet most existing models are black-box, lacking interpretability. Previous explanation approaches primarily focus on analyzing image-based models and are not readily applicable to LiDAR-based 3D detectors. In this paper, we propose a feature factorization activation map (FFAM) to generate high-quality visual explanations for 3D detectors. FFAM employs non-negative matrix factorization to generate concept activation maps and subsequently aggregates these maps to obtain a global visual explanation. To achieve object-specific visual explanations, we refine the global visual explanation using the feature gradient of a target object. Additionally, we introduce a voxel upsampling strategy to align the scale between the activation map and input point cloud. We qualitatively and quantitatively analyze FFAM with multiple detectors on several datasets. Experimental results validate the high-quality visual explanations produced by FFAM.
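The pipeline in the FFAM abstract (factorize features with NMF, aggregate the concept maps, refine with a target object's gradient) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the aggregation or refinement rules, so the plain sum over concepts and the gradient-magnitude weighting are assumptions, as are the function names `nmf`, `global_activation`, and `object_specific`. The voxel upsampling step is omitted.

```python
import numpy as np

def nmf(A, k, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative A (n x d) into W (n x k) @ H (k x d)
    using Lee-Seung multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, d)) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def global_activation(features, k=4):
    """features: (n_voxels, n_channels), non-negative (e.g. post-ReLU).
    Each column of W is treated as one concept activation map."""
    W, _ = nmf(features, k)
    saliency = W.sum(axis=1)  # assumption: aggregate concepts by summing
    return saliency / (saliency.max() + 1e-9)

def object_specific(saliency, grad):
    """grad: (n_voxels, n_channels) gradient of the target object's score
    with respect to the features. Assumption: weight each voxel by its
    normalized absolute gradient summed over channels."""
    weight = np.abs(grad).sum(axis=1)
    weight = weight / (weight.max() + 1e-9)
    return saliency * weight
```

In practice `features` would come from an intermediate layer of the 3D detector and `grad` from backpropagating the score of one detected box; both are stand-ins here.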
Appendix for "CS-Isolate: Extracting Hard Confident Examples by Content and Style Isolation" Yexiong Lin, Yu Yao
We denote observed variables in gray and latent variables in white. First, we introduce the concept of an uncontrolled style factor. Why do confident examples encourage content-style isolation? Calculate the loss using Eq. 1 and update networks; Output: The inference networks and classifier heads q It is essential to understand that although data augmentation cannot control all style factors, it still offers the benefit of "partial isolation". This approach therefore ensures that style changes do not affect the derived content representation. Calculate the loss using Eq. 2 and update networks; Output: The inference networks and classifier heads q Finally, confident and unlabeled examples are used to train the models based on the MixMatch algorithm.
Predicts Human Visual Selectivity
The 1 For our experiments we are counting the number of AMT Human Intelligence Tasks (HITs) that were completed. We did not exclude AMT workers from completing multiple HITs. The authors posit that this noisiness is because the gradient may fluctuate sharply at small scales, which seems plausible especially given that, due to ReLU activation functions, the output generally is not even continuously differentiable. This CAM indicates the discriminative regions of the image used by the CNN to identify that class. We used each of the above passive attention methods to acquire attention maps from each of the models in the top part of Table 2.
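As a rough illustration of how a class activation map (CAM) highlights discriminative regions, the standard formulation weights the final convolutional feature maps by the target class's classifier weights and sums them. The sketch below assumes that standard formulation; the function name `class_activation_map`, the array shapes, and the min-max normalization are illustrative choices, not details taken from the excerpt.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """feature_maps: (K, H, W) activations of the last conv layer.
    class_weights: (K,) weights of the final linear layer for one class.
    Returns an (H, W) map scaled to [0, 1]."""
    # Weighted sum over the K channels
    cam = np.tensordot(class_weights, feature_maps, axes=1)
    cam = cam - cam.min()
    return cam / (cam.max() + 1e-9)
```

The resulting map is usually upsampled to the input image size before being overlaid on the image.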