Knowledge distillation is a popular approach for enhancing the performance of "student" models with lower representational capacity by taking advantage of
Taking hyperspectral image reconstruction as an example, spectral SCI [Gehm et al., 2007] can rapidly capture and compress 3D hyperspectral signals as
Beginning with the simplification of flattening all actions, we theoretically explore the discrepancies between action-level optimization and this naive token-level optimization.
Extensive experiments on benchmarks from different DG tasks demonstrate that LFME consistently improves over the baseline and achieves performance comparable to existing methods.