Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
–Neural Information Processing Systems
Our framework starts from the layer-wise compression problem described above, by which the global compression task, defined either for pruning or quantization, is first split into layer-wise sub-problems, based on the layer behavior on the calibration data.
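The layer-wise split can be illustrated with a minimal sketch. It assumes that "layer behavior on the calibration data" is measured as the squared difference between the original and compressed layer outputs, and it uses plain magnitude pruning as a stand-in compressor; this is not the solver proposed by the framework. The helper names (`matmul`, `layerwise_error`, `magnitude_prune`), the toy layer sizes, and the ReLU between layers are all hypothetical choices made for illustration.

```python
import random

def matmul(W, X):
    """Multiply W (rows x d) by X (d x n), both given as lists of lists."""
    return [[sum(w * x for w, x in zip(row, col)) for col in zip(*X)] for row in W]

def layerwise_error(W, W_hat, X):
    """Squared error between original and compressed layer outputs on inputs X."""
    Y, Y_hat = matmul(W, X), matmul(W_hat, X)
    return sum((a - b) ** 2 for r, r_hat in zip(Y, Y_hat) for a, b in zip(r, r_hat))

def magnitude_prune(W, sparsity):
    """Stand-in compressor (assumption): zero the smallest-magnitude weights."""
    flat = sorted(abs(w) for row in W for w in row)
    k = int(round(sparsity * len(flat)))
    if k == 0:
        return [row[:] for row in W]
    threshold = flat[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in W]

random.seed(0)

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Toy two-layer model and calibration inputs (one column per sample).
layers = [rand_matrix(4, 8), rand_matrix(3, 4)]
X_calib = rand_matrix(8, 16)

# Each layer is treated as its own sub-problem on its own calibration inputs.
errors = []
X = X_calib
for W in layers:
    W_hat = magnitude_prune(W, 0.5)
    errors.append(layerwise_error(W, W_hat, X))
    # Inputs to the next layer come from the uncompressed model (ReLU assumed).
    X = [[max(v, 0.0) for v in row] for row in matmul(W, X)]

for i, e in enumerate(errors):
    print(f"layer {i}: output error {e:.3f}")
```

Because each sub-problem depends only on one layer's weights and that layer's calibration inputs, the layers can be compressed independently, and any per-layer compressor can be substituted for `magnitude_prune`.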
Feb-7-2026, 18:56:10 GMT