ScaleSweep: Accurate NVFP4 Post-Training Quantization of LLMs via Block Scale Initialization
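Proposes an NVFP4 post-training quantization method based on block scale initialization that effectively improves the low-bit accuracy of large language models.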
arXiv:2606.07618v1 Announce Type: new Abstract: NVFP4 is a recently introduced hardware-supported FP4 format that improves the fidelity of 4-bit quant…