Beyond Output Matching: Preserving Internal Geometry in NVFP4 LLM Distillation
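Proposes distilling LLMs under NVFP4 quantization by preserving the model's internal geometric structure, going beyond conventional methods that match only the outputs, with the potential to improve the quality of the compressed model.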
arXiv:2606.05682v1 Announce Type: new Abstract: Demand for low-precision inference, including NVFP4-based approaches, has grown as large language mode…