Predictive Prefetching for Retrieval-Augmented Generation
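Accepted at ICML 2026. Proposes a predictive prefetching strategy that accelerates retrieval-augmented generation and reduces inference latency.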
arXiv:2605.17989v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) improves factual grounding in large language models but suffers f…