EpiCache: Episodic KV Cache Management for Long-Term Conversation on Resource-Constrained Environments
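EpiCache proposes episodic KV cache management that efficiently supports very long conversations on resource-constrained devices, offering a new approach to memory optimization.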
arXiv:2509.17396v4 Announce Type: replace Abstract: Modern large language models (LLMs) extend context lengths to millions of tokens, enabling coheren…