Latent Cache Flow: Model-to-Model Communication Without Text
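A breakthrough method lets large models exchange internal-state caches directly, avoiding the latency and information loss of relaying through text.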
arXiv:2605.22863v1 Announce Type: new Abstract: LLM agents today communicate via text, which incurs considerable latency and information loss due to t…
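To address context overflow in long-horizon LLM agents, this work proposes a parallel compaction method that cuts tens of seconds of blocked inference.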
arXiv:2605.23296v1 Announce Type: new Abstract: Long-horizon LLM agents accumulate growing conversation histories that eventually exceed the model's c…
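A PeerJS-based browser-to-browser video conferencing app: single-command startup and low-latency, high-performance P2P real-time communication for video calls on the web.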
WebChat A PeerJS-based browser-to-browser video conferencing app with low-latency, high-performance real-time communication. This project pushed me de…
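The make-or-break latency line for real-time voice AI: how a migration from Go to Rust shaved milliseconds of delay to stay within the 250 ms boundary for real-time interaction.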
In building Vivik, an execution-grade telephony AI engine, we faced a brutal constraint: the human conversational loop. In psychoacoustics, a delay un…
Google Gemini 3.5 Flash powers autonomous AI agents with low latency and high quality, built for coding and complex task execution.
Google launched Gemini 3.5 Flash, its most powerful coding and agentic AI model yet, at the company's annual developer conference. It is capable of au…
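A fast, thread-safe hash map implemented in C that improves performance through deferred sorting, benchmarked against mainstream alternatives such as F14 and Rust's HashMap.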
You can play with it here: https://godbolt.org/z/h4ffsWdq8 The main repo is here: https://github.com/RaphaelPrevost/ASKL Comments URL: https://news.yc…
ITHome, May 20: MCHOSE officially announced today that its V9 Pro gaming headset is now available in pink, at a final price of 179.55 yuan. The MCHOSE V9 Pro gaming headset has a 53 mm dynamic driver and uses specially tuned esports-grade FPS sound…
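ITHome, May 20: iQOO tonight released its new true wireless earbuds, the iQOO TWS 5i. Positioned as a "great gaming companion", the product emphasizes long battery life, low latency, and esports-grade sound. It is priced at 119 yuan and is on sale now. ITHome notes that, to address the latency concerns of gamers, the iQOO TWS 5i uses the Bluetooth 5.4 connection protocol and reduces end-to-end gaming latency to…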
Low-resource hardware-aware NAS: efficient architecture search with only 10 latency probes, reducing reliance on precise latency models.
arXiv:2504.00663v2 Announce Type: replace Abstract: Existing hardware-aware NAS (HW-NAS) methods typically assume access to precise information circa …
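Every 100 ms of latency can eat up to 8% of ecommerce conversions; Vercel's technical audit strategy helps teams find performance bottlenecks.
Every 100ms of latency can cost ecommerce applications up to 8% in sales conversion. At scale, this can cost millions in revenue. Complexity compound…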
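A large language model company's record $1.5 billion settlement hits a snag: the judge postpones approval because authors object that the compensation is too low.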
Lawyers accused of rushing historic settlement to seize $320 million in fees.
Software that gets smoother over time is the ultimate goal; Sublime Text and VS Code deliver a lag-free experience.
Craig Mod has some pretty interesting thoughts on why software should be lightning fast. One example he makes is this: Sublime Text has — in my exper…
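Vercel AI Gateway is now GA: a single API gives access to hundreds of AI models, with transparent pricing and no markup, plus built-in observability and automatic failover. It is pitched as the best entry point for calling multiple models.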
AI Gateway is now generally available, providing a single unified API to access hundreds of AI models with transparent pricing and built-in observabil…
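With no model fine-tuning and no changes to the function-calling protocol, AsyncFC achieves asynchronous concurrency purely at the execution layer, overlapping LLM decoding with function execution to substantially reduce end-to-end latency. It naturally exploits the LLM's ability to reason over not-yet-executed results (symbolic futures), opening a new paradigm for asynchronous model-tool interaction.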
arXiv:2605.15077v1 Announce Type: cross Abstract: Function calling, also known as tool use, is a core capability of modern LLM agents but is typically…
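
The overlap described in the AsyncFC summary can be illustrated with a small sketch. This is not the paper's implementation. `slow_tool`, `decode_tokens`, and the `<F1>` placeholder are hypothetical stand-ins, and `asyncio.sleep` simulates both tool I/O and token decoding. The sketch only shows why starting a tool call in the background and continuing to decode against a placeholder can take less wall-clock time than waiting for the tool first.

```python
import asyncio
import time


async def slow_tool(query: str, delay: float = 0.2) -> str:
    """Hypothetical tool call; the sleep stands in for network or disk I/O."""
    await asyncio.sleep(delay)
    return f"result({query})"


async def decode_tokens(tokens: list[str], per_token: float = 0.02) -> list[str]:
    """Hypothetical stand-in for LLM decoding, emitting one token per step."""
    out = []
    for tok in tokens:
        await asyncio.sleep(per_token)
        out.append(tok)
    return out


# The simulated model output refers to the pending result by a placeholder.
SCRIPT = ["The", "forecast", "is", "<F1>", "so", "pack", "an", "umbrella", "today", "."]


async def sequential() -> float:
    """Baseline: block on the tool, then decode."""
    start = time.perf_counter()
    await slow_tool("weather")
    await decode_tokens(SCRIPT)
    return time.perf_counter() - start


async def overlapped() -> tuple[float, str]:
    """Start the tool in the background and decode while it runs."""
    start = time.perf_counter()
    pending = asyncio.create_task(slow_tool("weather"))  # the "future"
    tokens = await decode_tokens(SCRIPT)                 # decoding is not blocked
    result = await pending                               # resolve the future
    text = " ".join(tokens).replace("<F1>", result)      # substitute the placeholder
    return time.perf_counter() - start, text


def run_demo() -> tuple[float, float, str]:
    seq = asyncio.run(sequential())
    ovl, text = asyncio.run(overlapped())
    return seq, ovl, text


if __name__ == "__main__":
    seq, ovl, text = run_demo()
    print(f"sequential: {seq:.2f}s, overlapped: {ovl:.2f}s")
    print(text)
```

In this toy setup the sequential path costs roughly the tool delay plus the decoding time, while the overlapped path costs roughly the larger of the two. Real gains depend on how much decoding can proceed before the result is actually needed.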