Niu Ge's Picks · This Month

1

🔓 Open Source · Hacker News AI · 2026-05-25

Find where your AI coding tokens went: local TUI for Codex/Claude logs

A terminal UI that visualizes where your AI coding tokens go, making Codex and Claude logs readable at a glance.

Article URL: https://github.com/peterxcli/ccost Comments URL: https://news.ycombinator.com/item?id=48259342 Points: 1 # Comments: 0

AI coding, token tracking, codex, claude, terminal UI

2

🌱 Growth & Productivity · Hacker News Ask · 2026-05-25

Ask HN: I only use 30% of my Claude max x5 all model quota

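While others complain that their AI quota runs out, this developer uses only 30% of theirs. Worth a look for tips on using AI efficiently, or for a different way of thinking about it.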

I only use it for my ruby on rails app, I wonder why u all keep complaining about opus token usage, is it just means that I use AI/LLM wrong, any tips…

claude max, quota usage, ruby on ra, AI usage tips, token quota optimization

3

🔓 Open Source · Hacker News LLM · 2026-05-24

Show HN: Memory for LLM apps that cuts input tokens up to 80% (avg 68%)

An open-source GitHub project that gives LLM apps long-term memory while cutting input tokens by 68% on average, substantially lowering API costs.

Article URL: https://github.com/Tem-Degu/streetai-memory Comments URL: https://news.ycombinator.com/item?id=48249509 Points: 1 # Comments: 0

llm, token optimization, memory management, open source, cost savings

4

📄 Docs & Manuals · ITHome · 2026-05-23

National Data Administration holds token economy symposium; Alibaba Cloud, Tencent, Moonshot AI and others attend

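For the first time, officials have settled on "词元" (ciyuan) as the Chinese rendering of "token". The National Data Administration convened Alibaba Cloud, Tencent and other major players to explore new ways of realizing the value of data in the intelligent era.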

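ITHome, May 23: According to the National Data Administration, on May 22 Liu Liehong, Party Leadership Group Secretary and Director of the National Data Administration, chaired a symposium on the token economy. At the meeting, expert representatives from China Economic Times, China University of Political Science and Law, Renmin University of China, Tsinghua University and other institutions, along with corporate representatives from Alibaba Cloud, Tencent, Moonshot AI, Speechocean (海天瑞声), China International Capital Corporation and other companies, addressed "promoting the healthy and sustainable development of the token economy…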

National Data Administration, token economy symposium, Alibaba Cloud, Tencent

5

💰 Business & Tech · QbitAI · 2026-05-23

"AI can't replace five kinds of people; for companies, coming second is the safest bet" | Kunlun Tech's Fang Han @AIGC2026

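Token usage differs by factors in the hundreds of millions, and the founder of "小龙虾" ("Crayfish") burns 600 billion a month. The talk lays out survival rules for companies in the AI era and five groups of people who cannot be replaced.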

"In the AI era, experience is no longer a moat"

five kinds of people, irreplaceable, companies coming second, safest bet, Kunlun Tech, Fang Han

6

📝 Deep Tech · arXiv AI · 2026-05-23

Meta-Soft: Leveraging Composable Meta-Tokens for Context-Preserving KV Cache Compression

Compresses the KV cache with composable meta-tokens, preserving context information efficiently and speeding up LLM inference.

arXiv:2605.22337v1 Announce Type: new Abstract: The KV cache used in large language models has linearly growing time complexity, so LLMs face memory b…

KV cache compression, meta-token, context preservation, LLM inference optimization, composable meta-tokens

7

📝 Deep Tech · Dev.to · 2026-05-23

Web3 Apps You Can Build With Token Extensions

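Explores how to build subscription-based Web3 apps with Token-2022 extensions: a separate token per creator per period, and atomic transactions that bundle payment with non-transferable tokens.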

In our previous articles, we covered how Web3 tokens get their value and broke down the new standards in What is Token 2022 and why Solana built it. …

web3, solana, token exte, subscription mechanism, token-2022

8

💰 Business & Tech · Hacker News AI · 2026-05-23

I used $30,983 of AI tokens last month in Claude Code on $200/mo plan

Roughly $30,000 worth of tokens in a month on a $200 subscription: Claude Code token-usage leaderboards have become developers' new pastime.

Article URL: https://www.indiehackers.com/post/i-used-30-983-of-ai-tokens-last-month-in-claude-code-on-200-mo-plan-3337a369a6 Comments URL: https://ne…

claude cod, ai tokens, token cost, developer community, gamification

9

🤖 AI & LLMs · ITHome · 2026-05-22

Zhipu releases GLM-5.1 high-speed AI model, clocking a world-fastest 400 tokens/s

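Zhipu and TileRT jointly launch the GLM-5.1 high-speed edition, with inference speeds of up to 400 tokens/s and production-grade deployment already running on Huawei Ascend compute.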

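ITHome, May 22: Zhipu today announced that it is offering the GLM-5.1 high-speed API, "GLM-5.1-highspeed", to select enterprise customers. The model's output speed reaches 400 tokens/s, setting a new speed ceiling among APIs from large-model vendors worldwide. More importantly, in the past "fast" often meant "small": high-speed models were almost always lightweight…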

Zhipu, high-speed edition, model release, world's fastest speed

10

📝 Deep Tech · Dev.to · 2026-05-22

AI agents don't have a memory problem. They have an architecture problem.

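The real weakness of AI agents is not memory but an architectural flaw: nearly a quarter of tokens are wasted structurally, and the root cause is the lack of persistent context.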

Every session, the LLM starts fresh. The user re-explains their role, their constraints, their preferences, what they were doing last time. Then the s…

ai agent, architecture problem, token waste, context persistence, mem0

11

📝 Deep Tech · arXiv NLP · 2026-05-22

Token-Level LLM Collaboration via FusionRoute

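Proposes FusionRoute, a new method for token-level LLM collaboration that pushes past the existing granularity of domain-model fusion, letting small models working together outperform large ones.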

arXiv:2601.05106v4 Announce Type: replace-cross Abstract: Large language models (LLMs) exhibit strengths across diverse domains. However, achieving st…

token-leve, LLM collaboration, fusionrout, model fusion, small-model collaboration

12

⚡ Productivity Tools · Dev.to · 2026-05-22

Stop guessing your AI API bill: a quick guide to token cost math

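Don't let your AI API bill scare you: this guide shows how to calculate token costs precisely, because cost optimization starts with understanding the billing rules.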

You can ship an LLM feature in an afternoon. Figuring out what it costs to run usually happens later, when the invoice shows up and someone asks why. …

token billing, API cost, system prompt, AI cost optimization

13

🤖 AI & LLMs · Hacker News LLM · 2026-05-22

The Special Token `<Think>` Problem/Bug of Latest DeepSeek LLM

A bug triggered by the special token `<Think>` has been found in DeepSeek's latest LLM, raising concerns about model stability.

Article URL: https://www.pixelstech.net/article/1779332017-the-special-token-%60%26lt-think%26gt-%60-problem-bug-of-latest-deepseek-llm Comments URL: …

deepseek, special token, think toke, llm bug, model issue

14

💰 Business & Tech · TechCrunch · 2026-05-21

Sam Altman makes ‘mic drop’ offer to every Y Combinator startup

Sam Altman makes a big on-stage offer: $2 million in OpenAI tokens for every YC startup, in exchange for a small equity stake.

Altman offered to have OpenAI invest in every single startup in this Y Combinator class: tokens for equity.

sam altman, y combinat, openai tokens, equity

15

🔓 Open Source · Dev.to · 2026-05-21

Turn ~800M Free AI Tokens Into a Single OpenAI API with FreeLLMAPI

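Aggregates the free tiers of 14 AI providers, about 800 million tokens in total, behind a self-hosted proxy that exposes a single OpenAI-compatible API, so you no longer have to juggle multiple SDKs.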

The Problem Nobody Talks About Every major AI lab now offers a free tier. Gemini, Groq, Mistral, Cerebras — they all give you a few million tokens a m…

freellmapi, AI free token aggr, OpenAI-compatible AP, self-hosted proxy, productivity tools

16

📝 Deep Tech · Hacker News LLM · 2026-05-20

Customizing an LLM for Enterprise Software Engineering

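A hands-on account of customizing an LLM inside Google: a trillion-token dataset plus a mid-training strategy, aimed squarely at enterprise software engineering.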

Article URL: https://arxiv.org/abs/2605.16517 Comments URL: https://news.ycombinator.com/item?id=48202484 Points: 1 # Comments: 0

gemini, LLM customization, software engineering, Google internal, trillion-token dataset

17

🤖 AI & LLMs · ITHome · 2026-05-20

Google processes more than 3,200 trillion tokens a month, a sevenfold year-over-year increase

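Google's AI processing volume surges: 3,200 trillion tokens per month, a sevenfold year-over-year increase, showing how quickly large AI models are scaling.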

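ITHome, May 20: At today's Google I/O 2026 developer conference, Google CEO Sundar Pichai opened by discussing Google's progress in AI. As of May 2026, Google processes more than 3,200 trillion tokens per month, a sevenfold increase year over year. ITHome learned at the conference that Google's Gemini App monthly…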

Google monthly processing, trillions, year-over-year growth, Google, AI large models

18

⚡ Productivity Tools · Hacker News Show · 2026-05-20

Show HN: PrismoDev – local CLI for finding token waste in Claude Code/Codex

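Scans code projects locally and non-invasively to expose 20%+ of hidden token waste in Claude/Codex, with no network connection needed and no data exposed.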

I built PrismoDev after noticing my Claude Code and Codex sessions were getting expensive in ways that were hard to explain. After digging through loc…

CLI tool, token optimization, code audit, local tool, developer tools

19

🤖 AI & LLMs · arXiv Computer Vision · 2026-05-20

LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs

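Proposes efficient vision encoders that tackle the explosion of visual tokens in long videos for Video LLMs, breaking the frame-scaling bottleneck.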

arXiv:2605.17260v1 Announce Type: new Abstract: The fundamental challenge in scaling Video Large Language Models (Video LLMs) to long-form video lies …

video LLMs, vision encoder, long-video understanding, token compression, frame scaling

20

🤖 AI Tools · Dev.to · 2026-05-20

Gemini 3.5 Flash Developer Guide

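Google's Gemini 3.5 Flash developer guide, covering the 1M long context and thinking capabilities to help you migrate quickly to the new model's features.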

Gemini 3.5 Flash is generally available (GA), stable, and ready for scaled production use. As our most intelligent Flash model, it delivers sustained…

AI model, developer API, long context, thinking capabilities, million tokens
