Niuge Picks · All


📝 Deep Tech · arXiv AI · 2026-05-20

Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies

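The first large-scale benchmark for AI agents on workspace tasks, focused on the challenge of evaluating agents under complex file dependencies.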

arXiv:2605.03596v4 Announce Type: replace Abstract: Workspace learning requires AI agents to identify, reason over, exploit, and update explicit and i…

AI agent · benchmarking · file dependencies · workspace · task evaluation

