OpenSkillEval: Automatically Auditing the Open Skill Ecosystem for LLM Agents
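A new paper proposes an automated auditing framework that evaluates the open skill ecosystem for LLM agents, supporting safety and capability verification.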
arXiv:2605.23657v1 Announce Type: new Abstract: Skills, i.e., structured workflow instructions distilled for large language models (LLMs), are becomin…