LLM INQUISITOR: Evaluating how AI models handle long, realistic tasks
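The open-source project LLM INQUISITOR provides a framework for evaluating LLM behavior in real-world scenarios, focusing on long tasks and practical usefulness rather than benchmark scores.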
Article URL: https://github.com/AssimilatedHuman/LLM-Inquisitor
Comments URL: https://news.ycombinator.com/item?id=48207330
Points: 1
# Comments: 0