Unsteady Metrics and Benchmarking Cultures of AI Model Builders
How did AI model evaluation slide from academic review toward corporate promotion, and who gets to define "state of the art"?
arXiv:2605.14164v1. Abstract: The primary way to establish and compare competencies in foundation and generative AI models has shift…