Scales++: Compute Efficient Evaluation Subset Selection with Cognitive Scales Embeddings
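Uses cognitive embeddings to efficiently select evaluation subsets, greatly reducing the cost of evaluating large models while preserving predictive accuracy.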
arXiv:2510.26384v2 Announce Type: replace-cross Abstract: The prohibitive cost of evaluating large language models (LLMs) on comprehensive benchmarks …