ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark
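The new benchmark ASyMOB uses 35,000 symbolic math problems to distinguish genuine reasoning from pattern memorization in LLMs, targeting a key pain point in evaluating the mathematical ability of large models.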
arXiv:2505.23851v2 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly applied to symbolic mathematics, yet existing …