LEAF: A Living Benchmark for Event-Augmented Forecasting
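For forecasting tasks, the LEAF living benchmark fills a gap in multi-dimensional event evaluation, bringing tests of LLM forecasting ability closer to real-world conditions.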
arXiv:2605.16358v1. Abstract: Large Language Models (LLMs) are increasingly applied to forecasting. To evaluate this capability whil…