A Two-Phase Stability Study of LLM Judges and Bar Council Examiners on Thai Bar-Exam Free-Form Essays
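Challenging the assumption that LLM judges are stable: in the Thai bar exam, expert agreement is not a single ceiling, and LLM scoring is variable.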
arXiv:2605.25652v1. Abstract: Free-form legal essay evaluation in NLP treats expert inter-rater stability as a single ceiling number…