Beyond Binary: Reframing GUI Critique as Continuous Semantic Alignment
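Reframes GUI critique from binary judgment to continuous semantic alignment, improving the ranking ability of agents under test-time scaling.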
arXiv:2605.14311v1 Announce Type: cross Abstract: Test-Time Scaling (TTS), which samples multiple candidate actions and ranks them via a Critic Model,…