Fair Outputs, Biased Internals: Causal Potency and Asymmetry of Latent Bias in LLMs for High-Stakes Decisions
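LLM outputs appear fair, yet bias persists in the models' internals. The study finds that the causal potency of this latent bias is asymmetric across groups, and it warns of the resulting risk in high-stakes decisions.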
arXiv:2605.15217v1 Announce Type: new Abstract: Instruction-tuned language models exhibit behavioural fairness in high-stakes decisions while retainin…