White-Box Sensitivity Auditing with Steering Vectors
Proposes a new method for white-box sensitivity auditing using steering vectors, improving the interpretability of AI models.
arXiv:2601.16398v2 Announce Type: replace-cross Abstract: Algorithmic audits are essential tools for examining systems for properties required by regu…