Can Global XAI Methods Reveal Injected Behaviours in LLMs? SHAP vs Rule Extraction vs RuleSHAP
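New KDD 2026 research compares three global explainability methods (SHAP, rule extraction, and RuleSHAP) for detecting misinformation behaviours injected into large language models.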
arXiv:2505.11189v3. Abstract: Large language models (LLMs) can amplify misinformation, undermining societal goals such as …