PoisonForge: Task-Level Targeted Poisoning Benchmark for Instruction-Tuned LLMs
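Instruction-tuned LLMs face a task-level targeted poisoning threat; PoisonForge, the first systematic benchmark for it, has been released to support model safety evaluation.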
arXiv:2605.23168v1
Announce Type: cross
Abstract: When practitioners fine-tune LLMs on unvetted datasets, an adversary can exploit the data supply cha…