Alignment Drift in Long-Term Human-LLM Interaction: A Mechanism-Oriented Framework
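Examines the mechanisms behind LLM alignment drift in long-term human-AI interaction and proposes a new framework to guard against loss of control over AI.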
arXiv:2605.16516v1 Announce Type: cross Abstract: Long-term interaction with LLM-based systems may produce alignment drift: a gradual process in which…