One-Way Policy Optimization for Self-Evolving LLMs
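Proposes a one-way policy optimization method that lets large models self-evolve without external feedback, improving their reasoning and alignment capabilities.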
arXiv:2605.22156v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has become a promising paradigm for scaling re…
Show HN: Autodidact – Self-evolving local-first AI agent
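A local-first AI agent that handles its Ollama dependency automatically, covering installation, model setup, and launch in one pass; suited to developers who would rather not wrestle with environment setup.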
pip install autodidact && autodidact init Comments URL: https://news.ycombinator.com/item?id=48194739 Points: 4 # Comments: 0
D$^2$Evo: Dual Difficulty-Aware Self-Evolution for Data-Efficient Reinforcement Learning
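Proposes a dual difficulty-aware self-evolution method that addresses the challenges of scarce training data and dynamic difficulty shift in reinforcement learning.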
arXiv:2605.17037v1 Announce Type: new Abstract: Reinforcement learning (RL) has demonstrated potential for enhancing reasoning in large language model…
TopoEvo: A Topology-Aware Self-Evolving Multi-Agent Framework for Root Cause Analysis in Microservices
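A topology-aware, self-evolving multi-agent framework for root cause analysis in microservices, targeting three challenges: noise, cascading failures, and topology drift.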
arXiv:2605.15611v1 Announce Type: new Abstract: Root cause analysis (RCA) in microservices is challenging due to (i) noisy and heterogeneous multimoda…
AgenticEval: Toward Agentic and Self-Evolving Safety Evaluation of Large Language Models
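Proposes a dynamic, self-evolving safety evaluation framework that addresses the inability of static LLM benchmarks to keep pace with evolving AI risks.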
arXiv:2509.26100v2 Announce Type: replace Abstract: The rapid integration of Large Language Models (LLMs) into high-stakes domains necessitates reliab…