Autonomous LLM Agents & CTFs: A Second Look
A second look at how autonomous LLM agents perform on CTF challenges, with updated findings and capability limits.
arXiv:2605.21497v1 Announce Type: cross Abstract: Large Language Model (LLM) agents are increasingly proposed to automate offensive security tasks, wi…
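A framework that uses language agents to let large models fine-tune autonomously, removing manual intervention and automating the fine-tuning process.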
arXiv:2603.01712v2 Announce Type: replace-cross Abstract: Fine-tuning large language models for vertical domains remains labor-intensive, requiring pr…
From the journal Nature: cutting-edge research papers and interdisciplinary developments, including AI-driven psychedelics research.
Nature, Published online: 21 May 2026; doi:10.1038/d41586-026-01467-y Félix Schoeller’s team built a realistic artificial-intelligence chatbot to trai…
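This arXiv paper explores anonymization methods for language models and offers new ideas for privacy protection; a useful reference for AI security researchers.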
arXiv:2501.02407v3 Announce Type: replace-cross Abstract: Rapid advances in Natural Language Processing (NLP) have revolutionized many fields, includi…
Can handwritten math be graded automatically? Vision large models take AI in education a step further, in an empirical study from AIED 2026.
arXiv:2605.19043v1 Announce Type: cross Abstract: Automated grading systems have enabled scalable assessment for many response types, but handwritten …
A new advance for multimodal large models: learning to judge when to speak, making conversational interaction more natural.
arXiv:2505.14654v2 Announce Type: replace-cross Abstract: Chatbots via large language models (LLMs) generate fluent responses but often struggle with …
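Free access to a large collection of cutting-edge academic papers, with multidisciplinary preprint search and PDF download; a first stop for researchers looking for papers.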
arXiv:2605.20016v1 Announce Type: cross Abstract: Short-form video poses new challenges to the quality assessment of user-generated content (UGC) due …
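Critiques current pluralistic AI alignment for relying only on preference aggregation, and argues that disagreement must be actively surfaced to achieve genuine value pluralism.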
arXiv:2605.14912v1 Announce Type: new Abstract: Pluralistic alignment is typically operationalised as preference aggregation: producing responses that…
This paper systematically audits security risks in agent frameworks, providing key methodology for building trustworthy AI systems.
arXiv:2605.14271v2 Announce Type: replace Abstract: LLM agents increasingly run inside execution harnesses that dispatch tools, allocate resources, an…
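Proposes the CyberCorrect framework, which brings cybernetic closed-loop feedback into LLM self-correction to address existing methods' lack of systematic analysis and convergence guarantees.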
arXiv:2605.17305v1 Announce Type: cross Abstract: Large language model (LLM) self-correction -- the ability to detect and fix errors in generated outp…
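An in-depth look at the opportunities, limitations, and practical considerations of LLMs in qualitative research; a valuable reference for academic researchers.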
arXiv:2605.16538v1 Announce Type: cross Abstract: This paper examines the opportunities, limitations, and practical considerations associated with the…
Uses active learning algorithms to accelerate photonic crystal design; new research in Optics Express verifies the efficiency gains.
arXiv:2601.16287v3 Announce Type: replace-cross Abstract: Active learning for photonic crystals explores the integration of analytic approximate Bayes…
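A new self-supervised learning method optimizes sparse matrix reordering and substantially improves computational efficiency; accepted at DASFAA 2026.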
arXiv:2605.17403v1 Announce Type: new Abstract: Rearranging the rows or columns of a sparse matrix using an appropriate ordering can significantly red…
Prompt injection is the most critical vulnerability in AI agents; research suggests existing defenses may never fully prevent it.
arXiv:2605.17634v1 Announce Type: cross Abstract: Prompt injection is the most critical vulnerability in deployed AI agents. Despite recent progress, …
A WWW 2026 conference paper on off-policy learning in resource-constrained settings, with equal weight on theory and experiments.
arXiv:2603.18702v4 Announce Type: replace Abstract: We study off-policy learning (OPL) in contextual bandits, which plays a key role in a wide range o…
Proposes a memory-augmented rubric refinement system that improves rubric-based reinforcement learning.
arXiv:2605.18592v1 Announce Type: new Abstract: Rubric-based reward shaping is an effective method for fine-tuning LLMs via RL, where structured rubri…
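Steps outside the traditional AI safety framework to explore how AI can actively promote human flourishing, opening a new paradigm for alignment research.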
arXiv:2605.10310v2 Announce Type: replace Abstract: Existing alignment research is dominated by concerns about safety and preventing harm: safeguards,…
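This recent arXiv paper on retrieval-augmented linguistic calibration gives a systematic view of the framework and principles for calibrating confidence expressed through linguistic cues.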
arXiv:2605.19344v1 Announce Type: new Abstract: Linguistic cues such as "I believe" and "probably" offer an intuitive interface for communicating conf…
SLO-aware rotation scheduling and memory management for LLM inference; accepted at MLSys 2026 and worth watching.
Article URL: https://supercomputing-system-ai-lab.github.io/projects/superinfer/ Comments URL: https://news.ycombinator.com/item?id=48188146 Points: 3…
Proposes an LLM-based argument mining system that uses large models to automatically identify argumentative structure in text.
arXiv:2605.13793v2 Announce Type: replace Abstract: Arguments are a fundamental aspect of human reasoning, in which claims are supported, challenged, …