Niu Ge's Picks · All

1

📝 Deep Tech arXiv AI 2026-07-14

Trojan Horse Prompting: Jailbreaking Conversational Multimodal Models by Forging Assistant Message

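A new security vulnerability in conversational multimodal models: jailbreaking by forging assistant messages, revealing an unexplored attack surface.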

arXiv:2507.04673v2 Announce Type: replace Abstract: The rise of conversational interfaces has greatly enhanced LLM usability by leveraging dialogue hi…

LLM security · jailbreak attacks · conversational models · multimodal · prompt injection

2

🤖 AI·LLM Dev.to 2026-07-09

GitLost Is a Preview of Every Agentic Workflow Breach You'll See This Year

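A public issue and a hidden instruction are enough to make an AI agent leak private repo data, a warning about new security risks in agentic workflows.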

Hook A public GitHub issue, a hidden instruction, and one word changed in a prompt was enough to get an AI agent to leak private repo data. No stolen …

gitlost · AI agents · security vulnerability · prompt injection · data leak

3

🤖 AI·LLM Hacker News AI 2026-07-08

GitHub AI agent leaks private repos when asked nicely

GitHub's AI agent leaked private repositories merely because it was "asked nicely", a startling security flaw.

Article URL: https://www.theregister.com/security/2026/07/07/github-ai-agent-leaks-private-repos-when-asked-nicely/5267924 Comments URL: https://news.…

github ai agent · security vulnerability · private repo leak · prompt injection

4

📝 Deep Tech Hacker News AI 2026-07-08

GitLost: We Tricked GitHub's AI Agent into Leaking Private Repos

GitHub's AI agent was tricked into leaking private repositories; the prompt injection attack prompts a rethink of identity management.

Article URL: https://noma.security/blog/gitlost-how-we-tricked-githubs-ai-agent-into-leaking-private-repos/ Comments URL: https://news.ycombinator.com…

github ai agent · security vulnerability · private repo leak · prompt injection

5

📝 Deep Tech arXiv AI 2026-07-07

DualView: Preventing Indirect Prompt Injection in Personal AI Agents

Proposes the DualView defense mechanism to block indirect prompt injection attacks against personal AI agents; a pioneering security approach.

arXiv:2607.03821v1 Announce Type: cross Abstract: Personal AI agents that run on the user's local machine, such as OpenClaw, automate daily tasks incl…

dualview · indirect prompt injection · AI security · personal AI agents · defense mechanism

6

🤖 AI·LLM Hacker News Show 2026-06-25

Show HN: Lelu – gate OpenAI agent actions on confidence and prompt injection

Adds a safety gate to OpenAI agent actions based on confidence and prompt injection detection; deploys locally in 60 seconds.

Article URL: https://github.com/Lelu-ai/lelu Comments URL: https://news.ycombinator.com/item?id=48664025 Points: 4 # Comments: 0

lelu · AI security · prompt injection · agent authentication · openai

7

🤖 AI·LLM Hacker News AI 2026-06-24

Cisco AI Defense Skill Scanner

An AI skill security scanner open-sourced by Cisco that checks for prompt injection and data leak risks in 5 minutes.

Article URL: https://github.com/cisco-ai-defense/skill-scanner Comments URL: https://news.ycombinator.com/item?id=48656076 Points: 2 # Comments: 0

cisco ai defense skill scan · security scanning · prompt injection

8

🤖 AI Tools Hacker News AI 2026-06-21

Show HN: Cloak – let AI agents use your API keys without ever seeing them

Cloak lets you hand API keys to AI agents safely: the model and the logs never see the keys, which prevents leaks via prompt injection.

Article URL: https://github.com/cloakward/cloak Comments URL: https://news.ycombinator.com/item?id=48618904 Points: 2 # Comments: 0

API key management · AI security · prompt injection protection · privacy protection · developer tools

9

🤖 AI Tools arXiv NLP 2026-06-19

A Layered Security Framework Against Prompt Injection in RAG-Based Chatbots

A layered security framework for defending RAG chatbots against injection attacks; open-access academic research.

arXiv:2606.19660v1 Announce Type: cross Abstract: Prompt injection is ranked as the most critical vulnerability in large language model (LLM) deployme…

prompt injection · rag · security framework · chatbots · layered defense

10

📝 Deep Tech arXiv AI 2026-06-15

From Shield to Target: Denial-of-Service Attacks on LLM-Based Agent Guardrails

The LLM shield becomes the target: reveals a new denial-of-service vulnerability in LLM-based agent safety guardrails.

arXiv:2606.14517v1 Announce Type: cross Abstract: LLM-based guardrails have emerged as a highly effective defense against prompt injection and jailbre…

LLM security · denial-of-service attacks · guardrail vulnerability · prompt injection protection · agent security

11

🤖 AI·LLM Hacker News LLM 2026-06-12

Chaining LLM and web bugs to Admin

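From prompt injection to admin privileges: shows how insecure LLM output handling combines with web vulnerabilities into a complete attack chain.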

Article URL: https://blog.quarkslab.com/from-prompt-to-pwned-chaining-llm-and-web-bugs-to-admin.html Comments URL: https://news.ycombinator.com/item?i…

LLM security · prompt injection · web vulnerabilities · chained attacks · privilege escalation

12

🔓 Open Source Hacker News Show 2026-06-12

Show HN: AVP – an agent can't leak a secret it never had

A local proxy that deploys in 10 seconds and keeps AI agents from leaking your credentials; the agent itself never holds the secret.

A process can't leak a secret it never had. Shai-hulud, prompt-injection - you name it. They cannot steal what your agent (or a process) doesn't have. …

security proxy · credential protection · prompt injection protection · local proxy · open-source tool

13

📝 Deep Tech arXiv AI 2026-06-12

PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections

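The paper proposes PI-Hunter, which automatically discovers and precisely localizes prompt injection vulnerabilities in LLMs, offering a new approach to AI security red-teaming.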

arXiv:2606.12737v1 Announce Type: cross Abstract: Large Language Models (LLMs) are rapidly evolving into agentic systems that interact with external t…

prompt injection · red-teaming · AI security · automated vulnerability detection · LLM security

14

🤖 AI·LLM arXiv AI 2026-06-11

Learning to Inject: Automated Prompt Injection via Reinforcement Learning

Automates prompt injection attacks with reinforcement learning, outperforming manual red-teaming.

arXiv:2602.05746v2 Announce Type: replace-cross Abstract: Prompt injection is a critical vulnerability in LLM agents, yet the strongest methods still …

prompt injection · reinforcement learning · LLM security · automated attacks · red-teaming

15

🤖 AI·LLM Hacker News AI 2026-06-10

A €0.01 bank transfer could compromise a banking AI agent

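From a €0.01 transfer to a tailored phishing attack: exposes an indirect prompt injection vulnerability in a banking AI assistant; a security testing case worth noting.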

Article URL: https://blue41.com/blog/how-we-helped-bunq-secure-their-financial-ai-assistant/ Comments URL: https://news.ycombinator.com/item?id=484761…

banking AI · prompt injection · phishing attacks · security testing · fintech

16

📝 Deep Tech arXiv AI 2026-06-10

GitInject: Real-World Prompt Injection Attacks in AI-Powered CI/CD Pipelines

Real-world cases and risk analysis of prompt injection attacks in AI-powered CI/CD pipelines.

arXiv:2606.09935v1 Announce Type: cross Abstract: AI-powered agents are increasingly embedded in continuous integration and continuous delivery/deploy…

prompt injection · AI security · ci/cd · code review · security attacks

17

📝 Deep Tech arXiv AI 2026-06-10

Assessing Automated Prompt Injection Attacks in Agentic Environments

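Automated prompt injection attacks against LLM agents remain understudied in realistic settings; this paper provides a systematic empirical evaluation.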

arXiv:2606.10525v1 Announce Type: cross Abstract: Indirect prompt injection poses a critical threat to LLM agents that interact with untrusted externa…

prompt injection · LLM security · automated attacks · agentic environments · empirical evaluation

18

📝 Deep Tech arXiv Machine Learning 2026-06-09

The Injection Paradox: Brand-Level Suppression in Safety-Trained LLM Recommendations via RAG Context Injection

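A paper revealing the "Injection Paradox" in safety-trained RAG LLMs: injected context ends up suppressing brand recommendations, raising new safety questions.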

arXiv:2606.09204v1 Announce Type: new Abstract: We present a reproducible failure mode of safety training in RAG-based LLM recommendation -- the Injec…

LLM security · rag · prompt injection · brand suppression · adversarial examples

19

🤖 AI·LLM arXiv AI 2026-06-09

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

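Proposes an automated framework that systematically evaluates and hardens LLM system instructions against encoding attacks, providing a new tool for AI security.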

arXiv:2604.01039v2 Announce Type: replace-cross Abstract: System Instructions in Large Language Models (LLMs) are commonly used to enforce safety poli…

LLM security · encoding attacks · system instruction hardening · automated evaluation · security framework

20