Niu Ge's Picks · All

1

📝 Deep Tech · arXiv AI · 2026-07-08

Harnessing Code Agents for Automatic Software Verification

Code agents carry out software verification automatically, proving more than 4,000 lemmas in Iris separation logic with notable efficiency.

arXiv:2607.06341v1 Announce Type: cross Abstract: Formal verification offers the strongest guarantee of software correctness, but it does not scale: t…

code agents, software verification, automated theorem proving, iris, separation logic

2

📝 Deep Tech · Hacker News Show · 2026-06-26

Show HN: Formally Verified FPSan

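Formal verification of Triton's FPSan using the Lean theorem prover, to establish the mathematical correctness of its floating-point operations.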

Article URL: https://github.com/bollu/fpsan-verification Comments URL: https://news.ycombinator.com/item?id=48677601 Points: 2 # Comments: 0

formal verification, fpsan, lean, triton, theorem proving

3

📝 Deep Tech · arXiv AI · 2026-06-17

IsabeLLM: Automated Theorem Proving Applied to Formally Verifying Consensus

IsabeLLM combines large language models with automated theorem proving, offering a new approach to formally verifying consensus algorithms.

arXiv:2606.18098v1 Announce Type: new Abstract: Advances in Artificial Intelligence (AI) have led AI for Theorem Proving to become a promising means o…

automated theorem proving, formal verification, consensus algorithms, llm, isabellm

4

🤖 AI & Large Models · arXiv AI · 2026-06-16

SorryDB: Can AI Provers Complete Real-World Lean Theorems?

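A dynamic benchmark drawn from 78 real-world GitHub projects, testing whether AI provers can solve the mathematical theorems that arise in practice.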

arXiv:2603.02668v2 Announce Type: replace Abstract: We present SorryDB, a dynamically-updating benchmark of open Lean tasks drawn from 78 real world f…

ai large models, theorem proving, lean, formal verification, benchmarking

5

📝 Deep Tech · arXiv AI · 2026-06-09

TheoremBench: Evaluating LLMs on Theorem Proving in Formal Mathematics

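A new benchmark for measuring how well LLMs prove mathematical theorems, filling a gap in the evaluation of automated formal proving.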

arXiv:2606.09450v1 Announce Type: new Abstract: LLMs have recently achieved strong results on formal proving benchmarks. However, existing evaluations…

llm, theorem proving, formal mathematics, benchmark evaluation, ai reasoning

6

📝 Deep Tech · arXiv AI · 2026-06-02

Formally Solving Answer-Construction Problems in Lean

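Addresses a new challenge in Lean formal proving: a formal method for solving "answer-construction" problems from mathematics competitions, building on recent LLM progress to extend the reach of automated reasoning.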

arXiv:2505.18492v5 Announce Type: replace Abstract: Mathematical competition problems fall into two broad types: theorem proving, which asks for a pro…

lean, formal proof, mathematics competitions, answer construction, theorem proving

7

📝 Deep Tech · arXiv Machine Learning · 2026-06-02

A Theoretical Framework for Self-Play Theorem Proving Algorithms

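Gives self-play algorithms for theorem proving a rigorous theoretical foundation, offering a new perspective on AI mathematical reasoning.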

arXiv:2606.01861v1 Announce Type: new Abstract: Self-play, a type of training algorithm that enables a model to self-improve, has recently shown promi…

self-play, theorem proving, theoretical framework, reinforcement learning, formal mathematics

8

📝 Deep Tech · arXiv AI · 2026-06-01

Distilling LLM Feedback for Lean Theorem Proving

Distills feedback from large language models into stronger Lean theorem-proving ability, pushing the boundary of automated reasoning.

arXiv:2605.30861v1 Announce Type: new Abstract: Post-training for reasoning models typically combines supervised fine-tuning with reinforcement learni…

llm, theorem proving, lean, distillation, formal verification

9

📝 Deep Tech · arXiv AI · 2026-05-23

What are the Right Symmetries for Formal Theorem Proving?

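Examines which symmetries are the right ones to exploit in formal theorem proving, offering a new perspective on automated reasoning.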

arXiv:2605.22257v1 Announce Type: cross Abstract: Formal theorem provers based on large language models (LLMs) are highly sensitive to superficial var…

formal theorem proving, symmetry, automated reasoning, machine learning

10

📝 Deep Tech · arXiv NLP · 2026-05-22

ImProver: Agent-Based Automated Proof Optimization

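An agent-based framework for automated proof optimization that improves the concision and readability of mathematical proofs; published at ICLR 2025.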

arXiv:2410.04753v2 Announce Type: replace-cross Abstract: Large language models (LLMs) have been used to generate formal proofs of mathematical theore…

ai agents, automated proving, proof optimization, theorem proving, iclr2025

11

🤖 AI & Large Models · Meituan Tech Team · 2026-05-20

LongCat-Flash-Prover: AI Tackles Mathematical Theorem Proving, Where the Result Must Be Not Only Correct but Rigorously Proven

Meituan open-sources LongCat-Flash-Prover, a new state of the art in AI mathematical theorem proving, with a 97.1% pass rate on MiniF2F.

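In ordinary math problem solving, a model only needs to get the final number right. Theorem proving is different: it demands an extremely strict chain of logic, and a single ambiguous natural-language sentence can bring down the entire proof. Moving AI from "guessing the answer" to "rigorous proof" has therefore become a challenging problem in complex reasoning. To address it, we have open-sourced a model built specifically for mathematical formalization and theorem proving: …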

mathematical theorem proving, formal reasoning, open-source model, meituan, sota

12

📝 Deep Tech · arXiv AI · 2026-05-20

LeanSearch v2: Global Premise Retrieval for Lean 4 Theorem Proving

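LeanSearch v2 introduces global premise retrieval, which finds all the lemmas a Lean 4 theorem needs in a single pass, going beyond the limits of existing single-step or semantic-matching approaches.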

arXiv:2605.13137v2 Announce Type: replace-cross Abstract: Proving theorems in Lean 4 often requires identifying a scattered set of library lemmas whos…

lean 4, theorem proving, premise retrieval, lemma selection, global search

13

📝 Deep Tech · arXiv Machine Learning · 2026-05-20

Lean Meets Theoretical Computer Science: Scalable Synthesis of Theorem Proving Challenges in Formal-Informal Pairs

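Combines Lean with theoretical computer science to synthesize theorem-proving challenges as formal-informal pairs at scale, supporting research on AI mathematical reasoning.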

arXiv:2508.15878v2 Announce Type: replace-cross Abstract: Formal theorem proving (FTP) has emerged as a critical foundation for evaluating the reasoni…

theorem proving, lean, formal verification, theoretical computer science, ai mathematical reasoning

14

📝 Deep Tech · arXiv AI · 2026-05-19

A Minimal Agent for Automated Theorem Proving

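Proposes a minimal agent baseline for systematically comparing AI theorem prover architectures; its core features are iterative proof refinement, library search, and context management.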

arXiv:2602.24273v3 Announce Type: replace Abstract: We propose a minimal agentic baseline that enables systematic comparison across different AI-based…

automated theorem proving, ai agent, iterative proof refinement, library search, context management
