牛哥精选 (Niu Ge's Picks) · Last three months

1

📝 Deep Tech · arXiv AI · 2026-07-14

Laguerre Geometry for Interpreting Large Language Models

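Laguerre geometry offers a precise mathematical framework for understanding how concepts are represented in large language models, defining a concept as a region rather than a single point or direction.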

arXiv:2607.10578v1 Announce Type: new Abstract: Existing hypotheses represent a concept in an LLM as a single point, a linear direction, or a Gaussian…

Laguerre geometry, large language models, concept geometry, interpretability, mathematical framework

2

🤖 AI & LLMs · arXiv AI · 2026-07-14

Exploring Agentic Workflows for Generating High Quality Math Visual Aids

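The paper explores AI agentic workflows for automatically generating K-12 math teaching diagrams, improving visualization quality and educational effectiveness.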

arXiv:2607.09839v1 Announce Type: new Abstract: Mathematical diagrams play a crucial role in K-12 education, both as problem components and as scaffol…

agentic workflows, math visualization, AI education, automatic generation, quality control

3

📝 Deep Tech · arXiv AI · 2026-07-13

ProofCouncil: An LLM Agent for Solving Open Mathematical Problems

LLM agents move into mathematical proof: ProofCouncil takes on open mathematical problems and shows striking reasoning ability on the FirstProof benchmark.

arXiv:2607.09474v1 Announce Type: new Abstract: Large language models (LLMs) have shown increasing promise in solving open problems in mathematics. Ho…

proofcounc, llm agent, open mathematical problems, mathematical proof, LLM reasoning

4

🤖 AI & LLMs · ITHome (IT 之家) · 2026-07-12

In just one hour, OpenAI GPT-5.6 Sol Ultra proves a 50-year-old mathematical conjecture

OpenAI's new model GPT-5.6 Sol Ultra cracks a 50-year-old math problem in one hour; its parallel multi-agent collaboration impresses academics.

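ITHome, July 12: OpenAI announced on July 10 that its GPT-5.6 Sol Ultra model generated a complete proof of the Cycle Double Cover Conjecture in under one hour. The conjecture is an important problem in graph theory that has remained open for more than 50 years. Op…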

proved in just one hour, decades-old mathematical conjecture, gpt-5.6

5

🤖 AI & LLMs · Hacker News Best · 2026-07-11

GPT-5.6 Sol Ultra produces proof of the Cycle Double Cover Conjecture [pdf]

The AI model GPT-5.6 Sol Ultra gives the first complete proof of the Cycle Double Cover conjecture in graph theory, a milestone for machine reasoning in mathematics.

https://x.com/__eknight__/status/2075643450196971805 , https://xcancel.com/__eknight__/status/2075643450196971805 Prompt: https://cdn.openai.com/pdf/0…

gpt-5.6, cycle doub, mathematical proof, artificial intelligence, graph theory

6

🤖 AI & LLMs · QbitAI (量子位) · 2026-07-11

GPT-5.6 solves a 50-year-old math conjecture in one hour; a 700-word prompt drives 64 sub-agents

GPT-5.6 cracks a 50-year-old math problem in one hour; a 700-word prompt coordinating 64 sub-agents points to a new breakthrough in AI reasoning.

A handbook for steering a legendary-tier large model

solved in one hour, decades-old math conjecture, steering sub-agents, gpt-5.6

7

📝 Deep Tech · arXiv NLP · 2026-07-10

IMProofBench: Benchmarking AI on Research-Level Mathematical Proof Generation

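Published jointly by 60 authors: the first benchmark for evaluating AI on research-level mathematical proof generation, pushing the limits of LLM reasoning.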

arXiv:2509.26076v2 Announce Type: replace Abstract: As the mathematical capabilities of large language models (LLMs) improve, it becomes increasingly …

improofben, mathematical proof, large language models, benchmark, AI mathematical ability

8

📝 Deep Tech · arXiv Machine Learning · 2026-07-10

PhasorFlow: A Python Library for Unit Circle Based Computing

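A new lightweight Python library for unit-circle computing: PhasorFlow provides efficient mathematical abstractions suited to research on signals and complex numbers.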

arXiv:2603.15886v3 Announce Type: replace Abstract: We present PhasorFlow, an open-source Python library for computing on the $S^1$ unit circle. Input…

phasorflow, Python library, unit circle, complex-number computation, mathematical software

9

📝 Deep Tech · Dev.to · 2026-07-09

Goodhart's Law Is a Math Problem - And We Can Prove It With 10 Lines of Python

Proves Goodhart's Law mathematically in 10 lines of Python, showing how metrics distort real behavior.

Your dashboard is lying to you - and we can prove it mathematically. The Pattern You've Seen Before You've lived this story. Every engineer has. Quart…

Goodhart's Law, python, mathematical proof, metric distortion, code example

10

🤖 AI & LLMs · Dev.to · 2026-07-08

CPU vs GPU: Why Large Language Models Need GPUs — What Really Happens After You Press Enter?

Vivid analogies explain why the GPU is the brain of large models: a journey from pressing Enter to trillions of operations.

The moment you press Enter, billions of mathematical operations begin. Let's follow that journey. Every day, millions of people ask ChatGPT, Gemini, C…

cpu, gpu, large language models, mathematical operations, parallel computing

11

🤖 AI & LLMs · arXiv AI · 2026-07-07

MechMath Agent Team: LLM Driven Agents for Mathematical Research

A team of LLM agents collaborates on hard mathematical problems: a new advance for multi-agent systems in scientific research.

arXiv:2607.04394v1 Announce Type: new Abstract: AI reasoning has become a central focus in contemporary artificial intelligence, largely driven by the…

llm, agents, mathematical research, team collaboration, arXiv paper

12

🤖 AI & LLMs · QbitAI (量子位) · 2026-07-07

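Racing to Overtake | WAIC 2026 theoretical breakthroughs: with two-way empowerment between AI and the mathematical and physical sciences as the key, opening a new journey of AI paradigm innovation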

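WAIC 2026 shows AI shifting from a parameter arms race to a mathematics- and physics-driven approach, with three main lines of innovation opening a new paradigm of fine-grained development.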

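racing to overtake, theoretical breakthrough, two-way empowerment with the mathematical and physical sciences as the key, opening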

13

🤖 AI & LLMs · ITHome (IT 之家) · 2026-07-06

For formal mathematical proof: Mistral AI releases Leanstral 1.5, a low-cost open-source model

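The cost of mathematical proof drops sharply: Mistral AI open-sources Leanstral 1.5, which solves comparable problems for only $4, far ahead of competing products.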

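ITHome, July 6: European AI company Mistral AI announced on the 2nd of this month (local time) the launch of Leanstral 1.5, a model for Lean 4, the programming language for formal mathematical proof. The model has 119B total parameters with 6B active, and is open-sourced under the Apache-2.0 license. Mistral AI says L…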

formal mathematical proof, release, low usage cost, open-source model

14

🤖 AI Tools · Hacker News AI · 2026-07-06

Show HN: Social and context-aware AI platform to do math

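An AI math platform that combines social features with context awareness, helping you solve problems collaboratively and understand complex formulas.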

Hi HN, This is ProofTree, and in TLDR: it is a platform where you can chat with an AI to do math, the way you already do, with context-awareness and k…

math problem solving, social collaboration, context awareness, AI assistance, reasoning verification

15

📝 Deep Tech · arXiv AI · 2026-07-02

A Category Theory Account of AI Identity

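Uses the abstract language of category theory to redefine how AI identity is determined, offering a mathematical framework for AI self-identity.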

arXiv:2607.00220v1 Announce Type: cross Abstract: Artificial intelligence (AI) systems are routinely modified after deployment through retraining and …

category theory, AI identity, mathematical foundations, artificial intelligence, andrea fer

16

📝 Deep Tech · arXiv NLP · 2026-06-30

Categorizing Mathematical Concepts with LLM Voting Ensembles in Mathswitch

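Automatically categorizes mathematical concepts using LLM voting ensembles; this paper, accepted at CICM, may break through the bottlenecks of traditional classification.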

arXiv:2606.28815v1 Announce Type: cross Abstract: Mathswitch is an open-source project that imports mathematical concept records from sources such as …

LLM voting ensembles, mathematical concept classification, mathswitch, cicm 2026, text classification

17

📝 Deep Tech · arXiv Machine Learning · 2026-06-30

How AI settled the complexity of the oldest SGD algorithm

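AI has finally settled the complexity question for the oldest stochastic gradient descent algorithm, a new advance for mathematical theory.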

arXiv:2606.29593v1 Announce Type: new Abstract: In 1937, Stefan Kaczmarz proposed a simple algorithm for solving systems of linear equations. This alg…

SGD algorithm, complexity, AI proof, optimization theory, machine learning, mathematics

18

📝 Deep Tech · arXiv AI · 2026-06-25

Cliff Tokens: Identifying Single-Token Failure Triggers in LLM Mathematical Reasoning

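The paper identifies "cliff tokens" in LLM reasoning: a single token can trigger failure in mathematical reasoning, revealing a root cause of model fragility.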

arXiv:2606.25524v1 Announce Type: new Abstract: Large language models (LLMs) reach high accuracy in mathematical reasoning, but individual traces on t…

llm, mathematical reasoning, single-token failure trigger, reasoning errors

19

📝 Deep Tech · arXiv Machine Learning · 2026-06-25

Structured Approximations of Measures

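Explores new structured methods for approximating measures, providing theoretical tools for machine learning and probability theory.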

arXiv:2310.09149v3 Announce Type: replace-cross Abstract: We study the approximation of probability measures in the Wasserstein-$p$ distance by struct…

measure approximation, structured approximation, mathematical theory, machine learning theory

20

🤖 AI & LLMs · arXiv AI · 2026-06-24

Benchmarking LLMs' Mathematical Reasoning with Unseen Random Variables Questions

A new paper tests LLMs' mathematical reasoning with unseen random-variable questions, revealing the models' true reasoning ability.

arXiv:2501.11790v5 Announce Type: replace-cross Abstract: Recent studies have raised significant concerns regarding the reliability of current mathema…

mathematical reasoning, large models, benchmark, random variables
