Niu Ge's Picks (牛哥精选) · Past three months

1

🤖 AI & Large Models · arXiv NLP · 2026-07-14

UMoE: Unlocking Every Expert in Domain-Specific Training

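A new paper proposes UMoE, which aims to activate every expert during domain-specific training, working around the routing restrictions of conventional MoE to improve model adaptability and efficiency.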

arXiv:2607.11444v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models scale capacity without proportional compute cost and have become a key…

umoe · mixture-of · domain-specific training · expert activation · model optimization

2

🤖 AI & Large Models · ITHome · 2026-07-13

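OpenAI optimizes GPT-5.6 Sol and relaxes usage limits; Anthropic extends the Fable 5 promotion period

OpenAI tunes GPT-5.6 performance and raises subscription quotas by 10%, while Anthropic extends its Fable 5 promotion, as competition among the AI giants escalates again.

ITHome, July 13: After a limited preview of the GPT-5.6 model family last month, OpenAI last week officially opened all three models in the family to all users: GPT-5.6 Sol, GPT-5.6 Terra, and GPT-5.6 Luna. Since launch, OpenAI has also made a number of adjustments based on feedback from early users. O…

optimization and relaxed usage limits · extended promotion period · gpt-5.6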

3

🤖 AI & Large Models · arXiv AI · 2026-07-07

R$^2$PO: Decoupling Rollout and Inference Policies for LLM Reasoning

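Proposes R²PO, which decouples the rollout policy from the inference policy, with the stated aim of improving efficiency and accuracy on complex LLM reasoning tasks.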

arXiv:2601.11960v3 Announce Type: replace-cross Abstract: Existing reinforcement learning methods for LLM reasoning implicitly assume that the policy …

llm reasoning · rollout policy · inference policy · reinforcement learning · model optimization

4

🤖 AI & Large Models · arXiv AI · 2026-07-03

Optimizing RAG Rerankers with LLM Feedback via Reinforcement Learning

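Uses reinforcement learning with LLM feedback to optimize RAG rerankers, targeting the mismatch between static annotations and downstream generation.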

arXiv:2604.02091v2 Announce Type: replace-cross Abstract: Rerankers play a pivotal role in refining retrieval results for Retrieval-Augmented Generati…

rag · reranking · reinforcement learning · large language models · retrieval-augmented generation

5

🤖 AI & Large Models · TechCrunch · 2026-07-01

Google introduces a faster, cheaper image generator with Nano Banana 2 Lite

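Google releases Nano Banana 2 Lite, a faster and cheaper image generation model optimized for fast, high-throughput tasks; it is now available through AI Studio and the Gemini API.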

Google is updating its image generator to make it faster and cheaper, making it a more useful tool for creators looking to make AI content.

google · nano banan · image generation · model optimization · ai studio

6

🤖 AI & Large Models · arXiv AI · 2026-07-01

HippoSpark: An On-Demand Experience System for LLM Reasoning

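A new advance in LLM reasoning: HippoSpark, an on-demand experience system that aims to dynamically improve accuracy and efficiency on complex reasoning.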

arXiv:2606.29929v1 Announce Type: new Abstract: Distilling historical trajectories into reusable experience to enhance future problem-solving has beco…

llm reasoning · on-demand system · large model optimization · inference acceleration · distributed experience

7

🤖 AI & Large Models · ITHome · 2026-07-01

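Report: OpenAI halves AI model inference costs through low-level system optimizations

OpenAI has reportedly cut AI model inference costs by half through low-level system optimizations, a technical gain that directly benefits real-world applications.

ITHome, June 30: According to The Information, OpenAI engineers have said internally that the company has cut the inference (runtime) cost of its AI models by more than 50% through a series of new low-level system optimizations. As ITHome understands it, inference cost refers to the compute resources a model consumes when it actually runs and responds to user requests. The optimization is mainly attributed to…

report · via low-level system optimization · model inference cost halved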

8

🤖 AI & Large Models · arXiv AI · 2026-06-29

Enhancing Numerical Prediction in LLMs via Smooth MMD Alignment

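The paper proposes a smooth MMD alignment method intended to improve the accuracy of numerical prediction in large language models, offering a new approach to strengthening LLMs' quantitative abilities.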

arXiv:2606.27731v1 Announce Type: cross Abstract: Despite their strong general capabilities, large language models (LLMs) often remain unreliable when…

llm · numerical prediction · mmd alignment · smooth alignment · model optimization

9

📝 Deep Tech · arXiv Machine Learning · 2026-06-25

Don't Go Breaking My LLM: The Impact of Pruning Attention Layers on Explanation Faithfulness and Confidence Calibration

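Studies how pruning attention layers affects explanation faithfulness and confidence calibration in LLMs, offering a new perspective on model optimization.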

arXiv:2606.24970v1 Announce Type: new Abstract: Pruning Large Language Models (LLMs) reduces memory and inference costs by removing parts of the netwo…

llm · attention-layer pruning · explanation faithfulness · confidence calibration · model optimization

10

🤖 AI & Large Models · ITHome · 2026-06-25

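Report: Google delays Gemini 3.5 Pro release to July to optimize model performance

Google's Gemini 3.5 Pro is pushed back to a July release to focus on performance tuning, as it faces competition from Anthropic and OpenAI in coding.

ITHome, June 25: According to Business Insider, the release of Google's next-generation frontier AI model has been pushed back to July. People familiar with the matter said the company had previously planned to launch the new Gemini 3.5 Pro model in June and is now targeting a July launch, in order to leave more time to collect feedback from early testers and to put the model through…

report says google · release delayed · aims to optimize model performance · google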

11

📝 Deep Tech · arXiv Machine Learning · 2026-06-24

EnerInfer: Energy-Aware On-Device LLM Inference

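A new take on the energy cost of on-device LLM inference: EnerInfer proposes an energy-aware optimization framework that balances performance against power consumption, aimed at phones and other edge devices.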

arXiv:2606.23001v1 Announce Type: cross Abstract: On-device LLM inference is increasingly attractive for privacy-preserving, reliable, and cost-effect…

energy-aware · on-device inference · large model optimization · llm · energy optimization

12

🤖 AI & Large Models · Hacker News AI · 2026-06-20

Huawei chips refine DeepSeek model in major leap for China's AI self-reliance

Huawei chips are used to refine a DeepSeek model, a further step toward China's AI self-reliance.

Article URL: https://www.scmp.com/tech/article/3356117/huawei-chips-refine-deepseek-model-major-leap-chinas-ai-self-reliance Comments URL: https://new…

huawei · deepseek · ai chips · self-reliance · model optimization

13

🤖 AI & Large Models · arXiv Machine Learning · 2026-06-17

Learning to Refine Hidden States for Reliable LLM Reasoning

Refines the LLM reasoning process at the level of hidden states, with open-source code, to improve model reliability.

arXiv:2606.17524v1 Announce Type: new Abstract: Large language models show strong reasoning ability, but their internal reasoning process can remain u…

llm reasoning · hidden-state refinement · reliability · neural networks · model optimization

14

🤖 AI & Large Models · arXiv Computer Vision · 2026-06-16

Context-Aware RL for Agentic and Multimodal LLMs

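Uses context-aware reinforcement learning to help large models pinpoint key evidence in long contexts, improving reasoning and multimodal capabilities.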

arXiv:2606.17053v1 Announce Type: cross Abstract: Large language models (LLMs) often fail when answering requires identifying a small but decisive pie…

context-aware · reinforcement learning · multimodal llm · long-horizon reasoning · ai research · model optimization

15

🤖 AI & Large Models · arXiv AI · 2026-06-16

Less is More: Improving LLM Reasoning with Minimal Test-Time Intervention

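A new angle on LLM reasoning efficiency: minimal test-time intervention is reported to improve performance noticeably, so less is more.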

arXiv:2510.13940v4 Announce Type: replace-cross Abstract: Recent progress in large language models (LLMs) has focused on test-time scaling to improve …

llm reasoning · test-time intervention · uncertainty · reasoning efficiency · less is mo

16

📝 Deep Tech · arXiv AI · 2026-06-11

Resource-Aware LLM Reasoning for Mobile Edge General Intelligence

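Running large models on mobile edge devices? This paper proposes a resource-aware LLM reasoning scheme so that general intelligence can be deployed efficiently at the edge.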

arXiv:2509.23248v3 Announce Type: replace Abstract: The rapid advancement of large language models (LLMs) has enabled an emergence of agentic artifici…

llm reasoning · mobile edge computing · resource-aware · general intelligence · model optimization

17

📝 Deep Tech · arXiv AI · 2026-06-10

Instruction Finetuning DeepSeek-R1-8B Model Using LoRA and NEFTune

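Explores efficient instruction fine-tuning of DeepSeek-R1-8B with LoRA and NEFTune for financial named-entity recognition (NER), reducing resource consumption while improving performance.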

arXiv:2606.10392v1 Announce Type: new Abstract: Financial named-entity recognition (NER) is essential for translating unstructured financial reports a…

deepseek-r · lora · neftune · instruction fine-tuning · model optimization

18

🤖 AI & Large Models · arXiv Computer Vision · 2026-06-09

TVI-CoT: Text-Visual Interleaved Chain-of-Thought Reasoning for Multimodal Understanding

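Proposes TVI-CoT, which lets multimodal large models interleave textual and visual features during reasoning, addressing the loss of visual information in text-only CoT.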

arXiv:2606.08464v1 Announce Type: new Abstract: Chain-of-thought (CoT) reasoning has proven effective for enhancing problem-solving in large language …

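multimodal large models · chain-of-thought reasoning · visual feature fusion · limitations of text-only reasoning · logic enhancement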

19

📝 Deep Tech · arXiv AI · 2026-06-05

Consistency Training Along the Transformer Stack

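A consistency training mechanism for Transformers that applies constraints across the stacked layers to improve model performance and stability.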

arXiv:2606.05817v1 Announce Type: cross Abstract: Consistency training encourages models to behave similarly across different contexts, and has shown …

transforme · consistency training · deep neural networks · model optimization · emnlp2026

20

🤖 AI & Large Models · Hacker News LLM · 2026-06-05

Fast and Efficient LLM Inference with vLLM: A New Course with Deeplearning.ai

vLLM and Deeplearning.ai team up on a new course that uses visualizations to teach LLM inference optimization and hands-on quantization.

Article URL: https://vllm.ai/blog/2026-06-03-deeplearning-ai-vllm-course Comments URL: https://news.ycombinator.com/item?id=48400472 Points: 2 # Comme…

vllm · llm inference · quantization course · deeplearni
