Niu Ge's Picks · This Month

1

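🤖 AI & Large Models · arXiv AI · 2026-06-11

MSUE: Multi-Modal Soccer Understanding Expert

A new paper introduces MSUE, a multi-modal soccer understanding expert that fuses visual and text data to build a specialized soccer AI analyst.

arXiv:2606.12106v1 Announce Type: cross Abstract: This paper presents our solution to the 2026 SoccerNet VQA Challenge. We first develop a cost-effect…

multimodal AI, soccer understanding, expert systems, computer vision, paper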

2

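📄 Docs & Manuals · IT之家 (ITHome) · 2026-06-11

Kioxia previews its lunar data center vision: SSDs will go to the Moon aboard HPE's Spaceborne Computer, built for both extreme environments and AI workloads

Kioxia is teaming up with HPE to send SSDs to the Moon, aiming at an AI data center for extreme environments and opening a new chapter in space storage.

IT之家, June 11: Kioxia announced that it will attend the HPE Discover 2026 conference in Las Vegas next week to showcase its latest SSD solutions, and revealed that the related technology will later be used in lunar exploration missions. Kioxia said it is only a matter of time before the first data center appears on the Moon. Kioxia has long been active in flash memory and began working with Hewlett Packard Enterprise (HPE) several years ago, for HPE's…

Kioxia, lunar data center vision, spaceborne computer, extreme environments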

3

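📝 Deep Tech · arXiv Computer Vision · 2026-06-10

Leveraging Metric Depth for Relative Depth Prediction

Uses the zero-shot ability of pretrained models to tackle monocular depth estimation in soccer scenes, where training samples are scarce; a novel and practical approach.

arXiv:2606.10628v1 Announce Type: new Abstract: We present our solution to the 2025 SoccerNet Monocular Depth Estimation Competition Challenge. Predic…

depth estimation, computer vision, soccer scenes, zero-shot learning, monocular depth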

4

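📝 Deep Tech · arXiv Computer Vision · 2026-06-09

HDRAgent: An Agentic Framework for Multi-Exposure HDR Imaging

HDRAgent uses a large-model agent to automate the multi-exposure HDR imaging pipeline, making complex image fusion more intelligent.

arXiv:2606.09110v1 Announce Type: new Abstract: Most existing multi-exposure HDR methods follow a fixed feed-forward reconstruction paradigm, making t…

HDR imaging, multi-exposure, agentic framework, computer vision, deep learning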

5

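📝 Deep Tech · arXiv Computer Vision · 2026-06-09

Leveraging NeRF-Rendered Images for 3D Gaussian Splatting

Combines NeRF's high-quality rendering with the fast rendering of 3DGS, exploiting their complementary strengths to improve novel view synthesis.

arXiv:2606.09034v1 Announce Type: new Abstract: Neural radiance field (NeRF) and 3D Gaussian splatting (3DGS) are two mainstream approaches for novel …

NeRF, 3D Gaussia…, novel view synthesis, rendering acceleration, machine learning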

6

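📝 Deep Tech · arXiv Computer Vision · 2026-06-09

Leveraging Morphology for Historical Script Metrological Analysis

Applies morphological analysis to the metrological features of historical manuscripts, offering digital humanities a new technical perspective.

arXiv:2606.09446v1 Announce Type: new Abstract: Advances in handwritten text recognition have enabled large-scale transcription of historical document…

historical scripts, morphology, metrological analysis, manuscript analysis, computational linguistics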

7

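🤖 AI & Large Models · QbitAI (量子位) · 2026-06-08

Matrices to analog, logic to digital! This Chinese team has redefined the computer

Jensen Huang needs ten thousand steps; this company's chip needs just one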

8

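🔧 Dev Tools · IT之家 (ITHome) · 2026-06-06

OpenCV 5 released: an all-new DNN engine and native support for large models

A major upgrade to the computer vision library: the all-new DNN engine natively supports large models, with across-the-board gains in performance and ease of use.

IT之家, June 6: The OpenCV team officially released OpenCV 5 this week. According to the announcement, for more than twenty years OpenCV has been the foundation of computer vision research, robotics, embedded vision, AI applications, industrial inspection, AR/VR, medical imaging, and countless production systems. Today the library has more than 86,000 stars on GitHub…

release, upgrade, new engine, native large-model support, computer vision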

9

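📝 Deep Tech · arXiv Computer Vision · 2026-06-05

Imagine Before You Predict: Interleaved Latent Visual Reasoning for Video Event Prediction

Imagine first, then predict: the paper proposes a new interleaved latent visual reasoning method that improves the accuracy and interpretability of video event prediction.

arXiv:2606.05769v1 Announce Type: new Abstract: Video event prediction (VEP) requires models to infer unobserved future states from partial video evid…

video event prediction, latent visual reasoning, interleaved reasoning, video understanding, AI methods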

10

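📝 Deep Tech · arXiv AI · 2026-06-05

Evaluating Agentic Configuration Repair for Computer Networks

The first systematic evaluation of how well AI agents can automatically repair computer network misconfigurations, revealing agent performance and the key factors that affect it.

arXiv:2606.06212v1 Announce Type: new Abstract: Misconfigurations in computer networks remain a major source of critical Internet outages. Research is…

network configuration repair, intelligent agents, large models, automated operations, network management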

11

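📝 Deep Tech · arXiv Computer Vision · 2026-06-03

Eliciting Complex Spatial Reasoning in MLLMs through Wide-Baseline Matching

New at CVPR 2026: uses wide-baseline matching to elicit the spatial reasoning potential of multimodal large models and push past the bottleneck in complex scene understanding.

arXiv:2606.03577v1 Announce Type: new Abstract: Wide-baseline matching (WBM) requires integrating geometric understanding, viewpoint changes, fine-gra…

multimodal large models, spatial reasoning, wide-baseline matching, CVPR 2026, computer vision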

12

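📝 Deep Tech · arXiv Machine Learning · 2026-06-02

Domain Adaptation with a Single Vision-Language Embedding

Achieves efficient domain adaptation using a single vision-language embedding; the method is simple and effective.

arXiv:2410.21361v2 Announce Type: replace-cross Abstract: Domain adaptation has been extensively investigated in computer vision but still requires ac…

domain adaptation, vision-language embedding, computer vision, transfer learning, IJCV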

13

📝 Deep Tech · arXiv Computer Vision · 2026-06-02

AGILE: Hand-Object Interaction Reconstruction from Video via Agentic Generation

At SIGGRAPH 2026, the AGILE framework uses agentic generation to accurately reconstruct 3D hand-object interactions from video.

arXiv:2602.04672v4 Announce Type: replace Abstract: Reconstructing dynamic hand-object interactions from monocular videos is critical for dexterous ma…

hand-objec…, video reco…, agentic ge…, siggraph 2…, computer vision

14

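📝 Deep Tech · arXiv Computer Vision · 2026-06-02

Chroma Clues: Leveraging Color Statistics to Detect Synthetic Images

Nowhere left for synthetic images to hide? The method uses color statistics to efficiently tell real images from fake ones; a novel and practical approach.

arXiv:2606.02224v1 Announce Type: new Abstract: The evolution and dissemination of AI-synthesized images is occurring at an unprecedented rate. Image …

synthetic image detection, color statistics, AI-generated content, image forensics, computer vision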

15

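📝 Deep Tech · arXiv Machine Learning · 2026-06-02

DenseMLLM: Standard Multimodal LLMs for Dense Prediction

How standard multimodal large models can move beyond coarse-grained limits to handle pixel-level dense prediction tasks.

arXiv:2602.14134v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) have demonstrated exceptional capabilities in high-…

multimodal large language models, dense prediction, computer vision, semantic segmentation, depth estimation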

16

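📝 Deep Tech · arXiv Machine Learning · 2026-06-02

Towards Optimal Robustness in Learning-Augmented Paging

Explores how learning-augmented paging algorithms can achieve optimal robustness; a theoretical advance worth watching.

arXiv:2606.01342v1 Announce Type: cross Abstract: Learning-augmented paging has been extensively studied in recent years. A key advantage over naive M…

learning-augmented paging, robustness, online algorithms, paging algorithms, machine learning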

17

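🚀 Product Watch · Hacker News LLM · 2026-06-02

Why Study CS? Thoughts on LLM-assisted software engineering

As the share of AI-written code soars, is studying computer science still worthwhile? Starting from this real-world debate, the author examines the essence of programming education.

Article URL: https://kmicinski.com/claude-code-and-why-study-cs Comments URL: https://news.ycombinator.com/item?id=48365109 Points: 2 # Comments: 0

LLM, AI-assisted programming, software engineering, CS education, career development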

18

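📝 Deep Tech · arXiv Computer Vision · 2026-06-02

Policy-based Foveated Imaging and Perception

A policy-based approach to foveated imaging and perception that could break through the efficiency bottleneck of conventional visual computing.

arXiv:2606.02565v1 Announce Type: new Abstract: Ultra-high-resolution image sensors offer the potential to capture fine spatial details critical for m…

foveated imaging, reinforcement learning, computer vision, perception, policy optimization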

19

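📝 Deep Tech · arXiv Computer Vision · 2026-06-02

ToolFG: Towards Well-Grounded Fine-Grained Image Classification

ToolFG, a new framework for fine-grained image classification, emphasizes the reliability and interpretability of classification results and advances the use of vision foundation models in fine-grained settings.

arXiv:2606.02518v1 Announce Type: new Abstract: Fine-grained image classification (FGIC) has broad applications and has attracted significant research…

fine-grained image classification, ToolFG, deep learning, computer vision, classification interpretability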

20

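📝 Deep Tech · arXiv Computer Vision · 2026-06-02

Training-Free Object-Agnostic Jam Detection in Fulfillment Centers

Detects jams of arbitrary objects in fulfillment centers without any training; an efficient, practical zero-shot vision solution.

arXiv:2606.00321v1 Announce Type: new Abstract: In fulfillment centers, diverse objects move continuously from inbound to outbound operations and can …

jam detection, automation, computer vision, zero-shot, industrial applications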
