A European Multi-Center Breast Cancer MRI Dataset
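The first European multi-center breast cancer MRI dataset is released, offering a high-quality training and evaluation benchmark for medical imaging AI.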
arXiv:2506.00474v3 Announce Type: replace-cross Abstract: Early detection of breast cancer is critical for improving patient outcomes. While mammograp…
Want Claude to help manage your finances? A step-by-step guide to connecting Plaid bank data so the AI can see your real spending records.
Claude remembers your projects, your writing style, the context you've shared across conversations. It does not know what you spent last quarter. So w…
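The first longitudinal focused attention meditation EEG dataset and benchmark, supporting research into the neural mechanisms of meditation.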
arXiv:2605.22893v1 Announce Type: cross Abstract: We introduce a novel Longitudinal Focused Attention Meditation Electroencephalography (L-FAME) datas…
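The first Arabic dataset focused on the balance between social cohesion and conflict, as distinct from traditional toxicity detection.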
arXiv:2605.22447v1 Announce Type: new Abstract: The study of online discourse has become central to understanding societal polarization. While much re…
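The first large-scale visual question answering benchmark for 3D brain MRI, covering five major clinical domains and advancing medical imaging AI understanding.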
arXiv:2605.20525v1 Announce Type: cross Abstract: We present NeuroQA, a large-scale benchmark for visual question answering in 3D brain magnetic reson…
Accepted at ICML 2026: LLM benchmark datasets need to resist contamination so that training-data leakage does not distort evaluation.
arXiv:2605.19999v1 Announce Type: new Abstract: Benchmark datasets are critical for reproducible, reliable, and discriminative evaluation of LLMs. How…
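Google's in-house custom LLM in practice: a trillion-token dataset plus a mid-training strategy, targeting enterprise software engineering scenarios.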
Article URL: https://arxiv.org/abs/2605.16517 Comments URL: https://news.ycombinator.com/item?id=48202484 Points: 1 # Comments: 0
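The first large-scale dataset of real chatbot conversations: 142K conversations reveal how differences in platform design affect user behavior.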
arXiv:2512.17843v4 Announce Type: replace Abstract: By evaluating Large Language Models (LLMs) through uniform, text-only interfaces, current academic…
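The first large-scale dataset of real human-AI conversations that captures the thoughts behind what users say, revealing the thinking process in LLM interactions.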
arXiv:2605.20087v1 Announce Type: new Abstract: Conversational AI has now reached billions of users, yet existing datasets capture only what people sa…
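Proposes a heterogeneity-aware dataset scheduling method, a new approach to improving the training efficiency and effectiveness of audio large language models.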
arXiv:2605.19101v1 Announce Type: cross Abstract: Training general-purpose Audio Large Language Models (ALLMs) across diverse datasets is essential fo…
CVPR 2026 Workshop paper introducing the first large-scale dataset and benchmark for stylized scene-text inpainting, filling a gap in the field.
arXiv:2605.17309v1 Announce Type: new Abstract: We present StyleText, a large-scale dataset and benchmark for localized scene-text inpainting with sty…
The first VQA dataset built specifically for brain tumor MRI interpretation, providing a new benchmark for medical imaging AI research.
arXiv:2605.17140v1 Announce Type: cross Abstract: Brain tumor diagnosis is largely dependent on Magnetic Resonance Imaging (MRI) evaluation, which req…
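The first dataset of real spoken interactions in Arabic, built specifically to study the impact of ASR errors in LLM voice assistants, filling a gap in the field.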
arXiv:2605.16364v1 Announce Type: cross Abstract: Large Language Models (LLMs) voice assistants are commonly built as cascaded Automatic Speech recogn…
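Combines Lean with theoretical computer science to generate paired formal-informal theorem-proving challenges at scale, supporting research on AI mathematical reasoning.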
arXiv:2508.15878v2 Announce Type: replace-cross Abstract: Formal theorem proving (FTP) has emerged as a critical foundation for evaluating the reasoni…
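Proposes a systematic framework for orchestrating and evaluating synthetic data for tabular healthcare data, addressing the gap in balancing data privacy and utility.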
arXiv:2605.17758v1 Announce Type: new Abstract: Synthetic data is widely used in healthcare to create datasets that are similar to original data but w…
Proposes a convex dataset valuation method to address the cost-performance trade-off of dataset selection in LLM post-training.
arXiv:2605.16704v1 Announce Type: new Abstract: Improving LLM performance on downstream tasks sometimes requires leveraging auxiliary datasets during …
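The open-source project Stera turns an ordinary iPhone into a research-grade spatial data capture system and open-sources a 10M-frame dataset, providing high-quality training data for embodied AI world models.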
We are releasing Project Stera - an open source, end-to-end pipeline that turns a commodity iPhone into a research-grade capture system for embodied A…
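Uses a dataset, a model, and a benchmark together to improve the cross-view spatial intelligence of multimodal large language models, moving beyond single-view limitations.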
arXiv:2605.18621v1 Announce Type: new Abstract: Spatial intelligence requires multimodal large language models (MLLMs) to move beyond single-view perc…
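Evaluates large models on Bengali medical visual question answering with the first dedicated dataset and benchmark, filling a gap in medical AI for low-resource languages.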
arXiv:2605.18111v1 Announce Type: new Abstract: Recent advancements in Large Language Models (LLMs) and Large Vision Language Models (LVLMs) have enab…
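A traffic perception question-answering dataset with distance annotations designed for autonomous driving scenarios, evaluating the spatial reasoning ability of VLMs.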
arXiv:2511.13397v2 Announce Type: replace-cross Abstract: The remarkable progress of Vision-Language Models (VLMs) on a variety of tasks has raised in…