Beyond Explained Variance: A Cautionary Tale of PCA
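Explained variance is not a catch-all metric for PCA; this paper uses worked examples to warn of its potential pitfalls and is worth the attention of data analysts.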
arXiv:2605.13520v2 Announce Type: replace-cross Abstract: We address shortcomings of principal component analysis (PCA) for visualizing high-dimension…
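Proposes the PQR framework, which automatically generates diverse, realistic user queries to pinpoint the failure boundaries of QA agents and cover the blind spots of adversarial testing.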
arXiv:2605.16551v1 Announce Type: new Abstract: Evaluating LLM-based agents remains challenging because identifying meaningful failure cases often req…
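A stateful multi-turn dialogue evaluation method built on an incremental semantic knowledge graph, improving the coherence and depth of dialogue system evaluation.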
arXiv:2605.16650v1 Announce Type: new Abstract: Evaluating multi-turn dialogue systems remains challenging because response quality depends not only o…
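Explores how, drawing on the idea of the language acquisition device, pretraining on synthetic languages can improve the data efficiency of large models, offering a new direction for AI development.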
arXiv:2605.16758v1 Announce Type: new Abstract: Large Language Models (LLMs) remain substantially less data-efficient than humans. Pre-pretraining (PP…
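Lightweight large models show promise in legal AI; this paper systematically explores how models with fewer than 2B parameters perform on the court view generation task.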
arXiv:2605.16770v1 Announce Type: new Abstract: Criminal Court View Generation (CVG) is a critical task in Legal Artificial Intelligence (Legal AI), i…
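Proposes an expert-guided post-merging quantization method that uses merged-weight anchoring to balance model compression and performance in low-resource deployment.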
arXiv:2605.16882v1 Announce Type: new Abstract: Low-resource deployment constraints have made model quantization essential for deploying neural networ…
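Tackles the long-context inference bottleneck of large models: converts full attention to sparse attention efficiently within a hundred steps, balancing efficiency and accuracy.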
arXiv:2605.16928v1 Announce Type: new Abstract: Long-context inference in large language models is bottlenecked by the quadratic cost of full attentio…
Shows why a reasoning budget cannot tune the alignment between human and large-model cognitive cost: effort is a ceiling, not a knob.
arXiv:2605.16938v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) generate chain-of-thought traces whose length tracks human reaction time…
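Diffusion LLMs improve their own inference efficiency without an external teacher, using an "unroll and roll back" strategy that opens a new direction for model acceleration.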
arXiv:2605.16941v1 Announce Type: new Abstract: Diffusion Large Language Models (DLLMs) promise fast parallel generation, yet open-source DLLMs still …
Concerned about LLM hallucination? HalluScore fills the gap in Arabic benchmarks, specifically testing hallucination in LLM question answering.
arXiv:2605.17007v1 Announce Type: new Abstract: Large language models (LLMs) have achieved remarkable progress in natural language generation, but rem…
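Discusses the practical challenges of building multimodal large models for low-resource languages, along with possible solutions.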
arXiv:2605.17152v1 Announce Type: new Abstract: Multimodal LLMs are evolving from vision-language to tri-modality that see, hear, and read, yet pipeli…
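A new advance in biomedical question answering: structured evidence modeling combined with uncertainty-aware fusion improves answer accuracy and reliability.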
arXiv:2605.17435v1 Announce Type: new Abstract: Biomedical question answering often requires decisions from retrieved literature whose relevance, qual…
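Proposes a hypothesis-verification method for locating the causes of failures in LLM multi-agent systems, improving system reliability.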
arXiv:2605.17467v1 Announce Type: new Abstract: Large language model-driven multi-agent systems (LLM-MAS) excel at complex tasks, yet unreliable agent…
Drawing on 396 million Ukrainian court citations, shows how the predictive power of co-citation decays over a 20-year span.
arXiv:2605.17639v1 Announce Type: new Abstract: Co-citation structure is widely assumed to provide stable retrieval signal in legal information system…
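A systematic, million-scale evaluation of clinical note rewriting quality, revealing shortcomings in the multi-dimensional evaluation of LLM text generation.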
arXiv:2605.17775v1 Announce Type: new Abstract: Large language models (LLMs) can generate or synthesize clinical text for a wide range of applications…
Evaluates the prompt compression performance of LLMLingua-2 on the diffusion LLM LLaDA, exploring a new route to efficient inference.
arXiv:2605.17932v1 Announce Type: new Abstract: Prompt compression reduces inference cost and context length in large language models, but prior evalu…
Gets large models to generate explicitly vectorized code, substantially improving program performance; frontier research from arXiv.
arXiv:2605.17978v1 Announce Type: new Abstract: Vectorization via Single Instruction, Multiple Data (SIMD) architectures is a cornerstone of high-perf…
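A new framework for offline evaluation and iterative optimization of multi-agent LLM workflows, to appear at ACL 2026, aimed at tuning complex collaborative scenarios.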
arXiv:2605.18032v1 Announce Type: new Abstract: Multi-agent LLM workflows -- systems composed of multiple role-specific LLM calls -- often outperform …
Deploys personalized LLM agents on edge devices with P2P collaboration, overcoming the limits of local capability.
arXiv:2605.18067v1 Announce Type: new Abstract: Deploying large language model (LLM) on edge device enables personalized LLM agents for various users.…
A new explanation-based approach to prompt optimization that makes large-model prompts more transparent and interpretable.
arXiv:2605.18113v1 Announce Type: new Abstract: Prompt optimization has often been framed as a discrete search problem to find high-performing and rob…