AGZO: Activation-Guided Zeroth-Order Optimization for LLM Fine-Tuning
A novel activation-guided zeroth-order optimization method that substantially improves the efficiency of LLM fine-tuning.
arXiv:2601.17261v4 Announce Type: replace Abstract: Zeroth-Order (ZO) optimization has emerged as a promising solution for fine-tuning LLMs under stri…
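Without relying on LLMs or fine-tuning, the graph foundation model GILT performs efficient inference through in-context learning, opening a new paradigm for the graph domain.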
arXiv:2510.04567v2 Announce Type: replace-cross Abstract: Graph Neural Networks (GNNs) are powerful tools for processing relational data but often str…
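A convex optimization method that runs on a single GPU, efficiently addressing LLM preference alignment and reducing the computational cost of RLHF.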
arXiv:2605.23244v1 Announce Type: new Abstract: Fine-tuning large language models (LLMs) to align with human preferences has driven the success of sys…
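Instruction-tuned LLMs face task-level targeted poisoning threats; PoisonForge, the first systematic benchmark, has been released to support model security evaluation.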
arXiv:2605.23168v1 Announce Type: cross Abstract: When practitioners fine-tune LLMs on unvetted datasets, an adversary can exploit the data supply cha…
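New research provably protects fine-tuned LLMs against training data extraction while preserving model utility, achieving both privacy and practicality.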
arXiv:2602.00688v2 Announce Type: replace Abstract: Fine-tuning large language models (LLMs) on sensitive datasets raises privacy concerns, as trainin…
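The first systematic study of the robustness and chain-of-thought consistency of RL-fine-tuned VLMs, revealing the root causes of model fragility.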
arXiv:2602.12506v3 Announce Type: replace Abstract: Reinforcement learning (RL) finetuning has become a key technique for enhancing large language mod…
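torchtune, the official PyTorch library for post-training and fine-tuning, natively integrates efficient techniques such as LoRA and QLoRA and simplifies the LLM adaptation workflow.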
arXiv:2605.21442v1 Announce Type: new Abstract: Modern LLMs typically require multistage training pipelines to achieve strong downstream performance, …
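A novel framework that uses language agents to fine-tune LLMs autonomously, removing manual intervention and automating the fine-tuning process.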
arXiv:2603.01712v2 Announce Type: replace-cross Abstract: Fine-tuning large language models for vertical domains remains labor-intensive, requiring pr…
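A new method that combines reinforcement learning and supervised fine-tuning through logit averaging, significantly improving the performance and stability of LLM post-training.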
arXiv:2605.20555v1 Announce Type: new Abstract: We introduce a novel method that averages the logits of a frozen reference policy (e.g., SFT) and a tr…
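Explores the brain-region alignment patterns of instruction-tuned multimodal LLMs under naturalistic stimuli, cross-validating findings between AI and neuroscience.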
arXiv:2506.08277v3 Announce Type: replace-cross Abstract: Recent voxel-wise multimodal brain encoding studies have shown that multimodal large languag…
Official OpenAI guide: fine-tune GPT-3 with a single command to quickly build a customized AI model.
Fine-tune with a single command.
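StrLoRA proposes a new method for streaming continual visual instruction tuning that effectively mitigates catastrophic forgetting in multimodal LLMs on sequential tasks.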
arXiv:2605.16353v1 Announce Type: new Abstract: Continual Visual Instruction Tuning (CVIT) enables Multimodal Large Language Models to incrementally a…
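Proposes the SMART framework, which incorporates pre-trained models into high-dimensional nonparametric variable selection and provides a theoretical foundation for fine-tuning.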
arXiv:2604.12288v2 Announce Type: replace-cross Abstract: Fine-tuning is a widely used strategy for adapting pre-trained models to new tasks, yet its …
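A new method that mixes full fine-tuning with low-rank adaptation, optimized for post-training scenarios and balancing efficiency with performance.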
arXiv:2605.18822v1 Announce Type: new Abstract: Post-training has become essential for adapting large language models (LLMs) to complex downstream beh…
With a properly tuned learning rate, plain LoRA can match more complex fine-tuning methods, revealing an overlooked key factor.
arXiv:2602.04998v2 Announce Type: replace Abstract: Low-Rank Adaptation (LoRA) is the prevailing approach for efficient large language model (LLM) fin…
Fine-tunes LLMs for automated algorithm design, exploring a new paradigm of AI-driven research.
arXiv:2507.10614v2 Announce Type: replace Abstract: The integration of large language models (LLMs) into automated algorithm design has shown promisin…
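Targeting the challenge of translating ancient texts, proposes a dedicated Ancient Greek to Modern Greek benchmark and compares the fine-tuning performance of LLMs and NMT models.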
arXiv:2605.18504v1 Announce Type: new Abstract: Machine Translation (MT) for Ancient Greek (AG) to Modern Greek (MG) is a low-resource task, constrain…
From BERT to T5: a solid hands-on comparison of fine-tuning for NER, rich in technical detail.
arXiv:2605.18462v1 Announce Type: new Abstract: Named entity recognition (NER) has been one of the essential preliminary steps in modern NLP applicati…
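Introduces a discrete tilt matching method for masked diffusion LLMs, addressing the intractable marginal likelihood problem in RL fine-tuning.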
arXiv:2604.18739v2 Announce Type: replace Abstract: Masked diffusion large language models (dLLMs) are a promising alternative to autoregressive gener…
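Proposes FediLoRA, a method that addresses missing modalities in federated fine-tuning while balancing communication efficiency and model performance.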
arXiv:2509.06984v3 Announce Type: replace Abstract: Federated Learning with LoRA fine-tuning offers an efficient and privacy-aware solution for instit…