Niu Ge's Picks · Past six months

1

🤖 AI/Large Models · arXiv AI · 2026-07-14

Inside the Unfair Judge: A Mechanistic Interpretability Account of LLM-as-Judge Bias

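Uses mechanistic interpretability to dissect the internal biases of LLMs acting as judges, tracing unfair scoring to its mechanistic roots.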

arXiv:2607.11871v1 Announce Type: cross Abstract: Existing studies of LLM-as-judge scoring bias work predominantly at the input-output level: they per…

LLM judging · mechanistic interpretability · bias analysis · deep learning interpretability · model fairness

2

📝 Deep Tech · arXiv NLP · 2026-07-10

Fair Document Valuation in LLM Summaries via Shapley Values

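Uses Shapley values to fairly quantify each document's contribution to an LLM-generated summary, addressing unfair value attribution; theoretically rigorous.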

arXiv:2505.23842v5 Announce Type: replace Abstract: Large Language Models (LLMs) increasingly power search engines and AI assistants that retrieve and…

LLM · Shapley values · document valuation · summary fairness · value attribution

3

🤖 AI/Large Models · arXiv NLP · 2026-07-09

Zoom In Disparities in Healthcare LLM Q&A

Studies unfairness in healthcare LLM question answering, revealing performance gaps across population groups.

arXiv:2510.17476v2 Announce Type: replace Abstract: Equitable access to reliable health information is vital when integrating AI into healthcare. Yet,…

healthcare AI · LLM · fairness · question answering systems · NLP

4

📝 Deep Tech · arXiv NLP · 2026-07-07

Fair-GPTQ: Bias-Aware Quantization for Large Language Models

A new method, Fair-GPTQ, balances efficiency and fairness when quantizing large models, reducing bias.

arXiv:2509.15206v3 Announce Type: replace Abstract: The high memory demands of generative language models have drawn attention to quantization, which …

Fair-GPTQ · large language models · quantization · bias-aware · fairness

5

🤖 AI/Large Models · arXiv AI · 2026-07-02

Evaluating Implicit Biases in LLM Reasoning through Logic Grid Puzzles

Uses logic grid puzzles to probe implicit biases in LLM reasoning: an experiment that pits logic against preference.

arXiv:2511.06160v2 Announce Type: replace Abstract: While recent safety guardrails effectively suppress overtly biased outputs, subtler forms of socia…

LLM reasoning · implicit bias · logic puzzles · evaluation methods · fairness

6

🤖 AI/Large Models · arXiv NLP · 2026-07-01

FairJudge: An Adaptive, Debiased, and Consistent LLM-as-a-Judge

A new LLM-as-a-judge framework that is adaptive, debiased, and consistent, addressing model preference and unfair scoring.

arXiv:2602.06625v2 Announce Type: replace Abstract: Existing LLM-as-a-Judge systems suffer from three fundamental limitations: limited adaptivity to t…

FairJudge · llm-as-a-j · debiasing · adaptive judging · consistency

7

📝 Deep Tech · arXiv NLP · 2026-07-01

When Calibration Rankings Reverse: Accuracy-Controlled Evaluation for Fair Comparison of LLMs

Examines rank reversals in LLM calibration and proposes an accuracy-controlled evaluation framework for fairer comparison of LLMs.

arXiv:2606.30814v1 Announce Type: new Abstract: Calibration evaluates whether a model confidence aligns with its empirical accuracy. Existing studies …

large language models · calibration evaluation · fairness · accuracy control · rank reversal

8

🤖 AI/Large Models · arXiv Machine Learning · 2026-06-30

Reproducing FACTER: Fairness via Conformal Thresholding and Prompt Repair

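Reproduces the FACTER fairness method, which uses conformal thresholding and prompt repair to reduce AI bias; a fresh perspective worth watching.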

arXiv:2606.28620v1 Announce Type: cross Abstract: Fayyazi et al. (2025) recently proposed FACTER, a model-agnostic framework designed to jointly enfor…

fair AI · conformal prediction · prompt repair · model bias · reproduction study

9

🤖 AI/Large Models · arXiv NLP · 2026-06-30

Preserving Fairness and Safety in Quantized LLMs Through Critical Weight Protection

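Quantizing large models can degrade fairness and safety; this paper addresses the problem by protecting critical weights, so that lightweight models keep their safeguards too.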

arXiv:2601.12033v2 Announce Type: replace Abstract: Quantization is widely adopted to reduce the computational cost of large language models (LLMs); h…

quantization · large language models · fairness · safety · critical weights

10

📝 Deep Tech · arXiv NLP · 2026-06-30

Can LLMs Hire Fairly? Racial Bias in Resume Screening

Do LLMs show racial bias in resume screening? This study exposes fairness risks in AI hiring.

arXiv:2606.28978v1 Announce Type: new Abstract: We audit fourteen mainstream large language models (LLMs) for hiring discrimination using the paired-r…

LLM · racial bias · resume screening · AI fairness · algorithmic discrimination

11

🤖 AI/Large Models · arXiv AI · 2026-06-25

Judging the Judges: A Systematic Evaluation of Bias Mitigation Strategies in LLM-as-a-Judge Pipelines

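Systematically evaluates bias mitigation strategies for LLM judges, showing how effective each approach is and offering guidance for building fair AI evaluation pipelines.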

arXiv:2604.23178v2 Announce Type: replace Abstract: LLM-as-a-Judge has become the dominant paradigm for evaluating language model outputs, yet LLM jud…

llm-as-a-j · bias mitigation · systematic evaluation · fairness · AI evaluation

12

📝 Deep Tech · arXiv AI · 2026-06-23

The Language-Energy Divide: Measuring Energy Costs of Multilingual LLM Inference

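The first systematic measurement of how LLM inference energy cost varies by language, showing what the language divide means for AI sustainability.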

arXiv:2606.21869v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed in multilingual settings, yet the energy cost…

energy consumption · multilingual large models · inference efficiency · energy divide · LLM

13

📝 Deep Tech · arXiv AI · 2026-06-23

FairTutor: Equity-Aware Pedagogical LLM Routing for Budget-Constrained AI Tutoring

Fairness on a limited budget? This paper proposes FairTutor, an equity-aware LLM routing strategy for AI tutoring.

arXiv:2606.20713v1 Announce Type: new Abstract: Generative AI tutors provide real-time, personalized learning support, but also create a new education…

AI education · large language models · fairness · budget constraints · routing

14

🔓 Open Source Projects · Hacker News AI · 2026-06-22

Open source AI projects from Banco Santander

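Banco Santander's open-source AI projects: a synthetic fraud graph generator and a discrimination analysis tool, supporting financial risk control and AI fairness research.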

Article URL: https://github.com/SantanderAI Comments URL: https://news.ycombinator.com/item?id=48628282 Points: 3 # Comments: 0

fraud detection · fair AI · graph neural networks · counterfactual analysis · open source projects

15

📝 Deep Tech · arXiv AI · 2026-06-19

DeFrame: Debiasing Large Language Models Against Framing Effects

Helps large models resist framing effects, moving from hidden bias toward genuine fairness.

arXiv:2602.04306v2 Announce Type: replace-cross Abstract: As large language models (LLMs) are increasingly deployed in real-world applications, ensuri…

large language models · debiasing · framing effects · fairness · hidden bias

16

🤖 AI/Large Models · arXiv AI · 2026-06-19

Exposing the Unsaid: Visualizing Hidden LLM Bias through Stochastic Path Aggregation

Uses stochastic path aggregation to visualize hidden bias in LLM generation, exposing the systematic skew behind the text.

arXiv:2606.19344v1 Announce Type: cross Abstract: Large Language Models (LLMs) exhibit representational and syntactic biases that are difficult to eva…

LLM bias · visualization · stochastic path aggregation · model auditing · text generation

17

🤖 AI/Large Models · arXiv NLP · 2026-06-19

The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs

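Quantifies intersectional bias in speech LLMs, revealing how voice characteristics interact with race, gender, and other factors, and offers a key evaluation method for AI fairness.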

arXiv:2603.16941v2 Announce Type: replace-cross Abstract: Speech Large Language Models (SpeechLLMs) process spoken input directly, retaining cues such…

speech large models · intersectional bias · AI fairness · quantification methods · interspeec

18

🤖 AI/Large Models · arXiv NLP · 2026-06-19

StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs

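Finds that social biases in multimodal LLMs stem mainly from a small number of human visual cues rather than from textual information, challenging conventional assumptions.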

arXiv:2606.20527v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) are increasingly deployed in personally and societally conseq…

MLLMs · social bias · visual cues · multimodal · bias detection

19

📝 Deep Tech · arXiv AI · 2026-06-16

AgentFairBench: Do LLM Agents Discriminate When They Act?

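Do LLM agents hide discrimination in their actions? The new AgentFairBench benchmark evaluates demographic bias across multiple domains cheaply and reproducibly.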

arXiv:2606.16723v1 Announce Type: new Abstract: Large language model (LLM) agents increasingly take actions (screening applicants, recommending credit…

LLM agent · fairness · benchmarking · demographic bias

20

📝 Deep Tech · arXiv NLP · 2026-06-15

Is ChatGPT Fair for Recommendation? Evaluating Fairness in Large Language Model Recommendation

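Accepted at RecSys 2023; systematically evaluates ChatGPT's fairness in recommendation tasks, revealing potential biases along dimensions such as gender and age.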

arXiv:2305.07609v4 Announce Type: replace-cross Abstract: The remarkable achievements of Large Language Models (LLMs) have led to the emergence of a n…

ChatGPT · recommender systems · fairness · large language models · bias evaluation
