The Bayesian Geometry of Transformer Attention
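Reinterprets the Transformer attention mechanism from the perspective of Bayesian geometry, revealing its underlying probabilistic structure.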
arXiv:2512.22471v5 (Announce Type: replace)
Abstract: Transformers often appear to perform Bayesian reasoning in context, but verifying this rigorously …