Dynamics of the Transformer Residual Stream: Coupling Spectral Geometry to Network Topology
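Treating Transformer depth as discrete time, this work reveals how spectral geometry couples to network topology in the residual stream, offering a new perspective on how computation propagates through large models.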
arXiv:2605.14258v1 Announce Type: cross Abstract: Large language models are remarkably capable, yet how computation propagates through their layers re…