News & publications
Transformers Don’t Need LayerNorm at Inference Time: Scaling LayerNorm Removal to GPT-2 XL and Implications for Mechanistic Interpretability
September 30, 2025