Circuit Tracing: Revealing Computational Graphs in Language Models
- https://transformer-circuits.pub/2025/attribution-graphs/methods.html
- Emmanuel Ameisen, Jack Lindsey, … Chris Olah, Joshua Batson
Interpretability, Model interpretability
See also Lindsey2025on. Earlier work was published in Distill as Thread: Circuits, which includes the following articles:
Replacement model and Attribution graph are core concepts.