Circuit Tracing: Revealing Computational Graphs in Language Models

Interpretability, Model interpretability

See also Lindsey2025on. Earlier work was published in Distill as Thread: Circuits, which includes the following articles:

Replacement model and Attribution graph are core concepts.