Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information
- https://arxiv.org/abs/2502.14258
- Yein Park, Chanwoong Yoon, Jungwoo Park, Minbyul Jeong, Jaewoo Kang
Using circuit analysis, we discover temporal heads: specific attention heads primarily responsible for processing temporal knowledge.
TKC: temporal knowledge circuit.
Feed the model prompts that contain temporal information, like “In 2003, the president of South Korea was …”, and then perform ablation to identify the temporal heads.
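The ablation idea above can be illustrated with a toy sketch. This is not the paper's procedure: it assumes each head adds a vector to the final residual stream and that the answer logit is a dot product with a single direction. All names (`answer_logit`, `ablation_scores`, `head_outputs`, `answer_dir`) and all numbers are hypothetical.

```python
def answer_logit(head_outputs, answer_dir, ablated=None):
    """Logit of the correct answer, optionally with one head zero-ablated."""
    total = [0.0] * len(answer_dir)
    for h, vec in enumerate(head_outputs):
        if h == ablated:
            continue  # zero-ablation: drop this head's contribution
        total = [t + v for t, v in zip(total, vec)]
    return sum(t * a for t, a in zip(total, answer_dir))


def ablation_scores(head_outputs, answer_dir):
    """Drop in the answer logit caused by removing each head in turn."""
    full = answer_logit(head_outputs, answer_dir)
    return [
        full - answer_logit(head_outputs, answer_dir, ablated=h)
        for h in range(len(head_outputs))
    ]


# Toy data (made up): head 2 carries most of the answer direction.
answer_dir = [1.0, 0.0, 0.0]
head_outputs = [
    [0.1, 0.3, -0.2],
    [-0.05, 0.2, 0.4],
    [3.0, 0.1, 0.0],
    [0.2, -0.1, 0.1],
]

scores = ablation_scores(head_outputs, answer_dir)
top_head = max(range(len(scores)), key=scores.__getitem__)
```

In this toy setup the head whose removal hurts the answer logit most is the candidate "temporal head". A real analysis would run the actual model on many time-conditioned prompts and would likely use a more careful ablation than zeroing.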
Circuit analysis treats a transformer’s computation as a directed acyclic graph (DAG). The nodes are the attention heads, the MLP modules, an input node, and an output node, and the edges connect these nodes. A circuit is a subgraph that explains a specific behavior.
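A minimal sketch of this graph view, using only the Python standard library. The node names (`a0.h1`, `mlp0`, and so on) are hypothetical, and the check that a circuit must connect `input` to `output` is an assumption added for illustration, not a definition taken from the paper.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical computation graph: edges point from upstream to downstream nodes.
full_edges = {
    ("input", "a0.h1"), ("input", "mlp0"), ("a0.h1", "mlp0"),
    ("a0.h1", "a1.h3"), ("mlp0", "a1.h3"), ("a1.h3", "output"),
    ("mlp0", "output"), ("input", "output"),
}


def is_dag(edges):
    """True if the edge set contains no directed cycle."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(dst, set()).add(src)
        preds.setdefault(src, set())
    try:
        tuple(TopologicalSorter(preds).static_order())
        return True
    except CycleError:
        return False


def is_circuit(circuit_edges, edges):
    """True if circuit_edges is a subgraph that links 'input' to 'output'."""
    if not circuit_edges <= edges:
        return False
    seen, frontier = {"input"}, ["input"]
    while frontier:
        node = frontier.pop()
        for src, dst in circuit_edges:
            if src == node and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return "output" in seen


# A candidate circuit: one path through two attention heads.
circuit = {("input", "a0.h1"), ("a0.h1", "a1.h3"), ("a1.h3", "output")}
```

Here `is_dag(full_edges)` and `is_circuit(circuit, full_edges)` both hold, while an edge set that uses an edge missing from the full graph is rejected.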