Studying Large Language Model Generalization with Influence Functions

LLMs, influence functions

Influence functions aim to answer a counterfactual: how would the model’s parameters (and hence its outputs) change if a given sequence were added to the training set?
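
To make the counterfactual concrete, here is a minimal sketch on a toy ridge regression, where everything can be computed exactly. It assumes the standard first-order formulation: the new example is added to the objective with a small weight eps, and the resulting parameter change is approximated by eps * (-H^{-1} g), where H is the Hessian of the training objective and g is the gradient of the new example's loss at the current optimum. The helper names (`fit`, `influence_on_params`), the synthetic data, and the damping value `lam` are all hypothetical choices for illustration; this is not the approximation scheme a large language model would require, where H cannot be formed or inverted directly.

```python
import numpy as np

def fit(X, y, lam, x_new=None, y_new=None, eps=0.0):
    """Minimize (1/n) * sum 0.5*(x_i.theta - y_i)^2 + (lam/2)*||theta||^2,
    optionally plus eps * 0.5*(x_new.theta - y_new)^2 (the up-weighted example)."""
    n, d = X.shape
    H = X.T @ X / n + lam * np.eye(d)   # Hessian of the training objective
    b = X.T @ y / n
    if x_new is not None:
        H = H + eps * np.outer(x_new, x_new)
        b = b + eps * y_new * x_new
    return np.linalg.solve(H, b)        # closed-form optimum

def influence_on_params(X, y, lam, x_new, y_new):
    """First-order influence of the new example on the optimal parameters:
    d(theta*)/d(eps) at eps = 0, which equals -H^{-1} g."""
    n, d = X.shape
    theta = fit(X, y, lam)
    H = X.T @ X / n + lam * np.eye(d)
    g = (x_new @ theta - y_new) * x_new  # gradient of the new example's loss
    return -np.linalg.solve(H, g)

# Synthetic data (assumed purely for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
x_new, y_new = np.array([1.0, 1.0, 1.0]), 3.0
lam, eps = 1e-2, 1e-3

# Influence-function prediction versus actually re-solving with the example added.
predicted = eps * influence_on_params(X, y, lam, x_new, y_new)
actual = fit(X, y, lam, x_new, y_new, eps) - fit(X, y, lam)
print("predicted change:", predicted)
print("actual change:   ", actual)
```

For small eps the two vectors agree up to terms of order eps squared. The change in any output of interest then follows by the chain rule: take the gradient of that output with respect to the parameters and dot it with the predicted parameter change.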