On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines

BERT