Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking https://arxiv.org/abs/2403.09629 Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman See also Snell2024scaling