Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

See also Snell2024scaling