Self-Instruct: Aligning Language Models with Self-Generated Instructions
- https://arxiv.org/abs/2212.10560
- Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
Instruction tuning, Training large language models with reinforcement learning