Self-Instruct: Aligning Language Models with Self-Generated Instructions

Instruction tuning, Training large language models with reinforcement learning