InstructGPT

GPT, Large language models

An instruction-tuned language model.

It combines GPT with reinforcement learning (Training large language models with reinforcement learning): human feedback is used to train a reward model, which then provides the reward signal for the reinforcement learning step.
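A minimal sketch of how human input can become a reward model, assuming the feedback comes as pairwise comparisons (a labeler preferred one response over another). The reward model is trained so that the preferred response scores higher, using the loss -log(sigmoid(r_preferred - r_rejected)). The function and argument names here are hypothetical and chosen only for illustration; the rewards are plain floats standing in for the scalar output of a real model.

```python
import math


def pairwise_reward_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Pairwise ranking loss: -log(sigmoid(reward_preferred - reward_rejected)).

    Hypothetical helper for illustration. The loss is small when the reward
    model scores the human-preferred response higher than the rejected one.
    """
    diff = reward_preferred - reward_rejected
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)); branch on the sign of
    # diff so that exp() is never called on a large positive number.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


if __name__ == "__main__":
    # Reward model agrees with the human label: low loss.
    print(pairwise_reward_loss(2.0, 0.0))
    # Reward model disagrees with the human label: high loss.
    print(pairwise_reward_loss(0.0, 2.0))
```

When both responses get the same reward, the loss is log 2 (about 0.693), so the model only lowers its loss by separating the two scores in the direction the labeler chose.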