InstructGPT

GPT, LLMs

An instruction-tuned language model.

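It combines GPT with reinforcement learning (see Training large language models with reinforcement learning): human feedback on model outputs is used to train a reward model, and that reward model then provides the training signal for the reinforcement learning step.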
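
As a rough illustration of how human input can become a reward model, here is a minimal sketch of a pairwise preference loss. It assumes the human input takes the form of comparisons (a labeler prefers one response over another) and that the reward model outputs one scalar score per response. The function name and the example scores are hypothetical and only for illustration.

```python
import math


def pairwise_reward_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Loss for one human comparison: -log(sigmoid(r_preferred - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    response higher than the rejected one, and large otherwise.
    """
    margin = reward_preferred - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores for two responses to the same prompt.
agrees_with_human = pairwise_reward_loss(2.0, 0.0)
disagrees_with_human = pairwise_reward_loss(0.0, 2.0)
print(agrees_with_human < disagrees_with_human)
```

Minimizing this loss over many comparisons pushes the reward model to rank responses the way the labelers did. The trained reward model can then score new outputs during the reinforcement learning step.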