InstructGPT
An instruction-tuned model.
It combines GPT with reinforcement learning (Training large language models with reinforcement learning): human feedback is used to train a reward model, and that reward model then guides the reinforcement learning.
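As a rough illustration of how human input can become a reward model, here is a minimal sketch assuming the feedback arrives as pairwise comparisons (a labeler prefers one response over another) and the reward model is trained with a pairwise ranking loss. The function names and the example scores are hypothetical, and the scalar rewards stand in for the outputs of a real model.

```python
import math


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """Loss for one human comparison: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    response above the rejected one, and large when it does the opposite.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(diff)) == log(1 + exp(-diff)); branch on sign to avoid overflow.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def mean_preference_loss(pairs):
    """Average the loss over (reward_chosen, reward_rejected) pairs."""
    return sum(pairwise_preference_loss(c, r) for c, r in pairs) / len(pairs)


# Hypothetical reward scores for three labeled comparisons.
example_pairs = [(1.2, 0.3), (0.1, 0.4), (2.0, -1.0)]
batch_loss = mean_preference_loss(example_pairs)
```

Minimizing this loss pushes the reward model to agree with the human rankings; the trained reward model can then score new responses during the reinforcement learning stage.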