Prompting GPT-3 To Be Reliable

LLMs, GPT-3, Prompt tuning

The paper assesses the reliability of a language model along four dimensions: generalizability, fairness, calibration, and factuality.
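Of the four factors, calibration is the easiest to make concrete in a few lines. A model is well calibrated when its stated confidence matches how often it is actually right. The sketch below computes expected calibration error (ECE), one common way to quantify that gap. This note does not say which metric the paper uses, so treat ECE as an assumption here. The function name, the bin count, and the sample data are hypothetical and chosen only for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE: the weighted average, over equal-width confidence
    bins, of |mean confidence - accuracy| within each bin.

    confidences: floats in [0, 1], the model's confidence per prediction.
    correct:     booleans, whether each prediction was right.
    n_bins:      number of equal-width bins (10 is an assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions by confidence; a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's gap by the share of predictions it holds.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical data: the model claims 90% confidence but is right 3 times in 4.
sample_confidences = [0.9, 0.9, 0.9, 0.9]
sample_correct = [True, True, True, False]
sample_ece = expected_calibration_error(sample_confidences, sample_correct)
```

A lower value means stated confidence tracks accuracy more closely. In the sample, all four predictions fall into one bin, so the result is simply the gap between 0.9 confidence and 0.75 accuracy.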