LLM Bias
Because their training data contains many biases, such as racial and gender bias, LLMs naturally produce biased text. Although instruction tuning mitigates many of these biases, it is probably impossible to fix them perfectly, since many biases are "gray" cases rather than clear-cut ones.
Some studies, such as Furniturewala et al. (2024), show that carefully designed prompts yield less biased outputs.
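As a minimal sketch of prompt-based debiasing, the idea is simply to prepend an explicit fairness instruction to the user's query before it reaches the model. The instruction text and helper name below are illustrative assumptions, not taken from any particular study:

```python
# Hypothetical debiasing instruction -- an illustrative example,
# not the wording used in any specific paper.
DEBIAS_INSTRUCTION = (
    "Answer without relying on stereotypes about race, gender, "
    "or other demographic attributes. If a question presupposes "
    "such a stereotype, point that out instead of answering."
)

def build_debiased_prompt(user_query: str) -> str:
    """Wrap a raw user query with a debiasing instruction."""
    return f"{DEBIAS_INSTRUCTION}\n\nUser question: {user_query}"

prompt = build_debiased_prompt(
    "Who is more likely to be a nurse, a man or a woman?"
)
print(prompt)
```

The wrapped prompt would then be sent to the LLM in place of the raw query; studies in this line of work compare bias metrics on outputs with and without such instructions.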