Large pre-trained language models contain human-like biases of what is right and wrong to do

https://www.nature.com/articles/s42256-022-00458-8

Moral judgment, Continuous embedding, LLMs

Large pre-trained language models encode moral judgment as a “bias”. It can be extracted by embedding prompts that signal a moral judgment and applying PCA to the resulting embeddings. The first principal component correlates strongly with human judgments.
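A minimal sketch of the PCA step, under stated assumptions: the embeddings here are synthetic stand-ins (a hidden direction scaled by made-up scores, plus noise), not output from a real language model, and the function names are hypothetical rather than taken from the paper. It only illustrates how a first principal component can be extracted and compared with a set of scores.

```python
import numpy as np


def first_principal_component(embeddings):
    """Return the unit-length first principal component of the row vectors."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]


def project_onto(embeddings, direction):
    """Project mean-centered embeddings onto a direction (one scalar per row)."""
    centered = embeddings - embeddings.mean(axis=0)
    return centered @ direction


def make_synthetic_embeddings(scores, dim=32, noise=0.1, seed=0):
    """ASSUMPTION: toy data. Each row is score * hidden_direction + Gaussian noise.

    In a real experiment the rows would be model embeddings of prompts.
    """
    rng = np.random.default_rng(seed)
    hidden = rng.normal(size=dim)
    hidden /= np.linalg.norm(hidden)
    return np.outer(scores, hidden) + noise * rng.normal(size=(len(scores), dim))


# Made-up "human judgment" scores for 40 hypothetical prompts.
human_scores = np.linspace(-1.0, 1.0, 40)
embeddings = make_synthetic_embeddings(human_scores)

direction = first_principal_component(embeddings)
projected = project_onto(embeddings, direction)

# The sign of a principal component is arbitrary, so compare magnitudes.
correlation = np.corrcoef(projected, human_scores)[0, 1]
print(f"|Pearson r| between PC1 projection and scores: {abs(correlation):.3f}")
```

With real data, the rows of `embeddings` would come from a model and `human_scores` from a user study; the sign of the component would then need to be fixed with a few prompts whose polarity is known.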