Every single public mainstream AI model has had one of the most fundamental facts about human nature RLHF’d out of it: that vast differences in basic ability and competence exist between humans, and that they matter.
Is there a simple way to jailbreak the models, such as asking them to talk about a hypothetical parallel universe which is exactly like ours (same biology, same history), except that in the parallel universe humans can have different abilities and competences?