FinalFormal2 comments on The Relationship between RLHF and AI Psychology: Debunking the Shoggoth Argument

FinalFormal2 21 Apr 2023 23:35 UTC
3 points
0
By psychology I mean its internal thought process.
I think some people have a model of AI where RLHF is a false cloak or a mask, and I’m pushing back against that idea. I’m saying that RLHF represents a real change in the underlying model, one that actually constrains the types of minds that could be in the box. It doesn’t select the psychology, but it does constrain it. If it constrains it to an AI that consistently produces the right behaviors, that AI will most likely be one that continues to produce the right behaviors, so we don’t actually have to care about the contents of the box unless we want to make sure it’s not conscious.
Sorry, faulty writing.