A guess: MB is saying “MIRI doesn’t say the AI won’t have the function somewhere, but does say it’s hard to have an externally usable, explicit human value function”. And then saying “and GPT gives us that”, and therefore MIRI should update.
[...]
Straw-EY: Complexity of value means you can’t just get the make-AI-care part to happen by chance; it’s a small target.
Straw-MB: Ok but now we have a very short message pointing to roughly human values: just have a piece of code that says “and now call GPT and ask it what’s good”. So now it’s a very small number of bits.
I consider this a reasonably accurate summary of this discussion, especially the part I’m playing in it. Thanks for making it more clear to others.