I’m actually not clear on what exactly falls under the ‘value loading problem’. These seem like somewhat separate issues:
Figuring out what we want in any sense (e.g. utilitarianism with lots of details nailed down)
Translating that informal sense of what we want into something we can write down formally
Causing the values to be the motivations of an AI
Is the ‘value loading problem’ some subset of these?
That’s the “value problem”
That’s the “formalization problem”
This is the “value loading problem”
Well, then, it seems like almost all the difficulty is in the value and formalization problems. Once we’ve really formalized it, it’s 99% of the way to machine code from where it started as human intuition.
Doesn’t that mean the value loading strategy is an alternative to the (direct) formalization strategy?
Via, say, doubly-indirect meta-ethics? Well, we need to decide that that’s really the decision algorithm that’s going to result in the right answer: both that it’s ethically correct and that it predictably converges on that ethically correct result.
Explicitly figuring out what our values are and formalizing them is only one possible sequence of steps to get AI with our values.
It seems like most people don’t think that this approach will work, so there are a number of proposals to use AI itself to assist in this process. For example, “motivational scaffolding” sounds like it solves the second step (formalizing the values).