“I tend to think that learning and following the norms of a particular culture (further discussion) isn’t too hard a problem for an AGI which is motivated to do so”. If the AGI is motivated to do so, then the value learning problem is already solved and nothing else matters (in particular, my post becomes irrelevant), because it can indeed learn the further details in whichever way it wants. In that case we have somehow already managed to create an agent with an internal objective that points at Bedouin culture (human values), which is the whole problem.
I could say more about the rest of your comment, but first I’m just checking: does the above change your model of my model significantly?
Also, regarding “I think I’m much more open-minded than you to …”: to be clear, I’m not at all convinced about this; I’m open to this distinction not mattering at all. I hope I didn’t come across as not being open-minded about it.
There’s sorta a use/mention distinction between:
- An AGI with the motivation “I want to follow London cultural norms (whatever those are)”, versus
- An AGI with the motivation “I want to follow the following 500 rules (avoid public nudity, speak English, don’t lick strangers, …), which by the way comprise London cultural norms as I understand them”.
Normally I think of “value learning” (or in this case, “norm learning”) as related to the second bullet point, i.e., the AI watches one or more people and learns their actual preferences and desires. I also had the impression that your OP was along the lines of the second (not first) bullet point.
If that’s right, and if we figure out how to make an agent with the first-bullet-point motivation, then I wouldn’t say that “the value learning problem is already solved”; instead, I would say that we have made great progress towards safe & beneficial AGI in a way that does not involve “solving value learning”. Rather, the agent will hopefully go ahead and solve value learning all by itself.
(I’m not confident that my definitions here are standard or correct, and I’m certainly oversimplifying in various ways.)