> I have to say that I am not getting substantial discussion about what I actually argued in the paper.
The first reason seems to be clarity: I didn’t understand what your primary point was until recently, even after carefully reading the paper. (Going back to the section on the DLI: context, goals, and values aren’t mentioned until the sixth paragraph, and even then only implicitly!)
The second reason seems to be that, as far as the disagreement goes, there’s not much to discuss. Consider this portion of the parent comment:
> You go on to suggest that whether the AI planning mechanism would take the chef’s motives into account, and whether it would be nontrivial to do so …. all of that is irrelevant in the light of the fact that this is a superintelligence, and taking context into account is the bread and butter of a superintelligence. It can easily do that stuff
I think my division between cleverness and wisdom at the end of this long comment clarifies this issue. Taking context into account is not necessarily the bread and butter of a clever system; many fiendishly clever systems just manipulate mathematical objects without paying any attention to context, and those satisfy human goals only because the correct mathematical objects have been carefully selected for them to manipulate. But I agree with you that taking context into account is the bread and butter of a wise system. There’s no way for a wise system to manipulate conceptual objects without paying attention to context, because context is a huge part of concepts.
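To make the “cleverness” half of that distinction concrete, here is a minimal, purely illustrative Python sketch (my own toy example, not anything from the paper): an optimizer that pays no attention to context at all, and whose output satisfies human goals only insofar as the hand-picked objective it is given happens to be a good proxy for them. All of the names and scores here are hypothetical.

```python
# A "clever" optimizer in the sense above: it searches hard over candidate
# plans, but it sees only a numeric objective. It has no access to the
# context (the chef's motives, the diners' intent) that made the objective
# seem like a reasonable proxy in the first place.

def clever_search(candidate_plans, objective):
    """Return whichever plan maximizes the objective, and nothing more."""
    return max(candidate_plans, key=objective)

# The objective is a hand-picked mathematical stand-in for a human goal
# ("serve the diners hot food"). Whether the result is desirable depends
# entirely on how well that proxy was chosen.
plans = ["warm the soup", "set the kitchen on fire", "serve the soup cold"]

def heat_delivered(plan):
    # Crude proxy: score each plan by how much heat it produces.
    return {"warm the soup": 1.0,
            "set the kitchen on fire": 100.0,
            "serve the soup cold": 0.0}[plan]

print(clever_search(plans, heat_delivered))  # -> "set the kitchen on fire"
```

The search procedure did exactly what the mathematics asked of it; the failure is that all the wisdom lived in the human choice of objective, which was a poor proxy. Nothing inside the optimizer knows or cares about the chef, which is the sense in which cleverness alone does not take context into account.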
It seems like everyone involved agrees that a human-aligned superwisdom is safe, even if it’s also superclever: as Ged muses about Ogion in *A Wizard of Earthsea*, “What good is power when you’re too wise to use it?”
Which brings us to:
> That conflict is important, and yet no one wants to go there and talk about it.
I would restate the conflict this way: an AI that misinterprets what its creators meant for it to do is not superwise. Once we’ve defined wisdom appropriately, I think everyone involved would agree with that, and would agree that the very notion of a superwise AI that misinterprets its creators’ intent is incoherent.
But… I don’t see why that’s a conflict, or why it’s important. The point of MIRI is to figure out how to develop human-aligned superwisdom before someone develops supercleverness without superwisdom, or superwisdom without human alignment.
The main conflicts seem to be that MIRI is quick to point out that specific designs aren’t superwise, and that MIRI argues that AI designs in general aren’t superwise by default. But I don’t see how asserting that an AI is inherently wise, simply by virtue of being a superintelligence, meaningfully answers their working assumption that an AI has no wisdom except whatever wisdom has been deliberately designed into it. That’s why they care so much about deliberately designing wisdom!