FWIW, this also reminded me of some discussion in Paul’s post on capability amplification, where Paul asks whether we can even define good behavior in some parts of capability-space, e.g.:
The next step would be to ask: can we sensibly define “good behavior” for policies in the inaccessible part H? I suspect this will help focus our attention on the most philosophically fraught aspects of value alignment.
I’m not sure if that’s relevant to your point, but it seemed like you might be interested.