Nick Hay, a CEV might also need to gather more information, nondestructively and unobtrusively. So even before the first overwrite you need a fair amount of FAI content just so it knows what’s valuable and shouldn’t be changed in the process of looking at it, though before the first overwrite you can afford to be conservative about how little you do. But English sentences don’t work, so “look but don’t touch” is not a trivial criterion (it contains magical categories).
Wei Dai, I thought that I’d already put in some substantial work in cashing out the concept of “moral growth” in terms of a human library of non-introspectively-accessible circuits that respond to new factual beliefs and to new-found arguments from a space of arguments that move us, as well as changes in our own makeup thus decided. In other words, I thought I’d tried to cash out “reflective equilibrium” in naturalistic terms. I don’t think I’m finished with this effort, but are you unsatisfied with any of the steps I’ve taken so far? Where?
All, a Friendly AI does not necessarily maximize an object-level expected utility.