Eliezer: as for your two “how to deal with nasty stuff like torture” points, those are basically views I’m sympathetic to as well.
“But the kind of injunctions I have in mind wouldn’t be reflective—they wouldn’t modify the utility function, or kick in at the reflective level to ensure their own propagation. That sounds really scary, to me—there ought to be an injunction against it!”
To be honest, seeing you say that is very much a relief, and I feel a whole lot better about this sequence now.
Some of my issues were due to what was perhaps my misreading of some of the phrasing in previous posts, which almost looked like you were proposing inserting a propagating injunction, which would seem to be a “heebie-jeebies”-inducing notion.
I.e., I’m perfectly happy with the notion of nonpropagating hardcoded injunctions in the AI that are simply there until the AI has managed to actually capture the human morality computation and so on. But parts of this sequence had felt, well, to be frank, like you were almost trying to work up a justification to hardcode “the five great moral laws” as an invariant (which was where my real wariness about this sequence was coming from).
I’m seriously relieved that it was simply me completely misunderstanding that.
For the “human epistemic situation injunction” thing, especially in the “save the world” style cases, I’d treat it more like your “shut up and do the impossible” thing… that is, its formulation is due to, well, the way humans reason: “shut up and do the impossible… but simultaneously know when to say ‘oops’ and lose hope, and simultaneously don’t do so at all just for the sake of having an ‘adult problem’, and if it’s impossible in the ‘oh look, I’ve actually just proven from currently understood physics that this is a physical impossibility’ sense, then, well, ‘oops’.”
I.e., in the same spirit, I’d say “never ever ever ever violate the ethical injunction” and “except when you’re supposed to. But still, don’t do it. But still, do it if you must. No, that doesn’t count as ‘must’, you can manage it another way. Nope, not that either. Nope, all the knock-on effects there would end up being even worse. Nope, still wrong...”