Hi, I’m always fascinated by people with success probabilities that aren’t either very low or ‘it’ll probably be fine’.
I have this collection of intuitions (no more than that):
(1) ‘Some fool is going to build a mind’,
(2) ‘That mind is either going to become a god or leave the fools in position to try again, repeat’,
(3) ‘That god will then do whatever it wants’.
It doesn’t seem terribly relevant these days, but there’s another strand that says:
(4) ‘We have no idea how to build minds that want specific things’, and
(5) ‘Even if we knew how to build a mind that wanted a specific thing, we have no idea what would be a good thing’.
These intuitions don’t leave me much room for optimism, except in the sense that I might be hopelessly wrong and, in that case, I know nothing and I’ll default back to ‘it’ll probably be fine’.
Presumably you’re disagreeing with one of (1), (2), or (3) and one of (4) or (5).
Which ones, and where does the 30% come from?
I believe that we might solve alignment in time, and that aligned AI will protect us from unaligned AI. I’m not sure how to translate that into your 1-3 (the “god” will do whatever it wants, but it will want what we want, so there’s no problem). In terms of 4-5, I guess I disagree with both, or rather I disagree that this state of ignorance will necessarily persist.
Neat, so in my terms you think we can pull off 4 and 5, and get it all solid enough to set running before anyone else does 1-3?
4 and 5 have always looked like the really hard bits to me, and not the sort of thing that neural networks would necessarily be good at, so good luck!
But please be careful to avoid fates-worse-than-death by getting it almost right but not quite right. I’m reasonably well reconciled with death, but I would still like to avoid doing worse if possible.