There is a massive difference between a technique not working and a technique being way less likely to work.
A: 1% chance of working given that we get to complete it, doesn’t kill everyone before completing
B: 10% chance of working given that we get to complete it, 95% chance of killing everyone before completing
You pick A here. You can’t just ignore the “implement step produces disaster” bit. Maybe we’re not in this situation (obviously it changes based on what the odds of each bit actually are), but you can’t just assume we’re not in this situation and say “Ah, well, B has a much higher chance of working than A, so that’s all, we’ve gotta go with B”.
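(For concreteness, a minimal sketch of the arithmetic, using the numbers stated above; treating “doesn’t kill everyone before completing” as a ~100% chance of safely completing A is an assumption for illustration, not something the dialogue states exactly:)

```python
# Minimal sketch of the A-vs-B comparison. Numbers are the ones given
# in the dialogue; p_complete_a = 1.0 is an illustrative assumption.
p_complete_a = 1.00              # A: completes without catastrophe
p_work_given_complete_a = 0.01   # A: 1% chance of working once complete

p_complete_b = 0.05              # B: 95% chance of killing everyone first
p_work_given_complete_b = 0.10   # B: 10% chance of working once complete

p_success_a = p_complete_a * p_work_given_complete_a  # = 0.010
p_success_b = p_complete_b * p_work_given_complete_b  # = 0.005

print(f"A: {p_success_a:.3f}, B: {p_success_b:.3f}")
```

Under these numbers A’s overall success probability is twice B’s, and B additionally kills everyone 95% of the time.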
Are you advocating as option A, ‘deduce a full design by armchair thought before implementing anything’? The success probability of that isn’t 1%. It’s zero, to as many decimal places as makes no difference.
We’re probably talking past each other. I’m saying “no you don’t get to build lots of general AIs in the process of solving the alignment problem and still stay alive” and (I think) you’re saying “no you don’t get to solve the alignment problem without writing a ton of code, lots of it highly highly related to AI”. I think both of those are true.
Right, yes, I’m not suggesting the iterated coding activity can or should include ‘build an actual full-blown superhuman AGI’ as an iterated step.