These recent events have me thinking the opposite: policy and cooperation approaches to making AI go well are doomed. While many people are starting to take AI risk seriously, not enough are, and those who are worried will fail to restrain those who aren't (where not being worried is a consequence of humans often being quite insane when incentives are at play). The hope lies in somehow developing enough useful AI theory that leading labs adopt it and, as a result, build an aligned AI even though they never believed they were on track to cause AGI ruin.
And so maybe let's just get everyone to focus on the technical stuff. That's actually more doable than wrangling other people into not building unsafe stuff.
That largely depends on where AI safety’s talent has been going, and could go.
I'm thinking that most of the smarter quant thinkers have been doing AI alignment 8 hours a day and probably won't succeed, especially without access to AI architectures that haven't been invented yet, and most of the people researching policy and cooperation weren't our best.
If our best quant thinkers are doing alignment research for 8 hours a day with systems that probably aren't good enough to extrapolate to the crunch-time systems, and our best thinkers haven't been researching policy and coordination (e.g. historically unprecedented coordination takeoffs), then the expected hope from policy and coordination is much higher than it currently looks, and our best quant thinkers should be doing policy and coordination during this period. Even if we're four years away, they can mostly do human research for freshman and sophomore year and go back to alignment research for junior and senior year. Same if we're two years away.