I read the whole thing, glad I did. It really makes me think that many of AI safety’s best minds are doing technical work like alignment research 8 hours a day, when it would be better for them to do 2 hours a day to keep their skills honed and spend 6 hours a day acting as generalists, thinking through the most important problems of the moment.
They should have shared their reasons/excuses for the firing. (For some reason, in politics and corporate politics, people try to be secretive all the time, and this seems to me to be very stupid in like 80+% of cases, including this one.)
Hard disagree in the OpenAI case. I’m putting >50% that they were correctly worried about people correctly deducing all kinds of things from honest statements, because AI safety is unusually smart and Bayesian. There are literally prediction markets here.
I’m putting >50% on that alone; also, if the true reason was anything super weird, e.g. Altman accepting bribes or cutting deals with NSA operatives, then it would also be reasonable not to share it, even if AI safety didn’t have tons of high-agency people who make it like herding cats.
That this makes it a lot harder for our cluster to be trusted to be cooperative/good faith/competent partners in things...
If the things you want people to do differently are costly, e.g. your safer AI is more expensive, but you are seen as untrustworthy, low-integrity, low-transparency, and low in political competence, then I think you’ll have a hard time getting buy-in for it.
I think this gets into the complicated issue of security dilemmas; AI safety faces a tradeoff between sovereignty and trustworthiness, since groups that are more powerful and sovereign carry a risk of betraying their allies and/or going on the offensive (effectively a discount rate, since the risk accumulates over time), but a group without enough sovereignty can’t defend itself against infiltration and absorption.
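To spell out the “discount rate” parenthetical with an illustrative number (mine, purely made up, not anything claimed above): if a more sovereign group has some small per-year probability p of betraying its allies or going on the offensive, the chance it has stayed cooperative after t years shrinks geometrically, which is what makes the accumulating risk act like a discount rate on how much you can trust it:

\[
\Pr(\text{still cooperative after } t \text{ years}) = (1-p)^t, \qquad \text{e.g. } p = 0.02,\; t = 20 \;\Rightarrow\; 0.98^{20} \approx 0.67 .
\]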
These recent events have me thinking the opposite: policy and cooperation approaches to making AI go well are doomed. While many people are starting to take AI risk seriously, not enough are, and those who are worried will fail to restrain those who aren’t (where not being worried is a consequence of humans often being quite insane when incentives are at play). The hope lies in somehow developing enough useful AI theory that leading labs adopt it and, as a result, build an aligned AI even though they never believed they were going to cause AGI ruin.
And so maybe let’s just get everyone to focus on the technical stuff. Actually more doable than wrangling other people to not build unsafe stuff.
That largely depends on where AI safety’s talent has been going, and could go.
I’m thinking that most of the smarter quant thinkers have been doing AI alignment 8 hours a day and probably won’t succeed, especially without access to AI architectures that haven’t been invented yet, and that most of the people researching policy and cooperation weren’t our best.
If our best quant thinkers are doing alignment research for 8 hours a day with systems that probably aren’t good enough to extrapolate to the crunch-time systems, and our best thinkers haven’t been researching policy and coordination (e.g. historically unprecedented coordination takeoffs), then the expected hope from policy and coordination is much higher, and our best quant thinkers should be doing policy and coordination during this time period. Even if we’re 4 years away, they can mostly do human research for freshman and sophomore year and go back to alignment research for junior and senior year. Same if we’re two years away.
The situation with slow takeoff means that historically unprecedented things will happen, and it’s not clear what the correct course of action is for EA and AI safety. I’ve argued that targeted influence is already a significant risk, both because the social media paradigm is already quite good at manipulating humans by default and because major governments and militaries are already interested in using AI for information warfare. But that’s only one potential facet of the sovereignty-tradeoff problem, and it’s only going to get more multifaceted from here; which is why we need more Rubys and Wentworths spending more hours on the problem.