The first issue in my mind is that having their staff leave for 3 months straightforwardly messes up the business plans of those companies and labs. Leadership will by default be angry, will work hard to discredit you, and will perhaps threaten to fire anyone who accepts your deal.
My yes-and thought here is “okay, can we do this in a way that the various labs will feel is exciting, or win-win?”. Rather than making the deal with individuals, somehow frame it as a partnership with the companies.
“Why is management so insistent I don’t go do this alignment thing if it won’t actually change my view?”
This sounds like a wonderful way to invoke the Streisand effect and cause an internal political upheaval. I would love to see DeepMind leadership overreact to this kind of outreach and attempt to suppress a bunch of Google AI researchers’ moral compunctions through heavy-handed threats. Just imagine someone doing this if Google researchers had concerns about racial bias in their algorithms. It really seems like the dream scenario.
I want to explicitly mark this (in my mind) as “babble” of the sort that is “Hufflepuff Bones” from HPMOR. I don’t want to have this relationship with the leadership at AI labs. I’m way more interested in finding positive sum actions to take (e.g. trades that AI labs are actively happy about) than adversarial moves.
I want them to stop lighting enormous fires next to open fuel repositories, which is unfortunately what these “leaders” are currently paid millions of dollars to do. Given that, I’m not sure what option there is. They are already adversaries. Nonprofit leadership probably doesn’t have much of a vested interest and we should definitely talk to them, but these guys? Seriously, what strategy remains?
Man, wtf is this argument? Yes, you should talk to leaders in industry. I have not yet tried the “work together” strategy anywhere near hard enough to be willing to set it on fire at this moment. I don’t think such people see themselves as adversaries, I don’t think we have been acting like adversaries, and I think this has allowed us to have a fair amount of conversation and cooperation with them.
I feel a bit like you’re saying “a new country is stockpiling nukes, so let’s make sure to quickly stop talking to them”. We’re in a free-market economy; everyone is incentivized to build these companies, not just the people leading the existing ones. That’s like most of the problem.
I think it’s good and healthy to think about your BATNA and figure out your negotiating position, so thinking about this seems good to me. But just because you might be able to exercise unilateral control doesn’t mean you should; it lowers trust and everyone’s ability to work together on anything.
I’m not confident here, and maybe I should already have given up after OpenAI was founded, but I’m not ready to call everyone adversaries; it’s pretty damn hard to backtrack on that, and it makes conversation and coordination way, way harder.
I think my problem is that I sometimes use “moral culpability” as some sort of proxy for “potential for positive outcomes following dialogue”. I should reiterate that it was always my opinion that we should be doing more outreach to industry leaders, even if my hopes are low, especially if it turns out we haven’t really tried it.
Edit: After further thought, I also think the frustration I have with this attitude is:
1. We’re not going to convince everybody.
2. Wild success means diverting significant (though not necessarily critical) amounts of resources (human, monetary, etc.) away from AI capabilities research and toward other, less dangerous things.
3. Less AI capabilities research dries up the very-short-term money. Someone from #1 whom we can’t convince, or who just doesn’t care, is going to be mad about this.
So it’s my intuition that, if you’re not willing to annoy e.g. DeepMind’s executive leadership, you are basically unable to commit to any strategy with a chance of working. It sucks too because this is the type of project where one bad organization will still end up killing everybody else, eventually. But this is the problem that must be solved, and it involves being willing to piss some people off.
I am not sure what the right stance is here, and your points seem reasonable. (I am willing to piss people off.)
“Why is management so insistent I don’t go do this alignment thing if it won’t actually change my view?”
There are already perfectly ordinary business reasons for management to not want to lose all their major staff for three months (especially if some of them are needed to fulfill contractual obligations with existing customers), and the employees have signed job contracts that probably do not include a clause allowing them to just take three months off without permission. So the social expectation is more on the side of “this would be a major concession from management that they’re under no obligation to grant” than “management would be unreasonable not to grant this”.
Thinking out loud about how to address this issue:
It doesn’t have to be the same 3 months for everyone. You could stagger this over the course of 2 years, with an eighth of the relevant researchers taking a sabbatical to give this challenge a shot at any given time.
This also means that they can build on each other’s work, in serial, if they’re in fact making progress, and can see how others have failed to make progress, if they’re not.
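To make the arithmetic concrete: 2 years of 3-month sabbaticals gives 8 non-overlapping slots, so roughly an eighth of the researchers would be away at any given time. Here is a minimal sketch of such a rotation, assuming a purely hypothetical headcount and start date (neither is specified above):

```python
from datetime import date

COHORTS = 8                  # 24 months / 3-month sabbaticals = 8 cohorts
SABBATICAL_MONTHS = 3
# Hypothetical headcount -- the proposal above doesn't specify one.
researchers = [f"researcher_{i:03d}" for i in range(200)]

def cohort_of(index: int) -> int:
    """Assign each researcher to one of the rotating cohorts."""
    return index % COHORTS

def sabbatical_window(cohort: int, start: date) -> tuple[date, date]:
    """Return the first day and the day-after-last of a cohort's 3-month slot."""
    begin_month = (start.month - 1) + cohort * SABBATICAL_MONTHS
    begin = date(start.year + begin_month // 12, begin_month % 12 + 1, 1)
    end_month = begin_month + SABBATICAL_MONTHS
    end = date(start.year + end_month // 12, end_month % 12 + 1, 1)
    return begin, end

schedule = {
    name: sabbatical_window(cohort_of(i), date(2023, 1, 1))
    for i, name in enumerate(researchers)
}
print(schedule["researcher_000"])   # cohort 0: 2023-01-01 to 2023-04-01
print(schedule["researcher_007"])   # cohort 7: 2024-10-01 to 2025-01-01
```

The cohort assignment itself is trivial; the interesting part of the proposal is the serial hand-off, where each cohort starts from the previous cohort’s results.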
Maybe alternatively a big EA actor could just buy some big AGI labs?
That wouldn’t mess up schedules that badly.
Google’s entire stock value is AGI.