Second, because then we’d have to invest a lot of time explaining the logic behind each decision, or else face waves of criticism for decisions that appear arbitrary when one merely publishes the decision and not the argument.
Are the arguments not made during the board meetings? Or do you guys talk ahead of time and just formalize the decisions during the board meetings?
In any case, I think you should invest more time explaining the logic behind your decisions, and not just make the decisions themselves more transparent. If publishing board meeting minutes is not the best way to do that, then please think about some other way of doing it. I’ll list some of the benefits of doing this, in case you haven’t thought of some of them:
encourage others to emulate you and think strategically about their own choices
allow outsiders to review your strategic thinking and point out possible errors
assure donors and potential donors that there is good reasoning behind your strategic decisions
improve exchange of strategic ideas between everyone working on existential risk reduction
The arguments are strewn across dozens of conversations in and out of board meetings (mostly out).
As for finding other ways to explain the logic behind our decisions, I agree, and I’m working on it. One qualification I would add, however, is that I predict more benefit to my strategic thinking from one hour with Paul Christiano and one hour with Nick Bostrom than from spending four hours to write up my strategic thinking on subject X and publishing it so that passersby can comment on it. It takes a lot of effort to be so well-informed about these issues that one can offer valuable strategic advice. But for some X we have already spent those many productive hours with Christiano and Bostrom and so on, and it’s a good marginal investment to write up our strategic thinking on X.
This reminds me a bit of Eliezer’s excuse when he was resisting calls for him to publish his TDT ideas on LW:
Unfortunately this “timeless decision theory” would require a long sequence to write up
I suggest you may be similarly overestimating the difficulty of explaining your strategic ideas/problems to a sufficiently large audience to get useful feedback. Why not just explain them the same way that you would explain to Christiano and Bostrom? If some among the LW community don’t understand, they can ask questions and others could fill them in.
The decision theory discussions on LW generated significant progress, but perhaps more importantly created a pool of people with strong interest in the topic (some of whom ended up becoming your research associates). Don’t you think the same thing could happen with Singularity strategies?
I suggest you may be similarly overestimating the difficulty of explaining your strategic ideas/problems to a sufficiently large audience to get useful feedback...
Yes, I would get some useful feedback, but I also predict a negative effect: When people don’t have enough background knowledge to make what I say sound reasonable to them, I’ll get penalized for sounding crazy in the same way that I’m penalized when I try to explain AGI to an intuitive Cartesian dualist.
By penalized, I mean something like the effect that Scott Adams (author of Dilbert) encountered while blogging:
I hoped that people who loved the blog would spill over to people who read Dilbert, and make my flagship product stronger. Instead, I found that if I wrote nine highly popular posts, and one that a reader disagreed with, the reaction was inevitably “I can never read Dilbert again because of what you wrote in that one post.” Every blog post reduced my income, even if 90% of the readers loved it. And a startling number of readers couldn’t tell when I was serious or kidding, so most of the negative reactions were based on misperceptions.
Anyway, you also wrote:
The decision theory discussions on LW generated significant progress, but perhaps more importantly created a pool of people with strong interest in the topic (some of whom ended up becoming your research associates). Don’t you think the same thing could happen with Singularity strategies?
If so, then not for the same reasons. I think people got interested in decision theory because they could see results. But it’s hard to feel you’ve gotten a result in something like strategy, where we may never know whether or not one strategy was counterfactually better, or at least won’t be confident about that for another 5 years. Decision theory offers the opportunity for results that most people in the field can agree on.
The “results” in decision theory we’ve got so far are so tenuous that I believe their role is primarily to somewhat clarify the problem statement for what remains to be done (a big step compared to complete confusion in the past, but not quite clear (-ly motivated) math). The ratchet of science hasn’t clicked yet, even if rational evidence is significant, which is the same problem you voice for strategy discussion.
If so, then not for the same reasons. I think people got interested in decision theory because they could see results. But it’s hard to feel you’ve gotten a result in something like strategy, where we may never know whether or not one strategy was counterfactually better, or at least won’t be confident about that for another 5 years. Decision theory offers the opportunity for results that most people in the field can agree on.
At FHI they sometimes sit around a whiteboard and discuss weird AI-boxing ideas or weird acquire-relevant-influence ideas, and feel as though they are making progress when something sounds more-promising than usual, leads to other interesting ideas, etc. We could too. I suspect it would create a similar set of interested people capable of having strategy ideas, though probably less math-inclined than the decision theory folk, and with more surrounding political chaos.
Okay; that changes my attitude a bit. But FHI’s core people are unlikely to produce the Scott Adams effect in response to strategic discussion. Do you or Wei think it’s reasonable for me to worry about that when discussing strategy in detail amongst, say, LWers — most of whom have far less understanding of the relevant issues (by virtue of not working on them every week for months or years)?
I agree that detailed exploration of Singularity strategies would alienate some LW-ers, and some in the SingInst fan base. It is possible that this is reason enough to avoid such discussion; my guess is that it is not, but I could easily be wrong here, and many think it is.
I was mostly responding to the [paraphrased] “we can’t discuss it publicly because it would take too long”, and “it wouldn’t work to create an informed set of strategists because there wouldn’t be a sense of progress”; I’ve said sentences like that before, and, when I said them, they were excuses/rationalizations. My actual reason was something like: “I’d like to avoid alienating people, and I’d like to avoid starting conflicts whose outcomes I cannot predict.”
I agree that detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers.
It’ll alienate some SingInst-ers? That’s a troubling sign. Aren’t most SingInst-ers at least vaguely competent rationalists who are actually interested in Singularity options? Yet they will be alienated by mere theoretical exploration of the domain? What has your HR department been doing?
I agree that detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers.
From a public relations viewpoint this sentence alone is worse than any particular detail could possibly be. Because it not only allows, but forces people to imagine what horrible strategies you could possibly explore and pursue. Strategies that are bad enough that you not only believe that even the community most closely related to SI would be alienated by them, but also that you are unable to support those explorations with rational arguments.
Personally I don’t want to contribute anything to an organisation which admits to exploring strategies that are unacceptable to most people. And I wouldn’t suggest that anyone else do so. Nor would I be willing to contribute if you were secretive about your strategic explorations. I just don’t trust you people; I never did. And I am still horrified by how people who actually believe that what you are saying is true and possible are willing to blindly trust your small group to shape the universe.
A paperclip maximizer is just a transformation of the universe into a state of almost no suffering. But a friendly AI that isn’t quite friendly, or one that is biased by the ideas of a small group of abnormal and psychopathic people, could increase negative utility dramatically.
I agree that detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers.
From a public relations viewpoint this sentence alone is worse than any particular detail could possibly be.
No, I don’t agree with this. I predict that whatever strategies AnnaSalamon has in mind would alienate someone unless those strategies were very anodyne or vague. If the sample of listeners is big enough there will usually be someone to take issue with just about any idea one voices.
Because it not only allows, but forces people to imagine what horrible strategies you could possibly explore and pursue.
How true is that? In my case it just makes me try to imagine whether there are any strategies AnnaSalamon could propose that wouldn’t perturb anyone. When it comes to the singularity I draw a blank, as it’s a big enough issue that just about anything she or I or you could say about it will bother somebody.
I disagree that AS’s weak statement that “detailed exploration of Singularity strategies would alienate some LW-ers” tells you very much at all about the nature of those strategies. I expect most conceivable strategies would piss someone off, so I’d say her claim communicates less than 1 bit of information about those strategies.
Based on the rest of your comment I think you’ve read AnnaSalamon’s statement as one implying that SI’s strategies are unusually objectionable or alienating; maybe that’s what she meant but it doesn’t seem to be what she wrote.
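The “less than 1 bit” remark above can be cashed out with a quick surprisal calculation. The 90% figure below is an assumption for illustration only, not a number from the thread:

```python
import math

# If, say, 90% of conceivable strategies would alienate someone, then learning
# "this strategy would alienate someone" carries little surprisal:
p = 0.9               # assumed probability the statement holds for a random strategy
bits = -math.log2(p)  # self-information (in bits) of observing the statement
print(round(bits, 3))  # about 0.152 bits, well under 1
```

The nearer the statement is to a foregone conclusion, the closer its information content gets to zero.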
Based on the rest of your comment I think you’ve read AnnaSalamon’s statement as one implying that SI’s strategies are unusually objectionable or alienating;
Which is the right strategy. Humans are unfriendly. The group around AnnaSalamon is trying to take over and shape the universe according to their idea of what is right and good.
If you are making decisions based on the worst case scenario—as you are clearly doing when it comes to artificial intelligence, if you support friendly AI research—then you should do the same when it comes to human beings.
It isn’t enough to talk to them, to review their output and conclude that they are most likely friendly. Doing so and contributing money is akin to letting an AI that is not provably friendly out of the box. They either have to prove that they are friendly or make all their work transparent. Otherwise the right thing to do is to label them as terrorists and tell them to fuck off.
You could just as reasonably have written that comment if AnnaSalamon had never posted in this thread, though. My argument here isn’t with your broader attitude to FAI/SI, it’s that I think it’s unfair to pounce on a very low-information statement like “detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers” and write it off as terrible PR that implies SI’s considering horrible strategies.
...it’s unfair to pounce on a very low-information statement like “detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers”...
I think that it does convey quite a lot of information. I already know that people associated with SI and LW accept a lot of strategic thinking that would be considered everything from absurd to outright psychopathic within different circles. If she says that the strategies they explore would even alienate some people associated with LW, let alone SI, then that’s really bad.
I think you underestimate the amount of information that a natural language sentence can carry and signal.
...and write it off as terrible PR that implies SI’s considering horrible strategies.
It is abundantly clear that SI is really bad at PR. I assign a high probability to the possibility that she and other members of SI are revealing a lot of what is going on behind the scenes by being careless about their communication.
If she says that the strategies they explore would even alienate some people associated with LW, let alone SI, then that’s really bad.
I disagree. LWers have a range of opinions on AI & the singularity (yes, those opinions are less diverse than the general population’s, but I don’t see them being sufficiently less diverse for your argument to go through). There are already quite a few LWers who’re SI sceptics to a degree. I’m also sure there are LWers who, at the moment, basically agree with SI but would spurn it if it announced a more specific strategy for handling AI/the singularity. I think this would be true for most possible strategies SI could announce. I’d expect the same basic argument to hold for SI (though I’m less sure because I know less about SI).
I think you underestimate the amount of information that a natural language sentence can carry and signal.
Quite possible! But in any case, a sentence can carry lots of information about one thing, but not another. One has to look at the probability of a sentence or claim conditional on a specific thing. As I see it, P(AS says some people would be alienated | SI has a terrible secret strategy) is about equal to P(AS says some people would be alienated | SI has an un-terrible secret strategy), so the likelihood ratio is about one, and AnnaSalamon’s belief discriminates poorly between those two particular hypotheses.
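The odds-form arithmetic behind this argument can be made concrete with a toy calculation. All probabilities below are invented for illustration; none come from the discussion:

```python
# Toy illustration of the likelihood-ratio point above.
# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.

def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    return prior_odds * likelihood_ratio

# Assume the statement is about equally probable under both hypotheses:
p_given_terrible = 0.9    # P(AS says some would be alienated | terrible secret strategy)
p_given_unterrible = 0.9  # P(AS says some would be alienated | un-terrible secret strategy)

likelihood_ratio = p_given_terrible / p_given_unterrible  # = 1.0

prior = 0.1  # whatever prior odds you started with
print(posterior_odds(prior, likelihood_ratio))  # prints 0.1: no update
```

When the likelihood ratio is 1, the evidence leaves the prior odds untouched — which is exactly the “discriminates poorly between those two particular hypotheses” claim.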
It is abundantly clear that SI is really bad at PR. I assign a high probability to the possibility that she and other members of SI are revealing a lot of what is going on behind the scenes by being careless about their communication.
Plausible, but I doubt it’s true for this specific example.
As I see it, P(AS says some people would be alienated | SI has a terrible secret strategy) is about equal to P(AS says some people would be alienated | SI has an un-terrible secret strategy), so the likelihood ratio is about one...
If I were to accept your estimation, then the utilities associated with P(people alienated | terrible strategy) and P(people alienated | un-terrible strategy) would force you to act according to the first possibility.
I don’t follow. Do you mean that the potential disutility of SI having a terrible strategy is so much bigger than the potential utility of SI having an un-terrible strategy that, given equal likelihoods, I should act against SI? If so, I disagree.
Quite possible! But in any case, a sentence can carry lots of information about one thing, but not another. One has to look at the probability of a sentence or claim conditional on a specific thing. As I see it, P(AS says some people would be alienated | SI has a terrible secret strategy) is about equal to …
Blah blah blah...full stop. We’re talking about the communication of primates with other primates. Evolution honed your skills to detect the intention and possible bullshit in the output of other primates. Use your intuition!
I disagree. LWers have a range of opinions on AI & the singularity …
I am not sure what you are getting at. If she thinks that there are strategies that should be kept secret for political reasons or whatever, and admits it, that’s bad from any possible viewpoint.
I have. My gut didn’t raise a red flag when I read AnnaSalamon’s post, but it did when I read yours.
I am not sure what you are getting at.
I was giving a reason for my claim that there’d be someone on LW/in SI who’d be alienated by all but the blandest of strategies.
If she thinks that there are strategies that should be kept secret for political reasons or whatever and admits it, that’s bad from any possible viewpoint.
Maybe she thinks that and maybe she doesn’t, but either way she didn’t admit it. (At least not in the post I’m talking about. I haven’t read AS’s whole comment history.)
To my intuitions you sound exactly like a bitter excluded nobody attacking someone successful and popular. You DON’T talk like someone who sees through the lies of an evil greedy deceiver and honestly wants people to examine what he says and come to the correct opinion.
It isn’t enough to talk to them, to review their output and conclude that they are most likely friendly. Doing so and contributing money is akin to letting an AI that is not provably friendly out of the box. They either have to prove that they are friendly or make all their work transparent. Otherwise the right thing to do is to label them as terrorists and tell them to fuck off.
I think the “mostly harmless” phrase still applies. These look like kids with firecrackers. The folk we should watch out for are more likely to be the Chinese, the military, hedge funds—and so on.
Maybe you can give an example of the kind of thing that you’re worried about? What might you say that could get you penalized for sounding crazy? (Maybe we could take this discussion private; I’m also curious what kinds of questions these considerations apply to.)
Could get them penalized for sounding crazy? Those people believe in the possibility of heaven and hell, and believe that merely thinking about decision- and game-theoretic conjectures might be dangerous.
Right, better to hide in your ivory tower only talking to people who agree with you. A perfect recipe to reinforce crazy ideas and amplify any biases.