I don’t see how. I think an EDT agent would make the decision by simulating a bunch of worlds (or doing some analysis equivalent to this), then looking at the worlds where it, or agents like it, happened to make the message benign or malign to see what the humans do in those worlds. It would see no correlation between its decision and what the humans do, and would therefore end up making the message malign.
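Here is a minimal sketch of the kind of conditional-expected-utility calculation I have in mind. Everything in it is an illustrative assumption: the toy world model, the utility numbers, and the stipulation that the humans’ behavior is uncorrelated with the agent’s choice.

```python
import random

# Toy EDT-style evaluation: simulate many worlds, then condition on the agent's
# action and compare the conditional expected utilities. The world model and
# utilities below are placeholders, not a claim about how such an agent would
# actually be implemented.

def simulate_world(rng):
    # In each simulated world, the agent-like process happens to output a
    # benign or malign message, and the humans independently decide whether to
    # run it (no correlation between the two, by assumption).
    message = rng.choice(["benign", "malign"])
    humans_run_it = rng.random() < 0.5
    # Illustrative utilities from the message-sender's point of view.
    if humans_run_it and message == "malign":
        utility = 1.0
    elif humans_run_it and message == "benign":
        utility = 0.2
    else:
        utility = 0.0
    return message, utility

def edt_conditional_utility(action, n=100_000, seed=0):
    # EDT evaluates an action by conditioning on the worlds where that action
    # was in fact taken.
    rng = random.Random(seed)
    worlds = [simulate_world(rng) for _ in range(n)]
    samples = [u for (a, u) in worlds if a == action]
    return sum(samples) / len(samples)

for action in ["benign", "malign"]:
    print(action, edt_conditional_utility(action))
# Because the humans' behavior is uncorrelated with the action in this model,
# conditioning on "malign" gives the higher expected utility, so an agent with
# these (made-up) utilities ends up choosing the malign message.
```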
Actually, I’m not sure whether you mean “friendly” in the sense of FAI or the conventional usage.
By “unfriendly” I meant that running the alien AI results in something as bad as extinction. So my point was that if P(running the alien AI results in something as bad as extinction) > 1%, then this risk would more than cancel out the expected gain of 1% of our future light cone from running the alien AI (conditional on alien colonization being as good as human colonization), and I don’t see how we can get this probability below 1%.
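To make the arithmetic explicit (normalizing the value of our entire future light cone to 1; the probability of unfriendliness below is a placeholder, not an estimate):

```python
# Normalize the value of our entire future light cone to 1.
expected_gain = 0.01                 # 1% of the light cone, conditional on alien
                                     # colonization being as good as human colonization
p_unfriendly = 0.02                  # placeholder for P(running it is as bad as extinction)
expected_loss = p_unfriendly * 1.0   # extinction forfeits (roughly) the whole light cone

print(expected_gain, expected_loss)  # 0.01 vs 0.02
# Any p_unfriendly greater than 0.01 makes the expected loss exceed the expected gain.
```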