I don’t quite follow this. Assuming we’re using one of the universal priors based on Turing machine enumerations, then an agent which consists of 3^^^3threat+noability is much shorter and much more likely than an agent which consists of ~.10*3^^^3threat+ability. The more complex the threat, the less space there is for executing it.
If I disagree, it’s for a very minor reason, and with only a little confidence. (P(threat) is short for P(threat|no information about ability).) But you’re saying the case for P(threaten H) being bounded below (and its reciprocal being bounded above) is even stronger than I thought, right?
Another way to argue that P(threaten H) should be medium-sized: at least in real life, muggings have a time-limit. There are finitely many threats of a hundred words or less, and so our prior probability that we will one day receive such a threat is bounded below.
Another way to argue that the real issue is P(ability H): our AI might single you out and compute P(gwern will do H harm) = P(gwern will do H harm | gwern can do H harm) x P(gwern can do H harm). It seems like you have an interest in convincing the AI that P(gwern can do H harm) x H is bounded above.
I don’t quite follow this. Assuming we’re using one of the universal priors based on Turing machine enumerations, then an agent which consists of 3^^^3threat+noability is much shorter and much more likely than an agent which consists of ~.10*3^^^3threat+ability. The more complex the threat, the less space there is for executing it.
If I disagree, it’s for a very minor reason, and with only a little confidence. (P(threat) is short for P(threat|no information about ability).) But you’re saying the case for P(threaten H) being bounded below (and its reciprocal being bounded above) is even stronger than I thought, right?
Another way to argue that P(threaten H) should be medium-sized: at least in real life, muggings have a time-limit. There are finitely many threats of a hundred words or less, and so our prior probability that we will one day receive such a threat is bounded below.
Another way to argue that the real issue is P(ability H): our AI might single you out and compute P(gwern will do H harm) = P(gwern will do H harm | gwern can do H harm) x P(gwern can do H harm). It seems like you have an interest in convincing the AI that P(gwern can do H harm) x H is bounded above.