nothing short of death can stop me from trying to do good.
the world could destroy or corrupt EA, but i’d remain an altruist.
it could imprison me, but i’d stay focused on alignment, as long as i could communicate with at least one person on the outside.
even if it tried to kill me, i’d continue in the paths through time where i survived.
I upvoted because I imagine more people reading this would slightly nudge group norms in a direction that is positive.
But being cynical:
I’m sure you believe that this is true, but I doubt that it is literally true.
Signalling this position is very low risk when the community is already on board.
Trying to do good may be insufficient if your work on alignment ends up being dual use.
Never say ‘nothing’ :-)
the world might be in such a state that attempts to do good lead to some failure instead, while doing the opposite is prevented by society
(the rise of AI, and the blame or credit the rationality movement takes for it, perhaps?)
what if, on some numerical scale, the world gave you the option “with 50% probability, double your goodness score; otherwise, lose almost everything”? Maximizing EV on this is very dangerous...
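To make the worry concrete, here is a minimal sketch of the arithmetic. It rests on assumptions that are not in the comment: “lose almost everything” is modelled as keeping 1% of the score, and the same bet is offered repeatedly. The function name `repeated_bet_stats` and all of its parameters are hypothetical.

```python
import math


def repeated_bet_stats(rounds, p_win=0.5, win_mult=2.0, loss_mult=0.01):
    """Summarise taking a multiplicative bet `rounds` times in a row.

    loss_mult=0.01 is an assumed stand-in for "lose almost everything".
    """
    # Expected multiplier of a single round; above 1 means EV says "take it".
    ev_per_round = p_win * win_mult + (1 - p_win) * loss_mult
    # Rounds are independent, so the expected final multiplier is a power.
    expected_multiplier = ev_per_round ** rounds
    # Probability of winning every single round.
    p_never_lose = p_win ** rounds
    # Expected log-growth per round; negative means typical outcomes shrink.
    log_growth_per_round = (
        p_win * math.log(win_mult) + (1 - p_win) * math.log(loss_mult)
    )
    return {
        "ev_per_round": ev_per_round,
        "expected_multiplier": expected_multiplier,
        "p_never_lose": p_never_lose,
        "log_growth_per_round": log_growth_per_round,
    }


if __name__ == "__main__":
    for key, value in repeated_bet_stats(rounds=10).items():
        print(f"{key}: {value:.6g}")
```

With these assumed numbers every round has an expected multiplier above 1, so a pure EV maximizer accepts each time. Yet the chance of never hitting the losing branch in ten rounds is 0.5^10, roughly 0.1%, and the expected log-growth per round is negative.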
“No! Try not! Do, or do not. There is no try.”
—Yoda
Trying to try
if i dropped the word ‘trying’ in order to follow that usage, nothing about me would change, but there would be more comments saying that success is not certain.
i also disagree with the linked post[1], which says that ‘i will do x’ means one will set up a plan to achieve the highest probability of x they can manage. i think it instead usually means one believes they will do x with high enough probability that the chance of failure is not worth mentioning.[2] the post acknowledges the first half of this -- «Well, colloquially, “I’m going to flip the switch” and “I’m going to try to flip the switch” mean more or less the same thing, except that the latter expresses the possibility of failure.» -- but fails to account for the fact that saying something implies a belief that it is relevant or important, and so it concludes that the word ‘try’ (or, by extrapolation, any expression of the possibility of failure) is unnecessary in general.
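though its psychological point seems true:
«But if all you want is “maximize the probability of success using available resources”, then that’s the easiest thing in the world to convince yourself you’ve done.»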
[2] this is why the wording ‘i will do x’ is not used when the probability of success is sufficiently far (in percentage points, not logits) from 100%.
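The distinction between percentage points and logits can be illustrated with a small sketch. This is my reading of the parenthetical, not something the comment spells out; the helper names `logit` and `gaps` are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def gaps(p, q):
    """Return (percentage-point gap, logit gap) between two probabilities."""
    return (abs(p - q) * 100, abs(logit(p) - logit(q)))


if __name__ == "__main__":
    pp_gap, logit_gap = gaps(0.99, 0.999)
    print(f"percentage points: {pp_gap:.3f}, logits: {logit_gap:.3f}")
```

Under this reading, 99% and 99.9% are less than one percentage point apart, so both would count as close enough to certain to say ‘i will do x’, even though they are more than two logits apart.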
I think the post was a deliberate attempt to overcome that psychology. The issue is that you can get stuck in loops of “trying to try” and convince yourself that you did enough. This is tricky because that step is very easy to rationalise for the sake of comfort.
Compare setting up to win with trying to set up to win.
The latter is much easier to do than the former. The former still implies a chance of failure, but you actually try to do your best rather than try to try to do your best.
I think this sounds convoluted; maybe there is a much simpler cognitive algorithm for overcoming this tendency.