ryan_greenblatt comments on Counting arguments provide no evidence for AI doom

ryan_greenblatt 29 Feb 2024 0:34 UTC
LW: 8 AF: 6
I found the explanation at the point where you introduce $b$ confusing.

Here’s a revised version of the text there that would have been less confusing to me (assuming I haven’t made any errors):
- Complexity of simplest deceptive objective: $l + b$, where $l$ is the number of bits needed to select the part of the objective space that contains only long-run objectives and $b$ is the additional number of bits required to select the simplest long-run objective. In other words, $b$ is the minimum number of bits required to pick out a particular objective (the simplest one) from among all of the deceptive objectives.
- We’re assuming that $a < l + b$, but that $l < a$. That is, the simplest aligned objective is simpler than the simplest deceptive objective, but the measure on the set of all long-run objectives is higher than the measure on the (simplest) aligned objective.
- Casting into infinite bitstring land, we see that the set of aligned objectives includes those with anything after the first $a$ bits, whereas the set of deceptive objectives includes anything after the first $l$ bits (as all of these are long-run objectives, though they differ). Even though you don’t get a full program until you’re $l + b$ bits deep, the complexity here is just $l$, because all the bits after the first $l$ bits aren’t pinned down. So if we’re assuming that $l < a$, then deception wins.
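
To make the measure comparison above concrete, here is a minimal sketch that brute-forces the fraction of bitstrings starting with a given prefix. The values of `a`, `l`, and `b`, the two prefixes, and the helper `prefix_measure` are all hypothetical and chosen only so that $a < l + b$ and $l < a$ hold; none of them come from the post.

```python
from fractions import Fraction
from itertools import product


def prefix_measure(prefix: str, depth: int) -> Fraction:
    """Fraction of all length-`depth` bitstrings that start with `prefix`."""
    total = 0
    hits = 0
    for bits in product("01", repeat=depth):
        total += 1
        if "".join(bits).startswith(prefix):
            hits += 1
    return Fraction(hits, total)


# Hypothetical toy values (assumptions for illustration only).
a, l, b = 5, 3, 4
assert a < l + b and l < a

# Arbitrary, non-overlapping prefixes standing in for
# "the simplest aligned objective" and "any long-run objective".
aligned_prefix = "1" * a
long_run_prefix = "0" * l

# Enumerate strings as deep as the simplest full deceptive program.
depth = l + b

aligned = prefix_measure(aligned_prefix, depth)
deceptive = prefix_measure(long_run_prefix, depth)

print(f"aligned measure:   {aligned}")    # equals 2**-a
print(f"deceptive measure: {deceptive}")  # equals 2**-l
```

With these toy numbers the long-run prefix covers a larger share of bitstrings than the aligned prefix, even though the simplest complete deceptive program ($l + b$ bits) is longer than the simplest aligned one ($a$ bits). This only illustrates the arithmetic of the bullets above; it says nothing about whether the assumptions themselves hold.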
- evhub 29 Feb 2024 1:10 UTC
  LW: 4 AF: 4
  Yep, I endorse that text as being equivalent to what I wrote; sorry if my language was a bit confusing.