TurnTrout comments on TurnTrout’s shortform feed

TurnTrout 26 Dec 2023 21:20 UTC
LW: 6 AF: 4
Note that “LLMs are evidence against this hypothesis” isn’t my main point here. The main claim is that the positive arguments for deceptive alignment are flimsy, and thus the prior is very low.