I low-confidence think the context strengthens my initial impression. Paul prefaced the above quote as “maybe the simplest [reason for AIs to learn to behave well during training, but then when deployed or when there’s an opportunity for takeover, they stop behaving well].” This doesn’t make sense to me, but I historically haven’t understood Paul very well.
I low-confidence think the context strengthens my initial impression. Paul prefaced the above quote as “maybe the simplest [reason for AIs to learn to behave well during training, but then when deployed or when there’s an opportunity for takeover, they stop behaving well].” This doesn’t make sense to me, but I historically haven’t understood Paul very well.
EDIT: Hedging