I’m not conditioning on prior claims.
One potential reason you might have inferred that I was is that my credence for scheming is so high relative to what you might have expected given my other claim about “serious misalignment”. My explanation is that I tend to interpret “AI scheming” as a relatively benign behavior, in context. If we define scheming as:
(1) behavior intended to achieve some long-term objective that is not quite what the designers had in mind, and
(2) not being fully honest with the designers about its true long-term objectives (especially in the sense of accurately describing what it would do with unlimited power),
then I think scheming is ubiquitous and usually relatively benign when performed by rational agents without godlike powers. For example, humans likely “scheme” all the time by (1) pursuing long-term plans and (2) not being fully honest with others about what they would do if they became god. This is usually not a big issue because agents rarely get the chance to take over the world and execute a treacherous turn; instead, they have to play the game of compromise and trade like the rest of us, alongside all the other scheming AIs, who have different long-term goals.