I’m really enjoying all these posts, thanks a lot!
Thank you for saying that. :)
Wouldn’t it be simpler to say that righteous indignation is a rewarding feeling (in the moment) and we’re motivated to think thoughts that bring about that feeling?
Well, in my model there are two layers:
1) first, the anger is produced by a subsystem which is optimizing for some particular goal
2) if that anger looks like it is achieving the intended goal, then positive valence is produced as a result; that is experienced as a rewarding feeling that e.g. craving may grab hold of and seek to maintain
That said, the exact reason why anger is produced isn’t really important for the example and might just be unnecessarily distracting, so I’ll remove it.
Agreed, and this is one of the reasons that I think normal intuitions about how agents behave don’t necessarily carry over to self-modifying agents whose subagents can launch direct attacks against each other; see here.
Agreed.
Yeah, just like every other subsystem, right?
Yep! Well, some subsystems seem to do actual forward planning as well, but of course that planning is based on cached models.