See my edit above. “Would use everything it has to...” is the kind of behaviour we want to avoid, so I’m following the satisficing intuition rather than the formal definition. I can justify this by going meta: when people design or imagine satisficers, they generally look around at the problem, see what can be achieved, how hard it is, etc., and then set the threshold. I want to automate “set a reasonable threshold” as well as “be a reasonable satisficer”, in order to achieve “don’t have a huge impact on the world”.
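To make the intuition concrete, here is a toy sketch of what “automate the threshold” could mean over a finite action set. Everything here is illustrative and my own assumption (the 0.8 fraction, the function names, the finite enumeration of actions), not a formal proposal:

```python
def satisfice(utility, actions, fraction=0.8):
    """Toy satisficer that sets its own threshold.

    fraction=0.8 is an arbitrary illustrative choice: the agent
    surveys what is achievable, sets a 'good enough' bar relative
    to that, and stops rather than optimizing to the maximum.
    """
    best = max(utility(a) for a in actions)   # "look around at the problem"
    threshold = fraction * best               # automate "set a reasonable threshold"
    # Return the first action that clears the bar: "be a reasonable satisficer".
    return next(a for a in actions if utility(a) >= threshold)
```

The point of the sketch is only that the threshold is derived from the surveyed option space rather than hard-coded by the designer; how to do this safely for an agent acting in the real world is exactly the open question.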