The C does induct in a non-trivial way and the result is friendly, but only one or two steps of the induction are actually needed.
I’m curious what you’re imagining here—I don’t really know why this would happen or what it would look like. Is it something like “this agent makes a successor that is fully friendly and powerful given resource constraints”?
I’m thinking something like “this utility function is friendly, once we have solved these n specific problems; let’s create a few levels of higher intelligence, to solve these specific problems, using certain constraints (physical or motivational) to prevent things going wrong during this process”.