Hm. This is an intriguing point. I thought that by “maximize the actual outcome according to its own criteria of optimality” you meant maximizing U, which matches my understanding of what an Oracle would do. But instead you mean it would produce plans so as to maximize P, rather than producing plans that would maximize P if implemented. Is that about right?
I guess you’d have to produce a list of plans such that each would yield a high value of P if selected (which folds in the expectation that it would be successfully implemented once selected), conditional on it appearing on the list alongside all the other plans… you wouldn’t necessarily have to worry about other influences the plan list might have, would you?
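(Trying to formalize my own question, with notation that’s mine rather than Stuart_Armstrong’s: the contrast seems to be between picking each plan as $p^* = \arg\max_p \mathbb{E}[P \mid p \text{ is implemented}]$, versus picking the whole list as $L^* = \arg\max_L \mathbb{E}[P \mid \text{the operators are shown } L]$, where the second expectation already folds in the operators’ viewing the list, choosing a plan, and possibly botching the implementation.)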
Perhaps a more concrete example would help:
Suppose we ask the AI to advise us on building a sturdy bridge over some river (valuing both sturdiness and bridgeness, and probably other things like speed of construction). Stuart_Armstrong’s version would select a list of plans such that, given that the operators will view the list, the AI predicts that if they select any one of the plans, they will successfully build a sturdy bridge (or a sturdy bridge will otherwise come into being). I admit I find the subject a little confusing, but does that sound about right?
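To check my own understanding, here is a toy numerical sketch of the distinction. Everything in it (the candidate plans, the made-up numbers, and the factored model of “P if implemented” times “chance the operators pull it off”) is my own hypothetical illustration, not Stuart_Armstrong’s actual formalism:

```python
# Hypothetical toy model: each plan maps to
#   (P-value of the outcome if successfully implemented,
#    predicted probability the operators implement it successfully,
#    given that they select it).
plans = {
    "steel truss bridge": (0.9, 0.5),   # great bridge, hard to build
    "rope bridge":        (0.6, 0.95),  # mediocre bridge, easy to build
    "pontoon bridge":     (0.7, 0.7),
}

# Reading 1: rank plans by how well they'd score *if implemented*.
best_if_implemented = max(plans, key=lambda name: plans[name][0])

# Reading 2 (the one I take Stuart_Armstrong's version to use):
# rank plans by expected P *given selection*, which folds in whether
# the operators would actually succeed at building it.
# (Assume P = 0 if implementation fails.)
best_if_selected = max(plans, key=lambda name: plans[name][0] * plans[name][1])

print(best_if_implemented)  # steel truss bridge (0.9 beats 0.6 and 0.7)
print(best_if_selected)     # rope bridge (0.6 * 0.95 = 0.57 is the largest)
```

On these made-up numbers the two readings come apart: the steel truss bridge looks best if you score plans assuming implementation, but the rope bridge looks best once you weight by whether we’d actually manage to build it.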