> The solution to the “large overhead” problem is to amortize the cost of the human simulation over a large number of English sentences and predictions.
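To make the arithmetic of that proposal concrete, here's a toy sketch of the amortization idea (the overhead and marginal figures are invented for illustration, not claims about real costs):

```python
# Toy illustration of the amortization argument: a large fixed overhead
# (encoding the human simulation) spread over n sentence/prediction pairs,
# each of which adds only a small marginal cost.
# All numbers are invented for illustration.

FIXED_OVERHEAD_BITS = 10**9   # hypothetical cost of encoding the simulator
MARGINAL_BITS = 200           # hypothetical cost of one more short sentence

def per_prediction_cost(n: int) -> float:
    """Average description-length cost per prediction after n predictions."""
    return FIXED_OVERHEAD_BITS / n + MARGINAL_BITS

for n in (1, 10**3, 10**6, 10**9):
    print(f"n = {n:>13,}: {per_prediction_cost(n):,.1f} bits/prediction")

# As n grows, the average cost approaches the marginal cost (200 bits here),
# which is the sense in which the overhead gets "amortized away".
```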
That seems a fair approach in general (it addresses how we can use the program efficiently/profitably), but I don’t think it answers the question in the OP. I think it actually implies the opposite effect: as you go through more layers of abstraction you get more and more complexity (i.e. simplicity doesn’t hold across layers of abstraction). That’s why the strategy you mention needs to operate over ever larger problem spaces to make sense.
So this would still mean most of our reasoning about Occam’s Razor wouldn’t apply to SI.
> A short English sentence then adds only a small amount of marginal complexity to the program—i.e. adding one more sentence (and corresponding predictions) only adds a short string to the program.
I’m not sure we (humanity) know enough to claim only a short string needs to be added. I think GPT-3 hints at a counter-example, b/c the GPT models have been growing geometrically in size from one version to the next.
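For reference, the publicly reported parameter counts are roughly GPT-1 ~117M, GPT-2 ~1.5B, GPT-3 ~175B. A quick calculation shows the per-generation growth is multiplicative, not additive:

```python
# Publicly reported (approximate) parameter counts for the GPT series.
params = {"GPT-1": 117e6, "GPT-2": 1.5e9, "GPT-3": 175e9}

names = list(params)
for prev, curr in zip(names, names[1:]):
    ratio = params[curr] / params[prev]
    print(f"{prev} -> {curr}: ~{ratio:.0f}x more parameters")

# GPT-1 -> GPT-2: ~13x; GPT-2 -> GPT-3: ~117x.
# Each generation multiplies the size rather than adding a short string.
```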
Moreover, I don’t think we have any programs or ideas for programs that are anywhere near sophisticated enough to answer meaningful Qs—unless they just regurgitate an answer. So we don’t have a good reason to claim to know what we’ll need to add to extend your solution to handle more and more cases (especially increasingly technical/sophisticated cases).
Intuitively I think there is (physically) a way to do something like what you describe efficiently, because humans are an example: we have no known limit on our ability to understand new ideas. However, it’s not okay to use a human as the hypothetical SI program, b/c a human does other stuff we don’t know how to do with SI programs (like taking into account itself, other actors, and the universe broadly).
If the hypothetical program does stuff we don’t understand and we also don’t understand its data encoding methods, then I don’t think we can make claims about how much data we’d need to add.
I think it’s both reasonable and intuitive that there would be no upper limit on the amount of data we’d need to add to such a program as we input increasingly sophisticated questions, and that this holds for both people and the hypothetical programs you mention.
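To make the shape of the disagreement concrete: your proposal assumes a roughly constant marginal cost per new question, whereas my intuition is that the marginal cost itself grows with the sophistication of the question. A toy contrast (all numbers invented, only the growth shapes matter):

```python
# Toy contrast between the two views. Numbers are invented; this only
# illustrates the shape of the disagreement, not any measured quantity.

def cumulative_constant(n: int, marginal: int = 200) -> int:
    """Quoted view: each new question adds a fixed short string."""
    return n * marginal

def cumulative_growing(n: int, base: int = 200) -> int:
    """My intuition: the k-th question costs more to encode than the last."""
    return sum(base * k for k in range(1, n + 1))

for n in (10, 100, 1000):
    print(f"n={n}: constant view {cumulative_constant(n):,} bits, "
          f"growing view {cumulative_growing(n):,} bits")

# Under the second view both the total and the marginal cost grow without
# bound as the questions get more sophisticated.
```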