a directly coded AGI that runs “natively” on the TM who can decide to execute arbitrary algorithms “natively” on the TM.
At the end of the day, it will be running some subroutine for its gain trust/predict accurately phase.
I assume this sort of thing is true for any model of computation, but when you construct a universal Turing machine, so that it can simulate computation step after computation step of another Turing machine, it takes way more than one computation step for each one. If the AGI is using machinery that would allow it to simulate any world-model, it will be way slower than the Turing machine built for that algorithm.
I realize this seems really in-the-weeds and particular, but I think this is a general principle of computation. The more general a system is, the less well it can do any particular task. I think an AGI that chose to pipe viable predictions to the output with some procedure will be slower than the Turing machine which just runs that procedure.
If the AGI is using machinery that would allow it to simulate any world-model, it will be way slower than the Turing machine built for that algorithm.
I don’t buy it. All your programs are already running on UTM M.
Just consider a program that gives the aliens the ability to write arbitrary functions in M and then pass control to them. That program is barely any bigger (all you have to do is insert one use after free in physics :) ), and guarantees the aliens have zero slowdown.
For the literal simplest version of this, your program is M(Alien(), randomness), which is going to run just as fast as M(physics, randomness) for the intended physics, and probably much faster (if the aliens can think of any clever tricks to run faster without compromising much accuracy). The only reason you wouldn’t get this is if Alien is expensive. That probably rules out crazy alien civilizations, but I’m with Wei Dai that it probably doesn’t rule out simpler scientists.
Just consider a program that gives the aliens the ability to write arbitrary functions in M and then pass control to them.
That’s what I was thinking too, but Michael made me realize this isn’t possible, at least for some M. Suppose M is the C programming language, but in C there is no way to say “interpret this string as a C program and run it as fast as a native C program”. Am I missing something at this point?
all you have to do is insert one use after free in physics
That’s what I was thinking too, but Michael made me realize this isn’t possible, at least for some M. Suppose M is the C programming language, but in C there is no way to say “interpret this string as a C program and run it as fast as a native C program”. Am I missing something at this point?
I agree this is only going to be possible for some universal Turing machines. Though if you are using a Turing machine to define a speed prior, this does seem like a desirable property.
I don’t understand this sentence.
If physics is implemented in C, there are many possible bugs that would allow the attacker to execute arbitrary C code with no slowdown.
I agree this is only going to be possible for some universal Turing machines. Though if you are using a Turing machine to define a speed prior, this does seem like a desirable property.
Why is it a desirable property? I’m not seeing why it would be bad to choose a UTM that doesn’t have this property to define the speed prior for BoMAI, if that helps with safety. Please explain more?
I just mean: “universality” in the sense of a UTM isn’t a sufficient property when defining the speed prior, the analogous property of the UTM is something more like: “You can run an arbitrary Turing machine without too much slowdown.” Of course that’s not possible, but it seems like you still want to be as close to that as possible (for the same reasons that you wanted universality at all).
I agree that it would be fine to sacrifice this property if it was helpful for safety.
Each world-model is a Turing machine, whose prior relates to the Kolmogorov complexity (on some universal Turing machine) of the description of Turing machine—all the transition rules, and whatnot. Usually, this would be isomorphic (within a constant), but since we’re considering speed, programs actually aren’t simulated on a UTM.
Ok, I see, so in other words the AGI doesn’t have the ability to write an arbitrary function in the base programming language and call it, it has a fixed code base and has to simulate that function using its existing code. However I think the AGI can still win a race against a straightforward “predict accurately” algorithm, because it can to two things. 1) Include the most important inner loops of the “predict accurately” algorithm as functions in its own code to minimize the relative slowdown (this is not a decision by the AGI but just a matter of which AGI ends up having the highest posterior) and 2) keep finding improvements to its own prediction algorithm so that it can eventually overtake any fixed prediction algorithm in accuracy which hopefully more than “pays for” the remaining slowdown that is incurred.
Let the AGI’s “predict accurately” algorithm be fixed.
What you call a sequence of improvements to the prediction algorithm, let’s just call that the prediction algorithm. Imagine this to have as much or as little overhead as you like compared to what was previously conceptualized as “predict accurately.” I think this reconceptualization eliminates 2) as a concern, and if I’m understanding correctly, 1) is only able to mitigate slowdown, not overpower it.
Also I think 1) doesn’t work—maybe you came to this conclusion as well?
Suppose M is the C programming language, but in C there is no way to say “interpret this string as a C program and run it as fast as a native C program”.
But maybe you’re saying that doesn’t apply because:
(this is not a decision by the AGI but just a matter of which AGI ends up having the highest posterior)
I think this way throws off the contention that this AGI will have a short description length. One can imagine a sliding scale here. Short description, lots of overhead: a simple universe evolves life, aliens decide to run “predict accurately” + “treacherous turn”. Longer description, less overhead: an AGI that runs “predict accurately” + “treacherous turn.” Longer description, less overhead: an AGI with some of the subroutines involved already (conveniently) baked in to its architecture. Once all the subroutines are “baked into its architecture” you just have: the algorithm “predict accurately” + “treacherous turn”. And in this form, that has a longer description than just “predict accurately”.
I’ve made a case that the two endpoints in the trade-off are not problematic. I’ve argued (roughly) that one reduces computational overhead by doing things that dissociate the naturalness of describing “predict accurately” and “treacherous turn” all at once. This goes back to the general principle I proposed above: “The more general a system is, the less well it can do any particular task.” The only thing I feel like I can still do is argue against particular points in the trade-off that you think are likely to cause trouble. Can you point me to an exact inner loop that can be native to an AGI that would cause this to fall outside of this trend? To frame this case, the Turing machine description must specify [AGI + a routine that it can call]--sort of like a brain-computer interface, where the AGI is the brain and the fast routine is the computer.
At the end of the day, it will be running some subroutine for its gain trust/predict accurately phase.
I assume this sort of thing is true for any model of computation, but when you construct a universal Turing machine, so that it can simulate computation step after computation step of another Turing machine, it takes way more than one computation step for each one. If the AGI is using machinery that would allow it to simulate any world-model, it will be way slower than the Turing machine built for that algorithm.
I realize this seems really in-the-weeds and particular, but I think this is a general principle of computation. The more general a system is, the less well it can do any particular task. I think an AGI that chose to pipe viable predictions to the output with some procedure will be slower than the Turing machine which just runs that procedure.
I don’t buy it. All your programs are already running on UTM M.
Just consider a program that gives the aliens the ability to write arbitrary functions in M and then pass control to them. That program is barely any bigger (all you have to do is insert one use after free in physics :) ), and guarantees the aliens have zero slowdown.
For the literal simplest version of this, your program is M(Alien(), randomness), which is going to run just as fast as M(physics, randomness) for the intended physics, and probably much faster (if the aliens can think of any clever tricks to run faster without compromising much accuracy). The only reason you wouldn’t get this is if Alien is expensive. That probably rules out crazy alien civilizations, but I’m with Wei Dai that it probably doesn’t rule out simpler scientists.
That’s what I was thinking too, but Michael made me realize this isn’t possible, at least for some M. Suppose M is the C programming language, but in C there is no way to say “interpret this string as a C program and run it as fast as a native C program”. Am I missing something at this point?
I don’t understand this sentence.
I agree this is only going to be possible for some universal Turing machines. Though if you are using a Turing machine to define a speed prior, this does seem like a desirable property.
If physics is implemented in C, there are many possible bugs that would allow the attacker to execute arbitrary C code with no slowdown.
Why is it a desirable property? I’m not seeing why it would be bad to choose a UTM that doesn’t have this property to define the speed prior for BoMAI, if that helps with safety. Please explain more?
I just mean: “universality” in the sense of a UTM isn’t a sufficient property when defining the speed prior, the analogous property of the UTM is something more like: “You can run an arbitrary Turing machine without too much slowdown.” Of course that’s not possible, but it seems like you still want to be as close to that as possible (for the same reasons that you wanted universality at all).
I agree that it would be fine to sacrifice this property if it was helpful for safety.
Each world-model is a Turing machine, whose prior relates to the Kolmogorov complexity (on some universal Turing machine) of the description of Turing machine—all the transition rules, and whatnot. Usually, this would be isomorphic (within a constant), but since we’re considering speed, programs actually aren’t simulated on a UTM.
Ok, I see, so in other words the AGI doesn’t have the ability to write an arbitrary function in the base programming language and call it, it has a fixed code base and has to simulate that function using its existing code. However I think the AGI can still win a race against a straightforward “predict accurately” algorithm, because it can to two things. 1) Include the most important inner loops of the “predict accurately” algorithm as functions in its own code to minimize the relative slowdown (this is not a decision by the AGI but just a matter of which AGI ends up having the highest posterior) and 2) keep finding improvements to its own prediction algorithm so that it can eventually overtake any fixed prediction algorithm in accuracy which hopefully more than “pays for” the remaining slowdown that is incurred.
Let the AGI’s “predict accurately” algorithm be fixed.
What you call a sequence of improvements to the prediction algorithm, let’s just call that the prediction algorithm. Imagine this to have as much or as little overhead as you like compared to what was previously conceptualized as “predict accurately.” I think this reconceptualization eliminates 2) as a concern, and if I’m understanding correctly, 1) is only able to mitigate slowdown, not overpower it.
Also I think 1) doesn’t work—maybe you came to this conclusion as well?
But maybe you’re saying that doesn’t apply because:
I think this way throws off the contention that this AGI will have a short description length. One can imagine a sliding scale here. Short description, lots of overhead: a simple universe evolves life, aliens decide to run “predict accurately” + “treacherous turn”. Longer description, less overhead: an AGI that runs “predict accurately” + “treacherous turn.” Longer description, less overhead: an AGI with some of the subroutines involved already (conveniently) baked in to its architecture. Once all the subroutines are “baked into its architecture” you just have: the algorithm “predict accurately” + “treacherous turn”. And in this form, that has a longer description than just “predict accurately”.
You only have to bake in the innermost part of one loop in order to get almost all the computational savings.
I’ve made a case that the two endpoints in the trade-off are not problematic. I’ve argued (roughly) that one reduces computational overhead by doing things that dissociate the naturalness of describing “predict accurately” and “treacherous turn” all at once. This goes back to the general principle I proposed above: “The more general a system is, the less well it can do any particular task.” The only thing I feel like I can still do is argue against particular points in the trade-off that you think are likely to cause trouble. Can you point me to an exact inner loop that can be native to an AGI that would cause this to fall outside of this trend? To frame this case, the Turing machine description must specify [AGI + a routine that it can call]--sort of like a brain-computer interface, where the AGI is the brain and the fast routine is the computer.
(I actually have a more basic confusion, started a new thread.)