Turing completeness concerns only the functional behavior of a class of computational systems. I want to look at the internals, what the system is actually doing, and find abstractions there: modularity, search processes, and steering mechanisms, for instance.
So it’s not about finding yet another framework whose expressiveness is equivalent to Turing completeness. It’s about finding a framework to express the actual computation.
In what sense is the functional behavior different from the internals/actual computations? Could you provide a few toy examples?
The internals of a system of course determine its functional behavior. But different systems can have the same functional behavior while differing in what they actually do internally. E.g. different sorting algorithms all end up with a sorted list, but they sort it differently. Likewise, a pathfinding algorithm like Dijkstra's is different from checking every possible path and picking the best one.
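To make that concrete, here is a toy Python sketch (my own illustration, not anything from the systems discussed here): two sorting routines that return the same output but do different internal work, with a comparison counter added just to make the difference visible.

```python
def bubble_sort(xs):
    """O(n^2): repeatedly swaps adjacent out-of-order elements."""
    xs = list(xs)
    comparisons = 0
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            comparisons += 1
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs, comparisons

def merge_sort(xs):
    """O(n log n): recursively splits the list and merges the halves."""
    comparisons = 0

    def merge(left, right):
        nonlocal comparisons
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            comparisons += 1
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    def sort(ys):
        if len(ys) <= 1:
            return list(ys)
        mid = len(ys) // 2
        return merge(sort(ys[:mid]), sort(ys[mid:]))

    return sort(xs), comparisons

data = [5, 3, 8, 1, 9, 2, 7]
sorted_b, nb = bubble_sort(data)
sorted_m, nm = merge_sort(data)
assert sorted_b == sorted_m   # identical functional behavior
print(nb, nm)                 # different internal work (21 vs 14 comparisons here)
```

Judged purely by input/output behavior the two are indistinguishable; the difference only shows up once you look at what each one actually does.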
Looking only at functional behavior strips you of your ability to make predictions. You only know what has already happened. You can’t generalize to new inputs.
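A toy illustration of that point (again my own example): two programs that agree on every input observed so far, yet diverge on a new one, so a record of past functional behavior alone cannot tell them apart.

```python
def f(n):
    return n * n                      # squares its input

def g(n):
    return n * n if n < 4 else 0      # looks identical on the inputs tested so far

observed_inputs = [1, 2, 3]
assert all(f(n) == g(n) for n in observed_inputs)  # same observed behavior

print(f(4), g(4))  # 16 vs 0: only the internals tell you what happens next
```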
This is the actual crux of why we care about the internals. We don't know the functional behavior of a NN except by executing it (there are some interpretability tools, but they aren't sufficient yet). We want to understand what a NN will do before executing it.
Let’s put this in the context of an AGI: We have a giant model which is executed on multiple GPUs. Ideally, we want to know that it won’t kill us without trying to run it. If we had a method to find ‘search processes’ and similar things going on in its brain, then we could see whether it searches for things like ‘how can I disempower humanity?’.
Thanks, that clarifies your aims a lot. Have you given some thought to how your approach would deal with cases of embodied cognition and uses of external memory?