I can’t say I understand exactly what you’re looking for here, but generally speaking there’s not going to be one true underlying framework for computation. That’s the point of Turing completeness: there are many different equivalent ways to express computation. This is the norm in math as well, e.g. there are many different equivalent ways to define e. The same goes for mathematical foundations: the foundation you learn in school (for me it was ZFC, a set-theory foundation) is not necessarily the one you use for computer-checked formal proofs (e.g. Coq uses a type-theory foundation).
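Here’s a minimal sketch of the e point (the function names are just for illustration): the limit definition and the series definition prescribe different computations that converge to the same number.

```python
import math

def e_from_limit(n: int) -> float:
    # e = lim_{n -> inf} (1 + 1/n)^n
    return (1 + 1 / n) ** n

def e_from_series(terms: int) -> float:
    # e = sum_{k=0}^{inf} 1/k!
    return sum(1 / math.factorial(k) for k in range(terms))

print(e_from_limit(10**7))   # ~2.71828...
print(e_from_series(20))     # ~2.71828...
print(math.e)                # reference value
```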
Turing completeness concerns only the functional behavior of a class of computational systems. I want to look at the internals, at what the system is actually doing, and find abstractions in there: modularity, search processes, and steering mechanisms, for instance.
So it’s not about finding yet another framework whose expressiveness is equivalent to Turing completeness. It’s about finding a framework to express the actual computation.
In what sense is the functional behavior different from the internals/actual computations? Could you provide a few toy examples?
The internals of a system of course determine its functional behavior. But different systems can have the same functional behavior while differing in what they actually do internally. E.g. different sorting algorithms all end up with a sorted list, but they arrive at it differently. Likewise, a pathfinding algorithm like Dijkstra’s is different from checking every possible path and picking the best one.
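A toy sketch of the sorting example (the comparison counters are only there to make the internal difference visible): both functions return the identical sorted list, but they do very different amounts and kinds of internal work.

```python
def bubble_sort(xs):
    # Quadratic algorithm: repeatedly swap adjacent out-of-order elements.
    xs = list(xs)
    comparisons = 0
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            comparisons += 1
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs, comparisons

def merge_sort(xs):
    # Divide-and-conquer algorithm: recursively sort halves, then merge.
    comparisons = 0

    def merge(a, b):
        nonlocal comparisons
        out, i, j = [], 0, 0
        while i < len(a) and j < len(b):
            comparisons += 1
            if a[i] <= b[j]:
                out.append(a[i]); i += 1
            else:
                out.append(b[j]); j += 1
        return out + a[i:] + b[j:]

    def sort(a):
        if len(a) <= 1:
            return list(a)
        mid = len(a) // 2
        return merge(sort(a[:mid]), sort(a[mid:]))

    return sort(xs), comparisons

data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]
print(bubble_sort(data))  # same sorted output, many more comparisons
print(merge_sort(data))   # same sorted output, fewer comparisons
```

Looking only at the return value (the functional behavior), the two are indistinguishable; the difference lives entirely in the internals.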
Looking only at functional behavior strips you of your ability to make predictions. You only know what has already happened. You can’t generalize to new inputs.
This is the actual crux of why we care about the internals. We don’t know the functional behavior of a NN except by executing it (there are some interpretability tools, but they aren’t sufficient yet). We want to understand what a NN will do before executing it.
Let’s put this in the context of an AGI: we have a giant model which is executed on multiple GPUs. Ideally, we want to know that it won’t kill us without having to run it. If we had a method to find ‘search processes’ and similar things going on in its brain, then we could check whether it searches for things like ‘how can I disempower humanity?’.
Thanks, that clarifies your aims a lot. Have you given some thought to how your approach would deal with cases of embodied cognition and uses of external memory?