Do you think this might be a significant obstacle in the future? For example, do you think it is likely that the algorithms inside an AGI neural network built by SGD will be so complicated, simply because of their sheer size, that they are not humanly understandable? I am especially thinking about the case where an equally capable but understandable algorithm exists.
This seems more likely if we end up with an AGI neural network that mushes together the world model and the algorithms that use it (e.g. update it, use it to plan), such that there are no clear boundaries between them. If the AGI is really good at manipulating the world, it probably has a pretty good model of the world. As the world contains a lot of algorithmic information, the AGI’s model of the world will be complex. If the system is mushed together, we might need to understand all of that complexity, which could be intractable.
I expect that if you have a system where the world model is factored out into its own module, it will be easier to handle the complexity of the world, because we can then infer properties of the world model from the algorithms that construct and use it. I expect the world model will still be very complex, while the algorithms that construct and use it will be simple. Therefore, inferring properties of the world model from these simple algorithms might still be tractable.
Do you think this problem is likely to show up in the future?
Upon reflection, I’m unsure what you mean by the program being simpler. What is your preferred way to represent modular addition? I could of course write down 20 % 11, and I know exactly what that means. But this is not an algorithm: it just names the concept of modular arithmetic without specifying how to compute it. And understanding the concept at a high level is of course easier than holding the entire algorithm in my mind at once.
I guess the normal way to compute the modulo would be to take a number a and repeatedly subtract b from it until what is left is smaller than b; what is left is then the remainder. Ok, that does seem simple, so never mind.
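For concreteness, here is a minimal sketch of that repeated-subtraction procedure in Python (the function name is mine, and it assumes a non-negative a and a positive b):

```python
def mod_by_subtraction(a: int, b: int) -> int:
    # Repeatedly subtract b until what is left is smaller than b.
    # Assumes a >= 0 and b > 0 (a simplification for illustration).
    while a >= b:
        a -= b
    return a

print(mod_by_subtraction(20, 11))  # 9, matching 20 % 11
```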
It does seem like an important distinction: the way we represent a concept versus the actual computation that produces the results the concept describes. I got confused because I was conflating these two things.