Did you accidentally forget to add this post to your research journal sequence?
Here are my quick reactions to many of the points in the post:
optimization algorithms (finitely terminating)
iterative methods (convergent)
That sounds as if they are always finitely terminating or convergent, which they’re not. (I don’t think you meant to say they are.)
Computational optimization can learn but cannot be divided. It can compute all computable functions (Turing machine or a human with pen/paper). However, if you break up the cognitive processing parts, no computation will take place.
I don’t quite understand this. What does the sentence “computational optimization can compute all computable functions” mean? Additionally, in my conception of “computational optimization” (which is admittedly rather vague), learning need not take place.
The structure of deep learning mimics the structure of intelligence as path finding through world states
I find these analogies and your explanations a bit vague. What makes it hard for me to judge what’s behind these analogies:
You write “Intelligence = Mapping current world state to target world state (or target direction)”:
these two options are conceptually quite different and might influence the meaning of the analogy. If intelligence computes only a “target direction”, then this corresponds to a heuristic approach in which locally, the correct direction in action space is chosen. However, if you view intelligence as an actual optimization algorithm, then what’s chosen is not only a direction but a whole path.
Further nitpick: I wouldn’t use the verb “to map” here. I think you mean something more like “to transform”, especially if you mean the optimization viewpoint.
You write “Learning consists of setting the right weights between all the neurons in all the layers. This is analogous to my understanding of human intelligence as path-finding through reality”
Learning is a thing you do once, and then you use the resulting neural network repeatedly. In contrast, if you search for a path, you usually use that path only once.
The output of a neural network can itself be a found path. That makes the analogy even more difficult for me.
Is human imagination and “thinking through different ways past events might have gone” a form of data augmentation? We perturb a memory and then project out how we would have felt and what we would have wanted to do. This seems quite similar to using simulation to generate and improve predictions.
Off-policy reinforcement learning is built on this idea. One famous example is DQN, which uses experience replay. The paper is still worth reading today; some consider it the start of deep RL.
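To make the experience-replay idea a bit more concrete, here is a minimal sketch of a replay buffer. It is only an illustration, not the code from the DQN paper: the class name `ReplayBuffer`, its methods, and the toy transitions are all assumptions made up for this example. The point is just that past experience is stored and later re-sampled for learning, loosely like revisiting memories.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions (hypothetical, illustrative only)."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, which decorrelates
        # consecutive experiences before they are used for an update.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Toy usage: five made-up transitions go into a buffer of capacity three,
# so only the three most recent ones remain available for replay.
buf = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buf.add(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
batch = buf.sample(2)
```

An off-policy learner would draw such batches repeatedly, so a single experience can contribute to many updates.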
Our utility function is encoded in our neural activity.
I think the terms “our utility function” and “encoded” are not well-defined enough to be able to outright say whether this is true or not, but under a reasonable interpretation of the terms, it seems correct to me.
An aligned AGI is one that has learned the function that maps our neurally encoded utility function to observable world states.
I do not know what you mean by “mapping a utility function to world states”. Is the following a correct paraphrasing of what you mean?
“An aligned AGI is one that tries to steer toward world states such that the neurally encoded utility function, if queried, would say ‘these states are rather optimal’ ”
Thus the most reliable signal of the human utility function is either:
Aggregation over a large enough sample that all the noise is cancelled out
Direct biological measures of our utility function
However, who says there are no systematic biases and errors in our behavior that do not cancel out over large samples?
There are indeed biases in our decision-making that mean that the utility function cannot be inferred from our behavior alone, as shown in “Humans can be assigned any values whatsoever”.
I also don’t think it’s feasible to directly measure our utility function. In my own view, our utility function isn’t an observable thing. There might be a utility function that gets revealed by running history far into the future and observing what humans converge on, but I don’t think the end result of what we value can be directly measured in our brains.
Specifically, humans gain utility directly from various stimuli and observations like eating sweet food or looking at puppies.
We cannot gain “utility”. We can only gain “reward”. Utility is a measure of world states, whereas reward is a thing happening in our brains.
Instead, we have many (scarcely known) hyperparameters where the utility we get from our observations comes from the transformation and evaluation of one or many sets of observations. For instance, the satisfaction of a job well-done relies on observing the entire process and then evaluating the end result as good. Similarly, many observations that consist of directly negative stimuli (parameters) are evaluated as positive by some hyperparameter such as the meaningfulness of childbirth or the beautiful release of a funeral.
I don’t quite understand the analogy to hyperparameters here. To me, it seems like childbirth’s meaning is in itself a reward that, by credit assignment, leads to a positive evaluation of the actions that led to it, even though in the experience the reward was mostly negative. It is indeed interesting figuring out what exactly is going on here (and the shard theory of human values might be an interesting frame for that, see also this interesting post looking at how the same external events can trigger different value updates), but I don’t yet see how it connects to hyperparameters.
but it’s much less clear how we’d map the ever-fickle hyperparameters of our utility function that entirely hinge on our evaluations and transformations we ourselves apply to our experiences … it’s a value we compute internally that would require the AGI to simulate us as full-bodied beings to get the exact same result.
What if instead of trying to build an AI that tries to decode our brain’s utility function, we build the process that created our values in the first place and expose the AI to this process?
The counterargument would be that language models lack grounding in reality.
The distinction between vision and language models breaks down with things like vision transformers. But in general, the lack of grounding of pure language models seems to me a problem for reaching AGI with them. That said, I think a language model that interacts with the world through, e.g., an internet connection might already get rid of this grounding problem.
Self-supervised learning is the default form of learning for individual agents embedded in reality.
That seems pretty plausible to me for achieving AGI, but many RL agents do not have an explicit self-supervised component.
Feature engineering seems like a form of pre-processing, and thus not a relevant concept for AGI? We’d expect AGI to learn its own features, which is what kernels in convolutional neural networks do, for instance.
I mostly agree. But note that feature engineering is just a form of an inductive prior, and it’s not possible to get rid of those and let them be “learned”—there is no free lunch.
Overfitting in a neural network is basically memorizing the data set.
Many models that do not overfit also memorize much of the data set.
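A toy illustration of this point, as a sketch under made-up assumptions: a 1-nearest-neighbor classifier stores every training example verbatim, so it memorizes the data set by construction, yet on well-separated data it still predicts held-out points correctly. The data points and the function names `predict_1nn` and `accuracy` are invented for this example.

```python
def predict_1nn(train, x):
    """Return the label of the stored training point closest to x."""
    nearest_x, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label


def accuracy(train, data):
    """Fraction of (x, label) pairs in data that 1-NN on train gets right."""
    correct = sum(predict_1nn(train, x) == y for x, y in data)
    return correct / len(data)


# Made-up 1D data: class 0 clusters near 1, class 1 clusters near 10.
train = [(0.0, 0), (1.0, 0), (2.0, 0), (9.0, 1), (10.0, 1), (11.0, 1)]
held_out = [(0.5, 0), (1.5, 0), (9.5, 1), (10.5, 1)]

# The "model" is the training set itself, so it reproduces it perfectly...
train_acc = accuracy(train, train)
# ...and on this easy toy problem it also gets the unseen points right.
held_out_acc = accuracy(train, held_out)
```

On noisy or overlapping data the same model can of course overfit; the sketch only shows that memorization and generalization are not mutually exclusive.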
To solve problems, researchers may use algorithms that terminate in a finite number of steps, or iterative methods that converge to a solution (on some specified class of problems), or heuristics that may provide approximate solutions to some problems (although their iterates need not converge).
I don’t quite understand this. What does the sentence “computational optimization can compute all computable functions” mean? Additionally, in my conception of “computational optimization” (which is admittedly rather vague), learning need not take place.
I might have overloaded the phrase “computational” here. My intention was to point out what can be encoded by such a system. Maybe “coding” is a better word? E.g., neural coding. These systems can implement Turing machines, so they can potentially have the same properties as Turing machines.
these two options are conceptually quite different and might influence the meaning of the analogy. If intelligence computes only a “target direction”, then this corresponds to a heuristic approach in which locally, the correct direction in action space is chosen. However, if you view intelligence as an actual optimization algorithm, then what’s chosen is not only a direction but a whole path.
I’m wondering if our disagreement is conceptual or semantic. Optimizing a direction instead of an entire path is just a difference in time horizon in my model. But maybe this is a different use of the word “optimize”?
You write “Learning consists of setting the right weights between all the neurons in all the layers. This is analogous to my understanding of human intelligence as path-finding through reality”
Learning is a thing you do once, and then you use the resulting neural network repeatedly. In contrast, if you search for a path, you usually use that path only once.
If I learn the optimal path to work, then I can use that multiple times. I’m not sure I agree with the distinction you are drawing here … Some problems in life only need to be solved exactly once, but that’s the same as anything you learn only being applicable once. I didn’t mean to claim the processes are identical, but that they share an underlying structure. Though indeed, this might be an empty intuitive leap with no useful implementation. Or maybe not a good matching at all.
I do not know what you mean by “mapping a utility function to world states”. Is the following a correct paraphrasing of what you mean?
“An aligned AGI is one that tries to steer toward world states such that the neurally encoded utility function, if queried, would say ‘these states are rather optimal’ ”
Yes, thank you.
I don’t quite understand the analogy to hyperparameters here. To me, it seems like childbirth’s meaning is in itself a reward that, by credit assignment, leads to a positive evaluation of the actions that led to it, even though in the experience the reward was mostly negative. It is indeed interesting figuring out what exactly is going on here (and the shard theory of human values might be an interesting frame for that, see also this interesting post looking at how the same external events can trigger different value updates), but I don’t yet see how it connects to hyperparameters.
A hyperparameter is a parameter across parameters. So say with childbirth, you have a parameter ‘pain’ for physical pain, which is a direct physical signal, and you have a hyperparameter ‘Satisfaction from hard work’ that takes ‘pain’ as input, as well as some evaluative cognitive process, and outputs reward accordingly. Does that make sense?
What if instead of trying to build an AI that tries to decode our brain’s utility function, we build the process that created our values in the first place and expose the AI to this process?
Digging into shard theory is still on my todo list. [bookmarked]
Many models that do not overfit also memorize much of the data set.
Is this on the sweet spot just before overfitting or should I be thinking of something else?
I might have overloaded the phrase “computational” here. My intention was to point out what can be encoded by such a system. Maybe “coding” is a better word? E.g., neural coding. These systems can implement Turing machines, so they can potentially have the same properties as Turing machines.
I see. I think I was confused since, in my mind, there are many Turing machines that simply do not “optimize” anything. They just compute a function.
I’m wondering if our disagreement is conceptual or semantic. Optimizing a direction instead of an entire path is just a difference in time horizon in my model. But maybe this is a different use of the word “optimize”?
I think I wanted to point to a difference in the computational approach of different algorithms that find a path through the universe. If you chain together many locally found heuristics, then you carve out a path through reality over time that may lead to some “desirable outcome”. But the computation would be vastly different from another algorithm that thinks about the end result and then makes a whole plan of how to reach this. It’s basically the difference between deontology and consequentialism. This post is on similar themes.
I’m not at all sure if we disagree about anything here, though.
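The difference between chaining local heuristics and planning a whole path can be sketched on a tiny grid. This is only an illustration under assumptions of my own: the grid, the Manhattan-distance heuristic, and the function names `greedy_walk` and `plan_path` are all made up for the example. The greedy walker only ever picks the locally best direction and gets stuck behind a wall, while breadth-first search reasons about the end state and returns a complete route.

```python
from collections import deque

# 'S' = start, 'G' = goal, '#' = wall, '.' = free cell (made-up toy world).
GRID = ["S.#G",
        "..#.",
        "...."]


def find(ch):
    for r, row in enumerate(GRID):
        for c, cell in enumerate(row):
            if cell == ch:
                return (r, c)
    raise ValueError(f"{ch!r} not in grid")


def neighbors(pos):
    r, c = pos
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] != "#":
            yield (nr, nc)


def distance(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def greedy_walk(start, goal):
    """Chain local decisions: always step to the neighbor closest to the goal."""
    path = [start]
    while path[-1] != goal:
        best = min(neighbors(path[-1]), key=lambda p: distance(p, goal))
        if distance(best, goal) >= distance(path[-1], goal):
            break  # no locally better direction exists: the heuristic is stuck
        path.append(best)
    return path


def plan_path(start, goal):
    """Plan a whole path with breadth-first search before taking any step."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        for nxt in neighbors(current):
            if nxt not in parents:
                parents[nxt] = current
                queue.append(nxt)
    return None


start, goal = find("S"), find("G")
greedy = greedy_walk(start, goal)   # stops at the wall without reaching G
planned = plan_path(start, goal)    # detours through the bottom row to G
```

Both procedures carve out a path through the same world, but the computations differ: the first never represents anything beyond the next step, the second represents the full route before acting.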
If I learn the optimal path to work, then I can use that multiple times. I’m not sure I agree with the distinction you are drawing here … Some problems in life only need to be solved exactly once, but that’s the same as anything you learn only being applicable once.
I would say that if you remember the plan and retrieve it later for repeated use, then you do this by learning and the resulting computation is not planning anymore. Planning is always the thing you do at the moment to find good results now, and learning is the thing you do to be able to use a solution repeatedly.
Part of my opinion also comes from the intuition that planning is the thing that derives its use from the fact that it is applied in complex environments in which learning by heart is often useless. The very reason why planning is useful for intelligent agents is that they cannot simply learn heuristics to navigate the world.
To be fair, it might be that I don’t have the same intuitive connection between planning and learning in my head that you do, so if my comments are beside the point, then feel free to ignore :)
A hyperparameter is a parameter across parameters. So say with childbirth, you have a parameter ‘pain’ for physical pain, which is a direct physical signal, and you have a hyperparameter ‘Satisfaction from hard work’ that takes ‘pain’ as input, as well as some evaluative cognitive process, and outputs reward accordingly. Does that make sense?
Conceptually it does, thank you! I wouldn’t call these parameters and hyperparameters, though. Low-level and high-level features might be better terms.
Again, I think the shard theory of human values might be an inspiration for these thoughts, as well as this post on AGI motivation which talks about how valence gets “painted” on thoughts in the world model of a brain-like AGI.
Is this on the sweet spot just before overfitting or should I be thinking of something else?
I personally don’t have good models for this. Ilya Sutskever mentioned in a podcast that under some models of Bayesian updating, learning by heart is optimal and a component of perfect generalization. Also from personal experience, I think that people who generalize very well also often have lots of knowledge, though this may be confounded by other effects.
I thought I added it but apparently hadn’t pressed submit. Thank you for pointing that out!
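I was going by the Wikipedia definition quoted above (algorithms that terminate in a finite number of steps, iterative methods that converge to a solution, or heuristics that may provide approximate solutions).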
Thank you for your extensive comment! <3