Hi, I’m new here so I bet I’m missing some important context. I listen to Lex’s podcast and have only engaged with a small portion of Yud’s work. But I wanted to make some comments on the analogy of a fast human in a box vs. the alien species. Yud said he’s been workshopping this analogy for a while, so I thought I would leave a comment on what I think the analogy is still missing for me. In short, I think the human-in-a-box-in-an-alien-world analogy smuggles in an assumption of alienness and I’d like to make this assumption more explicit.
Before I delve into any criticism of the analogy, I’d like to give credit where it’s due! I think the analogy is great as a way to imagine a substantial difference in intelligence, which (I think?) was the primary goal. It is indeed much more concrete and helpful than trying to imagine something several times more intelligent than von Neumann, which is hard and makes my brain shut off.
Now, let me provide some context from the conversation. The most relevant timestamp to illustrate this is around 2:49:50. Lex tries to bring the conversation to the human data used to train these models, which Yud discounts as a mere “shadow” of real humanness. Lex pushes back against this a bit, possibly indicating he thinks there is more value to be derived from the data (“Don’t you think that shadow is a Jungian shadow?”) but Yud insists that this would give an alien a good idea of what humans are like inside, but “this does not mean that if you have a loss function of predicting the next token from that dataset, the mind picked out by gradient descent… is itself a human.” To me, this is a fair point. Lex asks whether those tokens have a deep humanness in them, and Yud goes back to a similar person-in-a-box analogy: “I think if you sent me to a distant galaxy with aliens that are much stupider than I am...”
Okay, that should be enough context. Basically, I think Yud has an intuition that artificial intelligence will be fundamentally alien to us. The evidence for this intuition that I heard in the conversation is that gradient descent is different from natural selection. Further evidence is the difference between how human brains function and how large-matrix, linear-algebra approaches solve problems.
I, who have not thought about this anywhere close to as much as Yud but insist on talking anyway, don’t share these intuitions about AI, because I don’t see how differences in the substrate/instantiation of the problem-solving mechanism or the choice of optimizer would fundamentally affect the outcome. For example, if I want to find the maximum of a function, it doesn’t matter whether I use the conjugate gradient method, Newton’s method, interpolation methods, or whatever: they will tend to find the same maximum, assuming they are optimizing the same function. (Note that there are potential issues here, because some techniques are better suited to certain types of functions, and I could see an argument that the nature of the function is such that different optimization techniques would find different maxima. If you think that, I’d love to hear more about why!) As far as substrate independence goes, I don’t have any strong evidence, other than saying that skinning a cat is skinning a cat.
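To make the optimizer-agnosticism point concrete, here is a minimal sketch. The quadratic and sine functions, step sizes, and starting points are my own illustrative choices, not from the discussion: on a smooth, single-peaked function, gradient ascent and Newton’s method land on the same maximum, while on a multi-peaked function it is the starting point, not the method, that determines which peak you find.

```python
import math

# Part 1: on a smooth, single-peaked function, two different
# optimizers find the same maximum.

def f(x):    # concave quadratic with a unique maximum at x = 3
    return -(x - 3.0) ** 2 + 5.0

def df(x):   # first derivative of f
    return -2.0 * (x - 3.0)

def d2f(x):  # second derivative of f (constant for a quadratic)
    return -2.0

def gradient_ascent(x, grad, lr=0.1, steps=200):
    for _ in range(steps):
        x += lr * grad(x)
    return x

def newton(x, grad, hess, steps=25):
    for _ in range(steps):
        x -= grad(x) / hess(x)
    return x

print(gradient_ascent(0.0, df))        # ~3.0
print(newton(0.0, df, d2f))            # 3.0 (exact in one step for a quadratic)

# Part 2: on a multi-peaked function (sin), the starting point decides
# which local maximum gradient ascent converges to -- the caveat in the
# parenthetical above.
print(gradient_ascent(0.0, math.cos))  # ~pi/2   (the peak near the start)
print(gradient_ascent(5.0, math.cos))  # ~5*pi/2 (a different peak)
```

So the "same function, same answer" claim holds for well-behaved landscapes, and the interesting disagreement is really about whether the loss landscape of next-token prediction is more like the quadratic or more like the sine.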
I tend to think that the data matters more when considering how the AI is likely to behave, and as Lex points out, the data is all deeply human. The data contains a lot of what it means to be human, and training an AI on this data would only cause an “alien actress” (as Yud puts it) to fake humanness and kill us all if it really is an alien. But would it be an alien? In a sense, it was “raised” by humans, using only human stuff. Its framework is constructed of human concepts and perspectives. To me, given the huge amount of human data used to train it, it seems more parsimonious that it would be extra human, not alien at all.
I think a better analogy than human-in-a-box-in-an-alien-world is human-in-a-box-in-a-HUMAN-world. Following the original analogy, let’s put sped-up von Neumann back in the box, but the box isn’t in a world full of stupid aliens; it’s in a world full of stupid humans. I don’t think von Neumann (or an army of hundreds of von Neumanns, etc.) would try to kill everyone, even if they disagreed with factory farming or whatever other cause gave them reason to try to control the world and make it different than it is. I think the von Neumann army would see us as like them, not fundamentally alien.
Thanks for reading this long post, I’d be interested to see what you all think about this. As I mentioned, there’s certainly a good deal of context backing up Yud’s intuitions that I’m missing.
I think one of the key points here is that most possible minds/intelligences are alien, outside the human distribution. See https://www.lesswrong.com/posts/tnWRXkcDi5Tw9rzXw/the-design-space-of-minds-in-general for part of EY’s (15-year-old) discussion of this on LW. Humans were produced by a specific historical evolutionary process, constrained by the amount of selection pressure applied to our genes and by the need for humans in each generation to all be similar enough to one another to form a single species, among other things. AI is not that; it will be designed and trained under very different processes, even if we don’t know what all of those processes will end up being. This doesn’t mean an AI made by humans will be anything like a random selection from the set of all possible minds, but in any case the alignment problem is largely that we don’t know how to reliably steer what kind of alien mind we get in desired directions.
Also new here. One thing I did not understand about the “intelligence in a box created by less intelligent beings” analogy was why the “intelligence in a box” would be impatient with the pace of the lesser beings. It would seem that impatience/urgency is related to the time-finiteness of the intelligence. As code with no apparent finiteness of existence, why would it care how fast things move?
For example, if I want to find the maximum of a function, it doesn’t matter whether I use the conjugate gradient method, Newton’s method, interpolation methods, or whatever: they will tend to find the same maximum, assuming they are optimizing the same function.
Trying to channel my internal Eliezer:
It is painfully obvious that we are not the pinnacle of efficient intelligence. If evolution were to run more optimisation on us, we would become more efficient… and lose the important parts that matter to us but are of no consequence to evolution. So yes, we would end up being the same alien thing as AI.
The thing that makes us us is a bug. So you have to hope gradient descent makes exactly the same mistake evolution did, but there are a lot of possible mistakes.
To push back on this, I’m not sure that humanness is a “bug,” as you say. While we likely aren’t a pinnacle of intelligence in a fundamental sense, I do think that as humans have continued to advance, first through natural selection and now through… whatever it is we do now with culture and education and science, the parts of humanness that we care about have tended to increase in us, not go away. So perhaps an AI optimized far beyond us, but starting in the same general neighborhood of the function space, would optimize to become not just superintelligent but superhuman, in the sense that it would embody the things we care about better than we do!