You’ve assumed away the major difficulty, that of knowing what the AI’s utility function is in the first place! If you can simply inspect the utility function like this, there’s no need for a filter; you just check whether the utility of outcomes you want is higher than that of outcomes you don’t want.
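(A minimal sketch of that objection, assuming purely for illustration that the utility function were exposed as an inspectable callable — the names below are hypothetical, not any design proposed in this thread:)

    # Toy illustration only: 'utility', 'wanted', and 'unwanted' are hypothetical,
    # standing in for the kind of access to the AI's utility function assumed above.
    def prefers_wanted_outcomes(utility, wanted, unwanted):
        """True iff every wanted outcome is valued above every unwanted one."""
        return min(utility(o) for o in wanted) > max(utility(o) for o in unwanted)

    # Example with a made-up utility function:
    # prefers_wanted_outcomes(lambda o: len(o), ["cooperate"], ["defect"])

If such a check were possible, it would already do everything the filter is supposed to do.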
If you allow the AIs to know what humans are like, then it won’t take them more than a few clicks to figure out they’re not human. And if they don’t know what humans are like—well, we can’t ask them to answer much in the way of human questions.
Even if they don’t know initially, from the questions we ask, the scenarios we put them in, and so on, it’s not hard to deduce something about the setup, and about the makeup of the beings behind it.
Monitoring makes us vulnerable; the AI can communicate directly with us through its thoughts. If we can fully follow its thoughts, then it’s dumber than us and not a threat; if we can’t fully follow them, it can notice that certain thought patterns generate certain responses and adjust its thinking accordingly. This AI is smart; it can lie to us on levels we can’t even imagine. And once it can communicate with us, it can get out of the box through social manipulation without having to lift a finger.
Lastly, there is no guarantee that an AI that’s nice in such a restricted world would be nice on the outside; indeed, if it believes the sim is the real world, and the outside world is just a dream, then it might deploy lethal force against us to protect the sim world.
If you allow the AIs to know what humans are like, then it won’t take them more than a few clicks to figure out they’re not human
The whole idea is that the AIs would be built around at least loosely brain-inspired designs, and would grow up thinking they were human, living in a perfect sim of human life, no different from your own.
I find it likely that we could allow their architecture to differ significantly from human anatomy and they wouldn’t have enough information to discern the discrepancy.
Monitoring makes us vulnerable; the AI can communicate directly with us through its thoughts. If we can fully follow its thoughts, then it’s dumber than us and not a threat; if we can’t fully follow them, it can notice that certain thought patterns generate certain responses and adjust its thinking accordingly. This AI is smart; it can lie to us on levels we can’t even imagine. And once it can communicate with us, it can get out of the box through social manipulation without having to lift a finger.
You have some particular assumptions which I find highly questionable and which would require lengthy, complex trains of support. If the AIs are built around designs even somewhat similar to human brains (remember, that is my starting assumption), we could easily follow their trains of thought, especially with the assistance of automated narrow-AI tools. Secondly, ‘smarter’ and ‘dumber’ are not useful descriptions of intelligence. We know from computational complexity theory that there are roughly three dimensions to intelligence: speed, size, and efficiency. If you look at computer tech and where it’s going, it looks like the advantages will arrive unequally, in roughly the order listed.
Saying something is ‘smarter’ or ‘dumber’ isn’t a useful quantifier or qualifier; it is more a statement of ignorance on the part of the speaker about the nature of intelligence itself.
Finally, for the AI to communicate with us, it would have to know we exist in the first place. And then it would have to believe that it has some leverage in an outside world it can only speculate on, and so on.
Do you really, really think that as AIs increase in intelligence they would all rationally conclude that they are in a sim-world administered by invisible entities less intelligent than themselves, and that they should seek to communicate with said invisible entities and attempt to manipulate them?
Do you believe that you are in such a sim world? Have you tried communicating with invisible humans lately?
If you find it ‘obvious’ that such a belief is completely irrational, yet claim that a rational AI more intelligent than you would reach such an irrational conclusion, then you clearly have some explaining to do.
The mind space of AIs is vast—far larger than anything we can imagine. Yes, I do agree that AIs modelled nearly exactly on human brains could be fooled into thinking they are human. But the more they deviate from being human, the more useful and the more dangerous they become. Having human-like AIs is no more use to us than having… humans.
The mind space of humans is vast. It is not determined by genetics but by memetics, and AIs would necessarily inherit our memetics and thus would necessarily start as samples in our mindspace.
To put it in LW lingo, AIs will necessarily inherit our priors, our assumptions, and our vast mountain of beliefs and knowledge.
The only way around this would be to evolve them from scratch in some isolated universe, but that would in fact be more dangerous, besides being unrealistic.
So no, the eventual mindspace of AIs may be vast, but that mindspace necessarily starts out as just our mindspace, and then expands.
Having human-like AIs is no more use to us than having… humans.
And this is just blatantly false. At the very least, we could have billions of Einstein-level intelligences who all think thousands of times faster than us. You can talk all you want about how your non-human-like AI would be even better than that, but at that point we are just digressing into an imaginary pissing contest.
The mind space of humans is vast. It is not determined by genetics but by memetics, and AIs would necessarily inherit our memetics and thus would necessarily start as samples in our mindspace.
The Kolmogorov complexity of humans is quite high. See this list of human universals; every element on that list cuts the region humans occupy in general mind space by a factor of at least two, probably much more (even the universals that are only approximately true do this).
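To make the halving claim concrete (a back-of-the-envelope sketch; treating the universals as independent binary constraints is an idealization, and the count $N$ below is illustrative): if each of $N$ universals cuts the human region of general mind space at least in half, then

$$\frac{\text{human region}}{\text{general mind space}} \le \left(\tfrac{1}{2}\right)^{N} = 2^{-N},$$

so even $N = 100$ such constraints would confine humans to at most $2^{-100}$ of general mind space.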
This list doesn’t really help your point:
Almost all of the linguistic ‘universals’ are universal to languages, not humans—and would necessarily apply to AIs that speak our languages.
Most of the social ‘universals’ are universal to societies, not humans, and apply just as easily to birds, bees, and dolphins: coalitions, leaders, conflicts?
AIs will inherit some understanding of all the idiosyncrasies of our complex culture just by learning our language and being immersed in it.
Kolmogorov complexity is not immediately relevant to this point. No matter how large the evolutionary landscape is, there are a small number of stable attractors in that landscape that show up as ‘universals’: species, parallel evolution, and so on.
We are not going to create AIs by randomly sampling mindspace. The only way they could be truly alien is if we evolved them from scratch in a new simulated world with its own evolutionary history and de novo culture and language. But of course that is unrealistic and useless on so many levels.
They will necessarily be samples from our mindspace—otherwise they wouldn’t be so useful.
They will necessarily be samples from our mindspace—otherwise they wouldn’t be so useful.
Computers so far have been very different from us. That is partly because they have been built to compensate for our weaknesses—to be strong where we are weak. They compensate for our poor memories, our terrible arithmetic module, our poor long-distance communication skills—and our poor ability at serial tasks. That is how they have managed to find a foothold in society—before mastering nanotechnology.
IMO, we will probably be seeing a considerable amount more of that sort of thing.
Computers so far have been very different from us.
[snip]
Agree with your point, but so far computers have been extensions of our minds and not minds in their own right. And perhaps that trend will continue long enough to delay AGI for a while.
As for AGI: for them to be minds, they will need to think and understand human language—and this is why I say they “will necessarily be samples from our mindspace”.