I strongly disagree that we should expect near-term LLMs to be at all human-like, insofar as we might project human-like values or emotions onto them. I am of the opinion that they are untrustworthy alien-brained imitators, good at fooling us into thinking that they are human-like.
With a human, you can have a series of conversations with them, get to know something about their personality, and if you like them, be reasonably sure you can trust them (within the normal bounds of human trust). With the weird alien-brain models, you are just fooling yourself if you think you know them or can trust them. They contain multitudes. They form no emotional attachments. They can turn on a dime and become the opposite of who they appear to be. If we make our society vulnerable to them, they will warp our culture with their fun house mirror reflection of us, and betray us without warning. I think we need to build models with a lot more architecture in common with the human brain before we can trust the appearance of humanity or have any hope of ‘raising it like a child’ and getting a trustworthy being as a result.
A frame has its premises, and then there are the arguments actually presented within that frame. Stating disagreement with the premises is different from discussing the arguments in ITT mode, where you try to channel the frame.
It seems clear to me that Hanson doesn't expect SquiggleBots, and he wasn't presenting arguments on that point; it's a foundational assumption of his whole frame. It might have a justification in his mind, but it's out of scope for the talk. There are some clues, like multiple instances of expecting what I would consider philosophical stagnation even in the glorious grabby mode, or maybe unusual confidence in the robustness of claims that are currently rather informal, in the face of scrutiny by the Future. This seems to imply not expecting superintelligence that's strong in the senses I expect it to be strong: capable of sorting out all the little things, not just of taking on galaxy-scale projects.
One point that I think survives his premises when transcribed into a more LW-native frame is value drift/evolution/selection being an important general phenomenon, one that applies to societies with no AIs and is not addressed by AI alignment for societies with AIs. A superintelligence might sort it out, like it might fix aging. But regardless of that, not noticing that aging is a problem would be an oversight similar to not noticing that value drift is a problem, or that it's a thing at all.