A lot of my hope for “humans do not go extinct within the next 50 years” looks something like that, yeah (a lot of the rest is in “it turns out that language models are just straightforwardly easy to align, and that it’s just straightforwardly easy to teach them to use powerful tools”). If it turns out that “learn a heuristic that you should avoid irreversible actions that destroy complex and finely-tuned systems” is convergent, that could maybe look like the “human reserve”.
There’s an anthropic argument that if that’s what the future looks like, most humans that ever live would live on a human reserve, and as such we should be surprised that we’re not. But I’m kinda suspicious of anthropic arguments.
Although it might be possible for various cyborg scenarios, where humans and AI co-exist, co-evolve, co-modify, etc., to follow the space expansion paradigm.