In all three cases, the AI you’re asking for is a superintelligent AGI. Each has to navigate a broad array of physically instantiated problems requiring coherent, goal-oriented optimisation. No stateless, unembedded, and temporally incoherent system like GPT-3 is going to be able to create nanotechnology, beat all human computer security experts, or convince everyone of your position.
Values arise to guide the actions that intelligent systems perform. Evolution did not arrange for us to form values because it liked human values. It did so because forming values is an effective strategy for getting more performance out of an agentic system, and SGD can figure this fact out just as easily as evolution did.
If you optimise a system to be coherent and take actions in the real world, it will end up with values oriented around doing so effectively. Nature abhors a vacuum. If you don’t populate your superintelligent AGI with human-compatible values, some other values will arise and consume the free energy you’ve left around.
Interesting! I appreciate the details here; it gives me a better sense of why narrow ASI is probably not something that can exist. Is there a place we could talk about AGI alignment over audio, rather than via text here on LessWrong? I’d like to get a better idea of the field, especially as I move into work like creating an AI Alignment Sandbox.
My Discord is Soareverix#7614 and my email is maarocket@gmail.com. I’d really appreciate the chance to talk with you over audio before I begin working on sharing alignment info and coming up with my own methods for solving the problem.