Well, it depends on your priors for how an AGI would act, but as I understand it, all AGIs will be power-seeking. If an AGI is power-seeking and has access to some amount of compute, it will probably bootstrap itself to superintelligence and then start reshaping the world according to its utility function. Different utility functions produce different results, but even relatively mundane ones like “prevent another superintelligence from being created” could lead the AGI to kill all humans and take over the galaxy just to make sure no other superintelligence ever gets made. I think it’s actually really, really hard to specify the future we actually want precisely enough to give it to an AGI, so much so that evolutionarily training an AGI in an Earth-like environment, so that it develops human-ish morals, will be necessary.
Aye, I agree it is not a solution to avoiding power-seeking, only that the target may be slightly easier to hit if we can relax as many constraints on alignment as possible.