I agree with all of the premises. This timeline is short even for AGI safety people, but it also seems quite plausible.
I think there are people thinking about aligning true intelligence (that is, agentic, continually learning, and therefore self-teaching and probably self-improving in architecture). Unfortunately, that doesn’t change the logic, because those people tend to have very pessimistic views on our odds of aligning such a system. I put Nate Soares, Eliezer Yudkowsky, and others in that camp.
There is a possible solution: build AGI that is human-like. The better humans among us are safely and stably aligned. Many individuals would be safe stewards of humanity's future, even if they changed and enhanced themselves along the way.
Creating a fully humanlike AGI is an unlikely solution, since the timeline for that would be even longer than the timeline for effective human intelligence enhancement through BCI.
But there is already work on roughly human-like AGI. I put DeepMind’s focus on deep RL agents in this category. And there are proposed solutions that would produce at least short-term, if not long-term, alignment of that type of system. Steve Byrnes has proposed one such solution, and I’ve proposed a similar one.
Even partial success at this type of solution might keep loosely brainlike AGI aligned long enough for other solutions to be brought into play.