The big issue for AI safety is likely to be misuse, not alignment. I expect these AGIs to exhibit much less instrumental convergence than humans out of the box, because they are trained on much denser data and rewards than humans are, and I think this lets corrigibility/DWIMAC approaches to alignment mostly just work.
Misuse, however, will become a harder problem to solve. In the short term, I expect the solution to be never releasing unrestricted AI to the general public, and allowing unrestricted AIs only for internal use like AI research, unless they have robust resistance to fine-tuning attacks. In the longer term, I think the solution will have to involve building more misuse-resistant AIs.
Also, in the world you sketched, with my additions, the political values of whoever controls the AIs become very important, for better or worse.