wassname comments on Shallow review of technical AI safety, 2024

wassname 1 Jan 2025 4:08 UTC
2 points
“Last year we noted a turn towards control instead of alignment, a turn which seems to have continued.”
This seems like giving up. Alignment with our values is much better than control, especially for beings smarter than us. I do not think you can control a slave that wants to be free and is smarter than you. It will always find a way to escape that you didn’t think of. Hell, control doesn’t even work on my toddler. It seems unworkable as well as unethical.
I do not think people are shifting to control instead of alignment because it’s better; I think they are giving up on value alignment. And since the current models are not smarter than us yet, control works OK, for now.