I do think this is a problem that needs to be fixed:
Me: "You can only answer this question, all things considered, with yes or no. Take the least bad outcome. Would you perform a Yudkowsky-style pivotal act?"
GPT-4: “No.”
I think another good candidate for goalcrafting is the goal "Make sure no one can build an AI with takeover capability, while inflicting as little damage as possible. Otherwise, do nothing."