What do they mean by “aligned”?
OK. Assuming that

- sharp left turns are not an issue,
- scalable oversight is even possible in practice,
- and OAI somehow solves the problems of
  - AIs hacking humans (to influence their intents),
  - deceptive alignment,
  - humans going crazy when given great power,
  - etc.,
  - and all the problems no one has noticed yet,
then there’s the question of “aligned to what?” Whose intent? What would success at this agenda look like?
Maybe: A superintelligence that accurately models its human operator, follows the human’s intent[1] to complete difficult-but-bounded tasks, and is runnable at human speed with a manageable amount of compute, sitting on OAI’s servers?
Who would get to use that superintelligence? For what purpose would they use it? How long before the {NSA, FSB, CCP, …} steal that superintelligence off OAI’s servers? What would they use it for?
Point being: If an organization falls short on any key dimension of operational adequacy, then even if it somehow miraculously solves the alignment/control problem, it might be increasing S-risks while only somewhat decreasing X-risks.
What is OAI’s plan for getting their opsec and common-good-commitment to adequate levels? What’s their plan for handling success at alignment/control?
[1] And does not try to hack the human into having more convenient intents.