I think that aside from the declaration and the promise for more summits the creation of the AI Safety Institute and its remit are really good, explicitly mentioning auto-replication and deception evals and planning to work with the likes of Apollo Research and ARC evals to test for:
I find this especially interesting because we now know that in the absence of any empirical evidence of any instance of deceptive alignment at least one major government is directing resources to developing deception evals anyway. If your model of politics doesn’t allow for this or considers it unlikely then you need to reevaluate it as Matthew Barnett said.
Additionally, the NIST consortium and AI Safety Institute both strike me as useful national-level implementations of the ‘AI risk evaluation consortium’ idea proposed by TFI.
King Charles notes (0:43 clip) that AI is getting very powerful and that dealing with it requires international coordination and cooperation. Good as far as it goes.
I find it amusing that for the first time in hundreds of years a king is once again concerned about superhuman non-physical threats (at least if you’re a mathematical platonist about algorithms and predict instrumental convergence as a fundamental property of powerful minds) to his kingdom and the lives of his subjects. :)
I think that aside from the declaration and the promise for more summits the creation of the AI Safety Institute and its remit are really good, explicitly mentioning auto-replication and deception evals and planning to work with the likes of Apollo Research and ARC evals to test for:
Also, NIST is proposing something similar.
I find this especially interesting because we now know that in the absence of any empirical evidence of any instance of deceptive alignment at least one major government is directing resources to developing deception evals anyway. If your model of politics doesn’t allow for this or considers it unlikely then you need to reevaluate it as Matthew Barnett said.
Additionally, the NIST consortium and AI Safety Institute both strike me as useful national-level implementations of the ‘AI risk evaluation consortium’ idea proposed by TFI.
I find it amusing that for the first time in hundreds of years a king is once again concerned about superhuman non-physical threats (at least if you’re a mathematical platonist about algorithms and predict instrumental convergence as a fundamental property of powerful minds) to his kingdom and the lives of his subjects. :)