Today, Anthropic, Google, Microsoft and OpenAI are announcing the formation of the Frontier Model Forum, a new industry body focused on ensuring safe and responsible development of frontier AI models. The Frontier Model Forum will draw on the technical and operational expertise of its member companies to benefit the entire AI ecosystem, such as through advancing technical evaluations and benchmarks, and developing a public library of solutions to support industry best practices and standards.
The core objectives for the Forum are:
Advancing AI safety research to promote responsible development of frontier models, minimize risks, and enable independent, standardized evaluations of capabilities and safety.
Identifying best practices for the responsible development and deployment of frontier models, helping the public understand the nature, capabilities, limitations, and impact of the technology.
Collaborating with policymakers, academics, civil society and companies to share knowledge about trust and safety risks.
Supporting efforts to develop applications that can help meet society’s greatest challenges, such as climate change mitigation and adaptation, early cancer detection and prevention, and combating cyber threats.
This seemed overall very good at first glance, and then much better once I realized that Meta is not on the list. There's nothing here that I'd call substantial capabilities acceleration (i.e. attempts to collaborate on building larger and larger foundation models, though some of this could be construed as making foundation models more useful for specific tasks). Sharing safety-capabilities research like better oversight or CAI techniques is plausibly strongly net positive even if the techniques don't scale indefinitely. By the same logic, while this by itself is nowhere near sufficient to get us AI existential safety if alignment is very hard (and could increase complacency), it's still a big step in the right direction.
The announcement also lists the Forum's initial research priorities: adversarial robustness, mechanistic interpretability, scalable oversight, independent research access, emergent behaviors and anomaly detection, with a strong early focus on developing and sharing a public library of technical evaluations and benchmarks for frontier AI models.
The mention of combating cyber threats is also a step towards explicit pTAI.
BUT, crucially, because Meta is frozen out we can tell that this partnership isn't toothless: it represents a commitment not to do the most risky and antisocial things Meta presumably doesn't want to give up, and being the only major US AI company not to join will be horrible PR for Meta as well.
https://blog.google/outreach-initiatives/public-policy/google-microsoft-openai-anthropic-frontier-model-forum/