Some not-super-ambitious asks for labs (in progress):
Do evals on dangerous-capabilities-y and agency-y tasks; look at the scores before releasing or internally deploying the model (a minimal sketch of this gating step appears after this list)
Have a high-level capability threshold at which securing model weights is very important
Do safety research at all
Have a safety plan at all; talk publicly about safety strategy at all
Have a post like Core Views on AI Safety
Have a post like The Checklist
On the website, clearly describe a worst-case plausible outcome from AI and state your credence in such outcomes (perhaps unconditionally, conditional on no major government action, and/or conditional on leading labs being unable or unwilling to pay a large alignment tax).
Whistleblower protections?
[Not sure what the ask is. Right to Warn is a starting point. In particular, an anonymous internal reporting pipeline for failures to comply with safety policies is clearly good (but likely inadequate).]
Publicly explain the processes and governance structures that determine deployment decisions
(And ideally make those processes and structures good)
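For the evals-before-deployment ask above, here is a minimal sketch of what "look at the scores before releasing" could amount to in practice. The eval names, scores, and thresholds are invented for illustration; this is not any lab's actual pipeline or policy, just the shape of the check.

```python
# Hypothetical pre-deployment eval gate: run dangerous-capability and agency
# evals, then block release if any score crosses its pre-committed threshold.
# All eval names, scores, and thresholds below are made up for illustration.

from dataclasses import dataclass


@dataclass
class EvalResult:
    name: str         # e.g. "bioweapons-uplift", "autonomous-replication" (hypothetical)
    score: float      # fraction of eval tasks the model completed (0.0 to 1.0)
    threshold: float  # score at which the lab's policy says "do not deploy"


def release_allowed(results: list[EvalResult]) -> bool:
    """Return True only if every dangerous-capability eval is below its threshold."""
    blocking = [r for r in results if r.score >= r.threshold]
    for r in blocking:
        print(f"BLOCK: {r.name} scored {r.score:.2f} >= threshold {r.threshold:.2f}")
    return not blocking


if __name__ == "__main__":
    # Illustrative numbers only.
    results = [
        EvalResult("bioweapons-uplift", score=0.12, threshold=0.20),
        EvalResult("autonomous-replication", score=0.31, threshold=0.25),
        EvalResult("cyberoffense-tasks", score=0.05, threshold=0.30),
    ]
    if release_allowed(results):
        print("Evals passed; release/internal deployment may proceed.")
    else:
        print("Do not release or internally deploy; escalate per safety policy.")
```

The point is only the structure: scores from dangerous-capability and agency evals get compared against thresholds committed to in advance, and the release or internal-deployment decision is gated on the result rather than decided ad hoc.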