See Limitations on Formal Verification for AI Safety over on LessWrong. I have a lot of agreements, and my disagreements are more a matter of what deserves emphasis than the fundamentals. Overall, I think the Tegmark/Omohundro paper failed to convey a swisscheesey worldview, and sounded too much like “why not just capture alignment properties in ‘specs’ and prove the software ‘correct’?” (i.e. the vibe I was responding to in my very pithy post). However, the main reason I’m not treating Dickson’s post as grounds to pivot my whole worldview and resulting research is captured in one of Steve’s comments:
I’m focused on making sure our infrastructure is safe against AI attacks.
Like, a very strong version I almost endorse is “GSAI isn’t about AI at all, it’s about systems coded by extremely powerful developers (which happen to be AIs)”, and about ensuring that safety, security, and reliability capabilities scale at similar speeds to other kinds of capabilities.
It looks like one can satisfy Dickson just by assuring him that GSAI is a part of a swiss cheese stack, and that no one is messianically promoting One Weird Trick To Solve Alignment. Of course, I do hope that no one is messianically promoting One Weird Trick…
Note in August 2024 GSAI newsletter