I enthusiastically support doing everything we can to reduce vulnerabilities. As I indicated, a foundation model approach to representation learning would greatly improve the ability of systems to detect novelty. Perhaps we should rename the “provable safety” area as “provable safety modulo assumptions” area and be very explicit about our assumptions. We can then measure progress by the extent to which we can shrink those assumptions.
rename the “provable safety” area as “provable safety modulo assumptions” area and be very explicit about our assumptions.
Very much agree. I gave some feedback along those lines as the term was coined, and am sad it didn’t catch on. But of course “provable safety modulo assumptions” isn’t very short and catchy...
I do like the word “guarantee” as a substitute. We can talk of formal guarantees, but also of a store guaranteeing that an item you buy will meet a certain standard. So its connotations are nicely in the direction of proof but without, as it were, “proving too much” :)
That seems fair!
I really like that idea, and the clarity it provides, and have renamed the post to reflect it! (Sorry this was so slow; I’m travelling.)