The specific conversation is much better than nothing—but I do think it ought to be emphasized that solving all the problems we’re aware of isn’t sufficient for safety. We’re training on the test set.[1] Our confidence levels should reflect that—but I expect overconfidence.
It’s plausible that RSPs could be net positive, but I think that, given successful coordination, [vague and uncertain] beats [significantly more concrete, but overconfident]. My presumption is that without good coordination (a necessary condition being cautious decision-makers), things will go badly.
RSPs seem likely to increase the odds we get some international coordination and regulation. But to get sufficient regulation, we’d need the unknown unknowns issue to be covered at some point. To me this seems simplest to add clearly and explicitly from the beginning. Otherwise I expect regulation to adapt to issues for which we have concrete new evidence, and to fail to adapt beyond that.
Granted, you’re not the median voter/decision-maker, but you’re certainly one of the most influential voices on the issue, if not the most influential. It seems important not to underestimate your capacity to change people’s views before figuring out a compromise to aim for (I’m primarily thinking of government people, who seem more likely to have views that might change radically based on a few conversations). But I’m certainly no expert on this kind of thing.
I do wonder whether it might be helpful not to share all known problems publicly on this basis—I’d have somewhat more confidence in safety measures that succeeded in solving some problems of a type the designers didn’t know about.