Re Regulatory markets for AI safety: You say that the proposal doesn’t seem likely to work if “alignment is really hard and we only get one shot at it” (i.e. unbounded maximiser with discontinuous takeoff). Do you expect that status-quo government regulation would do any better, or just that any regulation wouldn’t be helpful in such a scenario? My intuition is that even if alignment is really hard, regulation could be helpful e.g. by reducing races to the bottom, and I’d rather have a more informed group (like people from a policy and technical safety team at a top lab) implementing it instead of a less-informed government agency. I’m also not sure what you mean by legible regulation.
I agree that regulation could be helpful by reducing races to the bottom. What I think I was getting at here (though I might be wrong, as it was several months ago) was that it is hard to build regulations that directly attack the technical problem. Consider, for example, the case of car manufacturing. You could have two types of regulations:
Regulations that provide direct evidence of safety: For example, you could require that all car designs be put through a battery of safety tests, e.g. crashing them into a wall and ensuring that the airbags deploy.
Regulations that provide evidence of thinking about safety: For example, you could require that all car designs have at least 5 person-years of safety analysis done by people with a degree in Automotive Safety (which is probably not an actual field but in theory could be one).
IIRC, the regulatory markets paper seemed to place most of its optimism in the first kind of regulation, or at least that’s how I interpreted it. That kind of regulation seems particularly hard in the one-shot alignment case. The second kind of regulation seems much more feasible in all scenarios, and preventing races to the bottom is an example of that kind of regulation.
I’m not sure what I meant by legible regulation. Probably I was just emphasizing that for regulations to be good, they need to be clear enough, and well enough understood by companies, that the companies can actually comply with them. Again, for regulations of the first kind this seems pretty hard to do.