I don’t think the argument about hacking relied on the ability to formally verify systems. Formally verified systems could potentially shift the balance of power toward the defender, but even without them, I don’t think the balance is completely skewed toward the attacker.
My point was not about the defender/attacker balance. My point was that even short-term goals can be difficult to specify, which undermines the notion that we can easily empower ourselves with short-term AI.
Of course, we need to pin down what “long term” and “short term” mean here. One way to think about this is the following: we can define various short-term metrics, each evaluable using information available in the short term and potentially correlated with long-term success. We would then say that a strategy is purely long-term if its success cannot be explained by advances on any combination of these metrics.
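To make the definition above concrete, here is a minimal toy sketch of one possible operationalization (my own illustration, not something proposed in the discussion): treat “explained by a combination of these metrics” as the variance in long-term outcomes captured by a least-squares fit on a short-term metric, so that a strategy looks “purely long-term” when the unexplained fraction is high. All names and data are hypothetical.

```python
# Hedged sketch: "purely long-term" as the part of long-term success
# NOT explained by a short-term metric. Toy single-metric version;
# the real notion would quantify over all combinations of metrics.

def fit_linear(metric, outcomes):
    """Closed-form least-squares fit of outcomes on one short-term
    metric; returns (slope, intercept)."""
    n = len(metric)
    mx = sum(metric) / n
    my = sum(outcomes) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(metric, outcomes))
    var = sum((x - mx) ** 2 for x in metric)
    slope = cov / var
    return slope, my - slope * mx

def unexplained_fraction(metric, outcomes):
    """Fraction of variance in long-term outcomes not explained by the
    metric (1 - R^2). Near 0: the metric accounts for the success;
    near 1: the strategy looks 'purely long-term' w.r.t. this metric."""
    slope, intercept = fit_linear(metric, outcomes)
    my = sum(outcomes) / len(outcomes)
    ss_tot = sum((y - my) ** 2 for y in outcomes)
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(metric, outcomes))
    return ss_res / ss_tot

# A metric that tracks long-term success closely (low unexplained fraction)
# versus one that carries essentially no signal about it (high fraction).
tracked = unexplained_fraction([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
unrelated = unexplained_fraction([1, 2, 3, 4], [5.0, 1.0, 4.0, 2.0])
```

The sketch only handles one metric and a linear combination; the actual definition would have to range over all short-term metrics and allow nonlinear combinations, which is part of why line 4's appeal to algorithmic information theory is a more natural fit.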
Sort of. The correct way to make it more rigorous is, IMO, to use tools from algorithmic information theory, as I suggested here.