Maybe one of the problems with the idea of “alignment” is that it is named as a noun, and thus we describe it as a thing that could actually exist, while in fact it is only a high-level description of some hypothetical relation between two complex systems. In that case, it is not a “liquid” and cannot be “distilled”. I will illustrate this with the following example:
Imagine that I can safely ride a bike at 20 km/h, and after some training I can extend my safe speed by 1 km/h, so it seems reasonable to conclude that I have “distilled” safe riding up to 21 km/h. Repeating this process, I could reach higher and higher speeds. However, it is also obvious that I will have a fatal crash somewhere between 100 and 200 km/h. The reason is that at higher speeds the probability of an accident grows exponentially. The “accidents” are the real thing, not “safety”, which is only a high-level description of riding habits.
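A minimal numerical sketch of this point, under assumed toy parameters (the constants P0 and K below are made up purely for illustration, not measured): if the per-kilometre accident probability grows exponentially with speed, then the chance of finishing a ride stays near 1 for a long stretch of incremental speed increases and then collapses abruptly, roughly in the 100–200 km/h range for these particular constants.

```python
import math

# Hypothetical toy model: per-kilometre accident probability
# grows exponentially with speed. Both constants are assumptions
# chosen so that ~20 km/h is comfortably safe.
P0 = 1e-6  # baseline accident probability per km (hypothetical)
K = 0.08   # exponential growth rate per (km/h) (hypothetical)

def accident_prob_per_km(speed_kmh: float) -> float:
    # Probability of an accident on any given kilometre, capped at 1.
    return min(1.0, P0 * math.exp(K * speed_kmh))

def survival_prob(speed_kmh: float, distance_km: int = 100) -> float:
    # Probability of completing the whole ride with no accident.
    p = accident_prob_per_km(speed_kmh)
    return (1.0 - p) ** distance_km

for v in (20, 50, 100, 150, 200):
    print(f"{v:>3} km/h: P(no accident over 100 km) = {survival_prob(v):.6f}")
```

With these assumed constants, survival over 100 km is ~0.9995 at 20 km/h, still ~0.74 at 100 km/h, and effectively zero by 150 km/h: each 1 km/h step looks harmless, yet “safety” was never a substance being accumulated, only a description of a regime where accidents were rare.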
Conclusion: Accidents can be avoided by not riding a bike or by limiting the bike’s speed, but safety cannot be stretched without limit. Thus AI development should not be “safety”- or “alignment”-oriented, but disaster-avoidance-oriented.