It took me a good while reading this to figure out whether it was meant as a deconstruction of tabooing words. I would have been less unsure if the post didn't keep replacing terms with ones that are no less charged and no more descriptive of the underlying system, and then drawing conclusions from the resulting terms' aesthetics.
With regard to Yudkowsky's takes, the key thing to keep in mind is that Yudkowsky started down his path by reasoning backward from properties ASI would have, not forward from a particular implementation strategy. The key reason to be concerned that outer optimization doesn't define inner optimization isn't a specific hypothesis about whether some specific strategy with neural networks will produce inner optimizers; it's that ASI will by necessity involve active optimization over things, and we want our alignment techniques to have at least some reason to work in that regime at all.