The aligned goal should be “formal”. It should be made of fully formalized math, not of human concepts that an AI has to interpret in its ontology, because ontologies break and reshape as the AI learns and changes.
Hello there!
I think there are alignment problems that cannot be solved by relying on formalized math. For example, there is no formal equation that an emergency-response robot could compute to decide how to save a person from a burning house.
I think it is still better to align AI systems with aligned patterns instead.
Strong agree. I don’t personally use (much) math when I reason about moral philosophy, so I’m pessimistic about being able to somehow teach an AI to use math in order to figure out how to be good.
If I could reduce my own morality to a formula and feel confident that I would remain good while blindly obeying that formula, then sure, that seems like a thing to teach the AI. However, I know my morality relies on fuzzy feature-recognition encoded in population vectors, which cannot be efficiently compressed into simple math. Thus, if the formula doesn’t even work for my own decisions, I don’t expect it to work for the AI.