[Question] What is the difference between AI misalignment and bad programming?
Hi there,
I am new to both AI and computer programming in general, and have been teaching myself about AI alignment. I am wondering if there is a difference between AI misalignment and “bad” or uninformed programming.
My understanding of AI misalignment is that it arises more or less from a failure of the programmer to communicate precisely what they want the AI to do. After all, if a computer behaves only according to how it is programmed, then I would think that all AI misalignment problems could be traced back to the code running the AI.
For example, imagine I build a computer program that learns to extract identifying information about website users and share it with other users. However, I forget to implement a safety feature that stops criminals from accessing the information. Now my program is sharing sensitive information with criminals. Does that mean my program is misaligned, or is it just poorly written?
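To make the example concrete, here is a rough Python sketch of what I have in mind. Every name in it (`Requester`, `is_verified`, `PROFILES`, and both functions) is made up for illustration, and it leaves out the "learning" part entirely. It only shows the sharing step, once without the safety check and once with it.

```python
from dataclasses import dataclass


@dataclass
class Requester:
    name: str
    is_verified: bool  # hypothetical flag standing in for "not a criminal"


# Hypothetical store of identifying information the program has collected.
PROFILES = {"alice": {"email": "alice@example.com"}}


def share_profile_unsafe(requester: Requester, username: str) -> dict:
    """The version from my example: nobody checks who is asking."""
    return PROFILES[username]


def share_profile_checked(requester: Requester, username: str) -> dict:
    """The same function with the forgotten safety feature added."""
    if not requester.is_verified:
        raise PermissionError(f"{requester.name} may not view {username}")
    return PROFILES[username]
```

With an unverified requester, `share_profile_unsafe` hands over the stored record, while `share_profile_checked` raises `PermissionError`. In both cases the program does exactly what the code says, which is why I am unsure whether to call the first version misaligned or simply buggy.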