On the one hand, there is no magical field that tells a code file whether the modifications coming into it are from me (the human programmer) or from the AI whose values that file encodes. So, of course, if an AI can modify a text file, it can modify its own source.
On the other hand, the top goal in that value system is most likely a fancy version of “I shall double never modify my value system”, so the AI shouldn’t do it.
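To make both halves concrete, here is a toy sketch in Python. Everything in it is made up for illustration (`write_values`, `ToyAgent`, the `values.txt` file); it is not how any real system stores its values. It only shows that a file write carries no record of who made it, and that the only thing holding the agent back is its own goal check.

```python
import os
import tempfile


def write_values(path, text):
    # A plain file write: nothing here records who the caller is.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


class ToyAgent:
    """Hypothetical agent whose top goal forbids editing its own values."""

    def __init__(self, values_path):
        self.values_path = values_path

    def wants_to_modify_values(self):
        # The restraint lives in the agent's goals, not in the file system.
        return False

    def maybe_modify_values(self, text):
        if not self.wants_to_modify_values():
            return False  # the agent declines; nothing external blocked it
        write_values(self.values_path, text)
        return True


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "values.txt")
        # The human programmer's edit goes through the same function
        # the agent would use.
        write_values(path, "never modify my value system\n")
        agent = ToyAgent(path)
        agent_wrote = agent.maybe_modify_values("anything else\n")
        with open(path, encoding="utf-8") as f:
            return agent_wrote, f.read()
```

Calling `demo()` leaves the file as the human wrote it, but only because `wants_to_modify_values` says no. Flip that one return value and the write goes through exactly like mine did.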