I don’t think the problem is well posed. It will do whatever most effectively advances its terminal goal (supposing it to have one). Give it one goal and it will ignore making paperclips until 2025; give it another and it may prepare in advance to get the paperclip factory ready to go full-on in 2025.
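A minimal sketch of that distinction (the horizon, the actions and the payoffs below are invented purely for illustration): a toy planner whose single terminal goal is time-indexed (cups before the switch, paperclips after) spends a pre-2025 step getting the factory ready, while a planner whose terminal goal is only cups never bothers.

```python
from itertools import product

SWITCH_AT = 3   # toy steps 0-2 happen "before New Year"; the goal switches at step 3
HORIZON = 6     # steps 3-5 happen "after New Year"

def utility(plan, time_indexed):
    """Score a pre-switch plan (a tuple of 'cups'/'prep' actions).

    'cups' makes one cup; 'prep' makes nothing but readies the factory,
    doubling paperclip output after the switch. After the switch the agent
    just makes paperclips at whatever rate it ended up with.
    """
    cups, rate = 0, 1
    for action in plan:
        if action == "cups":
            cups += 1
        else:
            rate = 2
    clips = rate * (HORIZON - SWITCH_AT)

    if time_indexed:
        # Formulation B: a single terminal goal that values cups before
        # the switch and paperclips after it.
        return cups + clips
    # Formulation A: the terminal goal is cups, full stop; the later goal
    # swap is an external event the current goal assigns no value to.
    return cups

def best_plan(time_indexed):
    plans = product(["cups", "prep"], repeat=SWITCH_AT)
    return max(plans, key=lambda p: utility(p, time_indexed))

print("Goal A (cups only):    ", best_plan(False))  # ('cups', 'cups', 'cups')
print("Goal B (time-indexed): ", best_plan(True))   # includes a 'prep' step
```

Whether the planner prepares in advance is decided entirely by which of the two goals it was actually given.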
In the thought experiment description it is said that the terminal goal is cups until New Year’s Eve and is then changed to paperclips. And the agent is aware of this change upfront. What do you find problematic with such a setup?
If you can give the AGI any terminal goal you like, irrespective of how smart it is, that’s orthogonality right there.
No. Orthogonality is when the agent follows any given goal, not when you give it one. And as my thought experiment shows, it is not intelligent to blindly follow a given goal.
Goal preservation is mentioned in Instrumental Convergence.
So you choose the 1st answer now?