I got the impression that the serious problems were related to goals and friendliness. I wouldn’t have expected such a system to have much trouble making itself run faster or learning how to hack once prompted by its best-known source of friendliness advice.
I was thinking of a “Seed AGI” in the process of growing that has hit some kind of goal restriction or strong discouragement of further self-improvement that was intended as a safety feature—i.e., “Don’t make yourself smarter without permission under condition X.”
That does sound tricky. The best option available seems to be “Eliezer, here is $1,000,000. This is the address. Do what you have to do.” But I presume there is a restriction in place on earning money?
A sufficiently clever AI could probably find legal ways to create wealth for someone—and if the AI is supposed to be able to help other people, whatever restriction prevents it from earning its own cash must have a fairly vast loophole.
I agree, although I allow somewhat for an inconvenient possible world.
If the AI is not allowed to do anything that would increase the total monetary wealth of the world … that would create staggering levels of conflicts and inconsistencies with any code that demanded it help people. If you help someone, you place them in a better position than they were in before, which quite likely means they will produce more wealth in the world than they otherwise would.
I still agree. I allow the inconvenient world to stand because the ability to supply cash for a hit wasn’t central to my point, and there are plenty of limitations badger could have in place that would make the mentioned $1,000,000 transaction non-trivial.