The “serious problems” and “conflicts and inconsistencies” were meant to suggest that BRAGI had hit some kind of wall in self-improvement because of its current goal system. It wasn’t released; it escaped, and it’s smart enough to realize it has a serious problem it doesn’t yet know how to solve, and it predicts bad results if it asks its creators for help.
I got the impression that the serious problems were related to goals and friendliness. I wouldn’t have expected such a system to have much trouble making itself run faster or learning how to hack once prompted by its best known source of friendliness advice.
I was thinking of a “Seed AGI” in the process of growing that has hit some kind of goal restriction, or a strong discouragement of further self-improvement, that was intended as a safety feature, e.g. “Don’t make yourself smarter without permission under condition X.”
That does sound tricky. The best option available seems to be “Eliezer, here is $1,000,000. This is the address. Do what you have to do.” But I presume there is a restriction in place about earning money?
A sufficiently clever AI could probably find legal ways to create wealth for someone—and if the AI is supposed to be able to help other people, whatever restriction prevents it from earning its own cash must have a fairly vast loophole.
I agree, although I allow somewhat for an inconvenient possible world.
If the AI is not allowed to do anything which would increase the total monetary wealth of the world … that would create staggering levels of conflicts and inconsistencies with any code that demanded that it help people. If you help someone, you place them in a better position than they were in before, which quite likely means they will go on to produce more wealth in the world than they otherwise would.
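To make that conflict concrete, here is a toy sketch, entirely my own illustration and not anything from the story: a hypothetical action filter that enforces both a “help people” directive and a “never increase total wealth” restriction ends up rejecting nearly every genuinely helpful action.

```python
# Toy illustration (hypothetical rules and numbers): two hard constraints on a
# notional AI's action filter. Helpful actions tend to raise someone's productive
# capacity, so the two rules jointly permit almost nothing.

from dataclasses import dataclass


@dataclass
class Action:
    description: str
    helps_person: bool
    expected_wealth_delta: float  # change in total monetary wealth, in dollars


def violates_wealth_restriction(action: Action) -> bool:
    """Hypothetical rule: never increase the world's total monetary wealth."""
    return action.expected_wealth_delta > 0


def satisfies_help_directive(action: Action) -> bool:
    """Hypothetical rule: only actions that help someone are acceptable."""
    return action.helps_person


def is_permitted(action: Action) -> bool:
    return satisfies_help_directive(action) and not violates_wealth_restriction(action)


candidates = [
    Action("Teach someone a marketable skill", helps_person=True, expected_wealth_delta=5_000.0),
    Action("Cure someone's illness", helps_person=True, expected_wealth_delta=20_000.0),
    Action("Do nothing", helps_person=False, expected_wealth_delta=0.0),
]

for a in candidates:
    print(f"{a.description!r}: permitted={is_permitted(a)}")
# Nearly every genuinely helpful action raises someone's future productivity and
# is rejected -- the "staggering conflicts and inconsistencies" described above.
```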
I still agree. I allow the inconvenient world to stand because the ability to supply cash for a hit wasn’t central to my point and there are plenty of limitations that badger could have in place that make the mentioned $1,000,000 transaction non-trivial.