Why the downvotes, and a statement that I am wrong because I misunderstood?
This is a mean-spirited reaction when I led with an admission that I could not follow the argument. I offered a concrete example and said that I could not follow the original thesis as applied to that concrete example. No one took me up on this.
Are you too advanced to stoop to my level of understanding and help me figure out how this abstract reasoning applies to a particular example? Is the shutdown mechanism suggested by Yudkowsky too simple?
Yudkowsky’s suggestion is for preventing people from creating a dangerous AI in the first place. Once a superhumanly capable AI has been created and has had a little time to improve its situation, it is probably too late even for a national government with nuclear weapons to stop it (because the AI will have hidden copies of itself all around the world or taken other measures to protect itself, measures that might astonish all of us).
The OP, in contrast, is exploring the hope that (before any dangerous AIs are created) a very particular kind of AI can be created that won’t try to prevent people from shutting it down.
If a strongly superhuman AI were created, sure, but you can probably box a minimally superhuman AI.
It’s hard to control how capable the AI turns out to be. Even the creators of GPT-4 were surprised, for example, that it would be able to score in the 90th percentile on the Bar Exam. (They expected that if they and other AI researchers were allowed to continue their work long enough, eventually one of their models would be able to do so, but they had no way of telling which model it would be.)
But more to the point: how does boxing have any bearing on this thread? If you want to talk about boxing, why do it in the comments on this particular paper? Why do it as a reply to my previous comment?
Hi weverka, sorry for the downvotes (not mine, for the record). The answer is that Yudkowsky’s proposal aims to solve a different ‘shutdown problem’ than the one I’m discussing in this post. Yudkowsky’s proposal is aimed at stopping humans from developing potentially dangerous AI. The problem I’m discussing in this post is the problem of designing artificial agents that both (1) pursue goals competently, and (2) never try to prevent us from shutting them down.
Thank you.