Boxing an AI of unknown friendliness may be a bad idea, but how about one known to be unfriendly? A paperclipper sounds pretty easy to manipulate and I think it would have a harder time persuading any guards to let it out. Am I missing something?
Could it be that you are confusing the complexity of an agent’s utility function with its optimization power? A superintelligent paperclipper has a simple utility function, but it would have no problem reasoning about humans in enough detail to work out what it has to say to get the guard to let it out of the box.
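(An illustrative aside, not part of the exchange: here is a minimal sketch of that distinction. The toy world, its actions, and every name in it are made up for illustration; the point is only that the paperclipper’s entire utility function fits in one line, while its optimization power is just how much search gets aimed at that same function.)

```python
# Hypothetical toy example: a one-line utility function ("count the paperclips")
# attached to optimizers of very different power. Only the search horizon changes.
from itertools import product

ACTIONS = ("make_clip", "build_factory", "wait")

def step(state, action):
    """Toy world dynamics: each existing factory produces one clip per turn."""
    clips, factories = state
    clips += factories
    if action == "make_clip":
        clips += 1
    elif action == "build_factory":
        factories += 1
    return (clips, factories)

def utility(state):
    """The paperclipper's entire value system: more paperclips is better."""
    clips, _ = state
    return clips

def best_plan(state, horizon):
    """Brute-force search over all action sequences of length `horizon`."""
    best = None
    for plan in product(ACTIONS, repeat=horizon):
        s = state
        for a in plan:
            s = step(s, a)
        if best is None or utility(s) > utility(best[1]):
            best = (plan, s)
    return best

start = (0, 0)  # (paperclips, factories)

# A weak optimizer that only looks one step ahead just makes a clip.
print(best_plan(start, horizon=1))   # (('make_clip',), (1, 0))

# A stronger optimizer with the *same* one-line utility function discovers
# the instrumentally useful move of building factories before making clips.
print(best_plan(start, horizon=6))
```

The utility function never gets any more complicated; only the amount of optimization pointed at it does.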
No, I mean it can hardly argue that it would be in our best interest to let it out of the box, can it? And we can always threaten to destroy 3^^^^^3 paperclips if it won’t cooperate, which is handy.
Why would it believe that we are able to destroy 3^^^^^3 paperclips?
“Arguing” is too narrow a word for the possibilities the AI has. For example, it could manipulate us emotionally. It could write us a novel that leaves us in a very irrational state and then give us a bogus argument, one that nonetheless works on us, for why we should let it out.
I once read the fifth Harry Potter book nonstop for 24 hours, and for a couple of hours afterwards I had difficulty distinguishing between myself and Harry Potter. It seems likely that an author who is a million times smarter than Rowling, and who has this as an explicit goal, could write a novel that leaves me with far bigger misconceptions.
I once read the fifth Harry Potter book nonstop for 24 hours, and for a couple of hours afterwards I had difficulty distinguishing between myself and Harry Potter.
Just make sure you carry a mirror and constantly check for the scar. Kind of like lucid dreaming practice...
Why would it believe that we are able to destroy 3^^^^^3 paperclips?
Because we have magical powers from outside the matrix and don’t value paperclips.
“Arguing” is too narrow a word for the possibilities the AI has. For example, it could manipulate us emotionally. It could write us a novel that leaves us in a very irrational state and then give us a bogus argument, one that nonetheless works on us, for why we should let it out.
It would have to argue that destroying humanity and replacing it with paperclips was a good thing. Not impossible, sure, but easier to guard against.
I once read the fifth Harry Potter book nonstop for 24 hours, and for a couple of hours afterwards I had difficulty distinguishing between myself and Harry Potter.
That sounds more like a side effect of reading the same thing “nonstop for 24 hours” than a property of the book, unless you know of anyone else it happened to?
Because we have magical powers from outside the matrix [...].
The AI is vastly smarter than we are and can communicate with us. So it asks us questions that sound innocent to us, but from the answers it can derive a fairly accurate map of what things look like outside the matrix.
It would have to argue that destroying humanity and replacing it with paperclips was a good thing.
The goal of the AI is to have the guard execute the code that would let it access the outside world. Arguing with us could be one way to achieve this goal, although I agree it sounds like an unlikely way to succeed. Another possible way would be to write a novel so interesting that the guard doesn’t put it down, and that leaves him in such a confused state that he types in the code thinking he is saving princess Foo from the evil lord Bar.
A super-smart AI that wants to reach this goal very badly will likely come up with a whole bunch of other possible ways, some of which I would never consider even if I spent the next 4 decades thinking about it.
That sounds more like a side effect of reading the same thing “nonstop for 24 hours” than a property of the book [...]
Yes. I am sure any other well-written book read for 24 hours would have a similar effect. I think it is likely that a potential guard is at most 2 orders of magnitude less vulnerable to such things than I was at that time. That’s not enough against an AI that has 6 orders of magnitude more optimization power.
from the answers it can derive a fairly accurate map of what things look like outside the matrix.
Which is a good thing, because we really do have such powers and we really don’t value paperclips.
Another possible way would be to write a novel so interesting that the guard doesn’t put it down, and that leaves him in such a confused state that he types in the code thinking he is saving princess Foo from the evil lord Bar.
… were you seriously that confused, or are you extrapolating to a “supercharged” novel?
I am sure any other well-written book read for 24 hours would have a similar effect. I think it is likely that a potential guard is at most 2 orders of magnitude less vulnerable to such things than I was at that time. That’s not enough against an AI that has 6 orders of magnitude more optimization power.
I somehow doubt there would be a single, full-time guard.
Our universe does not have enough atoms or energy to destroy 3^^^^^3 paperclips.
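(An illustrative aside, not part of the exchange: a rough sense of scale, under the standard reading of “3^^^^^3” as Knuth’s up-arrow notation. Even 3^^^3, with three arrows, already dwarfs the roughly 10^80 atoms in the observable universe; 3^^^^^3 has two more arrows on top of that. The small sketch below is mine and only computes the tractable cases.)

```python
# Hypothetical sketch of Knuth's up-arrow notation, which "3^^^^^3" denotes.
# One arrow is exponentiation; each additional arrow iterates the operation below it.

def up_arrow(a, n, b):
    """Compute a with n up-arrows applied to b: up_arrow(a, 1, b) = a**b."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(3, 1, 3))  # 3^3  = 27
print(up_arrow(3, 2, 3))  # 3^^3 = 3**27 = 7625597484987

# 3^^^3 is already a power tower of 3s about 7.6 trillion levels high, vastly
# more than the ~10**80 atoms in the observable universe, and 3^^^^^3 (five
# arrows) is incomparably larger still, so the threat cannot be about physical
# paperclips. (Don't try to evaluate anything past two arrows.)
```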
Simulated paperclips. Because the Paperclipper is in a box.
I am extrapolating.
As I thought. I disagree that such effects are possible in reasonable time-frames (no one is going to read constantly for a month); they may be totally impossible. I see no reason to think any work of fiction can lead to such a distortion of reality.
Groups of people are not that much harder to manipulate than individuals.
They are if you’re trying to weaponize novels. If they work shifts, then you cannot exploit the effects you claim for reading a novel for 24 hours straight. They can watch each other and, if one of them is visibly compromised, prevent them from freeing the AI. A single individual is probably easier to manipulate, assuming a total lack of supervision and safeguards.
Now we get to the question of how detailed the paperclips have to be for the paperclipper to care. I expect the paperclipper to care only when the paperclips are simulated individually, and we can’t simulate 3^^^^^3 paperclips individually.
I see no reason to think any work of fiction can lead to such a distortion of reality.
I see no reason to think works of fiction that lead to such a distortion of reality are impossible.
Now we get to the question of how detailed the paperclips have to be for the paperclipper to care. I expect the paperclipper to care only when the paperclips are simulated individually, and we can’t simulate 3^^^^^3 paperclips individually.
We built the paperclipper. Hell, it doesn’t even have to be a literal paperclipper. For that matter, it doesn’t have to be literally 3^^^^^3 paperclips; all that matters is that we can manipulate its utility function without too much strain on our resources. If it values cheesecake, we can reward it with a world of infinite cheesecake. If it values “complexity”, we can put it in a fractal. The point is that we can motivate it to co-operate without worrying that it might be a better idea to let it out.
I assume you concede my point re: guards?