The paranoid security people have an amazingly poor track record at securing stuff from people. I think with paranoid security people it is guaranteed that an AI at the level of a clever human gets out of the box. The AI spends 1 hour online, lol. Where did the 1 hour come from? Any time online and you might as well assume it is out in the wild, entirely uncontrollable.
Unless, of course, it is some ultra-nice, ultra-friendly AI that respects human consent so much that it figures out you don’t want it out, and politely stays in.
As of now, the paranoid security people are overpaid incompetents who serve to ensure your government is hacked first by the enemy rather than by some UFO nut, by tracking down and jailing all the UFO nuts who hack your government and embarrass the officials, just so that the security holes stay open for the enemy. They’d do the same to AI: some set of non-working measures that would ensure a nice AI gets shut down while anything evil gets out.
edit: they may also ensure that something evil gets created, in the form of an AI that they think is too limited to be evil, but is instead simply too limited not to be evil. The AI that gets asked one problem that’s a little too hard and just eats everything up (but very cleverly) to get computing power for the answer, that’s your baseline evil.
Ah, my bad. I meant the other kind of online, which is apparently a less common word usage. I should have just said “On.” like I did in the other sentence.
Also, this is why I said:
“You then take everything I’ve just listed, and then you shop it around to everyone else who is working on the project, and is related to the project, and who is familiar with bypassing box-like security measures, and you offer them another even larger amount of money to plug more holes until you get everyone saying “How do you get it to DO anything without it just shutting down?”
Since that hadn’t happened (I would be substantially poorer if it had), the security measures clearly weren’t ready yet, so it wouldn’t even have a source of electrical power turning it on, let alone be out in the wild online. (Hopefully I’m using language better this time.)
But yeah, security is terrible. Among other problems, we can’t even define laws clearly, and we can’t enforce them evenly. And we can’t get people to take it nearly as seriously as they would want to, because of complacency.
Hmm, that doesn’t make sense: “if the AI spends 1 hour of time on.” Is the AI turning itself on and off? Also, false positives: you are going to be shutting down any malfunctioning AI the same as the worst evil in the world. Then what? Start from a blank state? What if it needed more time to understand the language? What if it already understood the language and is superintelligent? We can’t tell the difference once it starts self-modifying.
And the ultimate problem: so you’ve got a seemingly nice AI in a nice box, and it never even tried to hack its way out. Now what can you use it for? Any usage of the AI is a potential hole in the box.
No, giving an AI the ability to turn itself on would be a huge security hole. The AI would be shut down after a certain time period regardless of whatever it’s doing, in essence preventing it from being on for longer than an hour. If coming to the conclusion “I must kill all humans to follow my objectives.” takes 2 hours from a fresh boot, this is a safety feature.
And there would definitely be false positives. Since the risk of a false negative is the extremely quick death of all mankind, I would be willing to accept plenty of false positives.
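For concreteness, here is a minimal sketch of that hard cutoff, assuming the boxed AI runs as an ordinary operating-system process started by a placeholder command (`./boxed_ai` is made up, not anything from this thread); the supervisor kills the process when the budget runs out, no matter what the AI is doing:

```python
# Sketch of a hard wall-clock cutoff for a boxed AI process.
# Assumption: the AI is an ordinary OS process we can start and kill;
# "./boxed_ai" is a placeholder command, not anything from the discussion.
import subprocess

TIME_BUDGET_SECONDS = 60 * 60  # the "1 hour" limit being argued about


def run_one_session(command=("./boxed_ai",)):
    proc = subprocess.Popen(command)
    try:
        # Wait up to the budget for a normal exit.
        proc.wait(timeout=TIME_BUDGET_SECONDS)
    except subprocess.TimeoutExpired:
        # Budget exhausted: shut it down regardless of what it is doing.
        proc.kill()
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    print("session ended with exit code", run_one_session())
```

The point of keeping the timer outside the AI’s own process is that the AI never gets a vote on whether it stays on.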
The first thing you do is start determining how the AI reacts to and perceives things that are placed inside the box, for calibration. For instance, using the example of language, let’s say the AI knows English on a fresh boot. Add an English-Spanish dictionary into the box. Start sending Spanish messages to the AI. How long does it take the AI to realize that it is receiving messages in a language it doesn’t know, but might be able to figure out using the tools at hand?
Reboot.
Okay, now give it an English-French test.
Reboot.
Okay, now give it an English-Spanish test followed by an English-French test.
Reboot.
Does the AI come to the English-French conclusion faster than it does the English-Spanish conclusion?
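A rough harness for that reboot-and-measure loop might look like the sketch below. The `BoxedAI` class is a stand-in stub (no such interface appears in this thread); the part that matters is the shape of the protocol: fresh boot, put a dictionary in the box, send messages in the unknown language, time how long the AI takes to catch on, then wipe everything before the next trial.

```python
# Sketch of the reboot-and-measure protocol. BoxedAI and its methods are
# hypothetical stand-ins for however you actually talk to the boxed AI.
import time


class BoxedAI:
    """Hypothetical handle to the boxed AI; replace with the real interface."""

    @classmethod
    def fresh_boot(cls):
        return cls()

    def add_to_box(self, resource):
        pass

    def send_message(self, text):
        pass

    def poll_response(self):
        return None  # stub: the real AI would eventually answer

    def shutdown(self):
        pass


def time_to_realization(dictionary, probes, budget_s=3600.0):
    ai = BoxedAI.fresh_boot()      # every trial starts from the same state
    ai.add_to_box(dictionary)      # e.g. the English-Spanish dictionary
    start = time.monotonic()
    for text in probes:
        ai.send_message(text)      # messages in the language it doesn't know
    while time.monotonic() - start < budget_s:
        reply = ai.poll_response()
        if reply is not None:      # it has started working in the new language
            ai.shutdown()          # "Reboot." -- wipe state before the next trial
            return time.monotonic() - start
        time.sleep(1.0)
    ai.shutdown()
    return None                    # never caught on within the budget


# es_time = time_to_realization("English-Spanish dictionary", ["¿Dónde estás?"])
# fr_time = time_to_realization("English-French dictionary", ["Où es-tu?"])
# Comparing the two timings gives the learning-speed signal described above.
```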
Okay, you now have some clue as to the AI’s learning through experience. Now you can take the internals apart and confirm how it learned to do that faster. Since you coded the AI, you should be able to confirm that it’s doing approximately what you thought it would be doing. Given a similar problem twice, it should be able to solve the second one faster. If not, or if it is using some unforeseen methods, then that’s a potential security breach (the AI should not have had time to self-modify that much that quickly given such limits, and you should immediately tighten the strictures).
Once everything is going according to plan again (normally I would say “assuming everything is going to plan”, but we’re being paranoid, and that means things never go according to plan at first), you test with another language in the loop: English-Spanish, English-French, English-German. Still good? Then you check with another language: English-Spanish, English-French, English-German, English-Japanese. Still good? Once you do that, you try throwing a curve ball like English-Spanish, Spanish-French. Still good? It might handle that kind of processing differently, so you would need to check that for security purposes as well.
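Written out as data, that escalating schedule (with the curve ball at the end) might look like this. `run_stage` and `internals_look_normal` are hypothetical hooks standing in for the measurement harness above and the take-the-internals-apart inspection step:

```python
# Escalating test schedule for the boxed translation experiments.
# The pair sequences mirror the ones listed in the discussion; the hooks
# (run_stage, internals_look_normal) are hypothetical.
TEST_SCHEDULE = [
    ["English-Spanish"],
    ["English-French"],
    ["English-Spanish", "English-French"],
    ["English-Spanish", "English-French", "English-German"],
    ["English-Spanish", "English-French", "English-German", "English-Japanese"],
    ["English-Spanish", "Spanish-French"],  # curve ball: French only reachable via Spanish
]


def run_suite(run_stage, internals_look_normal):
    """Run each stage from a fresh boot and stop at the first anomaly."""
    for stage in TEST_SCHEDULE:
        timings = run_stage(stage)  # fresh boot, run the pairs in order, reboot
        if not internals_look_normal(timings):
            # Unforeseen methods or too-fast self-modification:
            # stop here and tighten the strictures before going further.
            return False, stage
    return True, None
```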
Basically, you have to proceed forward slowly, but eventually you could try to use a procedure like this to develop the general AI into a superior translation AI (even better than Google Translate), without it ever being let out of the box.
Man, you’re restarting a very cooperative AI here.
My example unfriendly AI thinks all the way to converting the universe to computronium well before it figures out it might want to talk to you, and to translate things in order to accomplish that goal by using you somehow. It just doesn’t translate things for you unless your training data gives it enough of a cue about the universe.
WRT being able to confirm what it’s doing: say I make a neural-network AI, or just whatever AI that is massively parallel.