I am not sure of myself, here, but I would expect a malicious AI to do the following. The first few (or many) times you run it, tell you the optimal stock. Then once in a while give a non-optimal stock. You would be unable to determine whether the AI was simply not turned on those times, or was not quite intelligent/​resourceful enough to find the right stock. It may be that you would want the profits to continue.
By allowing itself to give you non-optimal stocks (but still making you rich), the AI can transmit information, such as its location, to anyone who would be looking at your pattern of buying stocks. And people would look at it, since you would be consistently buying the most profitable stock, with few exceptions. Once the location of the AI is known, you are in trouble, and someone less scrupulous than you may get their hand on the AI. Humans are dead in a fortnight.
Admittedly, this is a somewhat far-fetched scenario, but I believe that it indicates that you should not ask the AI more than one (or a few) questions before permanently destroying it. Even deleting all of its data and running the code again from scratch may be dangerous if the AI is able to determine how many times it has been launched in the past.
The AIs motivations can be precisely controlled. In fact such an AI can be limited to purely prediction. It would have no agency. No motivations or goals whatsoever. It just tries to predict what the price of each stock will be the next day.
For such a task, the AI doesn’t need any of the reduced impact stuff described here. That stuff becomes relevant in more complicated domains, like controlling a robot body in the real world. Say, to do something simple like collect paperclips that have fallen on the floor.
In such a domain you might want to limit it to just predicting what a human would do if they were controlling the robot. Not find the absolute optimal sequence of actions. Which might involve running away, building more robots, and taking over the world. Then building as many paperclip factories as possible.
AIXI is controllable in this way. Or at least the Solomonoff induction part, which just predicts the future. You could just use it to see what the future will be. The dangerous optimization only comes in later. When you put another program on top of it that searches for the optimal sequence of actions to get a certain outcome. An outcome we might not want.
As far as I can tell, all the proposals for AI control require the ability to use the AI like this. As an optimizer or predictor for an arbitrary goal. Which we can control, if only in a restricted sense. If the AI is fundamentally malicious and uncontrollable, there is no way to get useful work out of it. Let alone use it to build FAI.
I am not sure of myself, here, but I would expect a malicious AI to do the following. The first few (or many) times you run it, tell you the optimal stock. Then once in a while give a non-optimal stock. You would be unable to determine whether the AI was simply not turned on those times, or was not quite intelligent/​resourceful enough to find the right stock. It may be that you would want the profits to continue.
By allowing itself to give you non-optimal stocks (but still making you rich), the AI can transmit information, such as its location, to anyone who would be looking at your pattern of buying stocks. And people would look at it, since you would be consistently buying the most profitable stock, with few exceptions. Once the location of the AI is known, you are in trouble, and someone less scrupulous than you may get their hand on the AI. Humans are dead in a fortnight.
Admittedly, this is a somewhat far-fetched scenario, but I believe that it indicates that you should not ask the AI more than one (or a few) questions before permanently destroying it. Even deleting all of its data and running the code again from scratch may be dangerous if the AI is able to determine how many times it has been launched in the past.
The AIs motivations can be precisely controlled. In fact such an AI can be limited to purely prediction. It would have no agency. No motivations or goals whatsoever. It just tries to predict what the price of each stock will be the next day.
For such a task, the AI doesn’t need any of the reduced impact stuff described here. That stuff becomes relevant in more complicated domains, like controlling a robot body in the real world. Say, to do something simple like collect paperclips that have fallen on the floor.
In such a domain you might want to limit it to just predicting what a human would do if they were controlling the robot. Not find the absolute optimal sequence of actions. Which might involve running away, building more robots, and taking over the world. Then building as many paperclip factories as possible.
AIXI is controllable in this way. Or at least the Solomonoff induction part, which just predicts the future. You could just use it to see what the future will be. The dangerous optimization only comes in later. When you put another program on top of it that searches for the optimal sequence of actions to get a certain outcome. An outcome we might not want.
As far as I can tell, all the proposals for AI control require the ability to use the AI like this. As an optimizer or predictor for an arbitrary goal. Which we can control, if only in a restricted sense. If the AI is fundamentally malicious and uncontrollable, there is no way to get useful work out of it. Let alone use it to build FAI.
We can use various indifference methods to make the AI not care about these other answers.