All very good points. I completely agree. But I don’t yet know how to approach the harder problem you state. If physics is known perfectly and the initial AI uses a proof checker, we’re done, because math stays true even after a trillion steps. But unknown physics could always turn out to be malicious in exactly the right way to screw up everything.
If physics is known perfectly and the first generation uses a proof checker to create the second, we’re done.
No, since you still run the risk of tiling the future with problem-solving machinery of no terminal value that never actually decides (and kills everyone in the process; it might even come to a good decision afterwards, but it’ll be too late for some of us: the Friendly AI of Doom that visibly cares only about Friendliness staying provable, not about people, because it’s not yet ready to make a Friendly decision).
Also, an FAI must already know physics perfectly (with uncertainty parametrized by observations). Problem of induction: observations are always interpreted according to a preexisting cognitive algorithm (more generally, a logical theory). If the AI doesn’t have the same theory of the environment as we do, it’ll draw different conclusions about the nature of the world than we would, given the same observations, and that’s probably not for the best if it’s to make optimal decisions according to what we consider real. Just as no moral arguments can persuade an AI to change its values, no observations can persuade an AI to change its idea of reality.
But unknown physics could always turn out to be malicious in exactly the right way to screw up everything.
The presence of uncertainty is rarely a valid argument against the possibility of making an optimal decision. You just make the best decision you can find given the uncertainty you’re dealt. Uncertainty is part of the problem anyway, and can just as well be treated with precision.
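For concreteness, here is a minimal sketch (not part of the original thread) of what “treating uncertainty with precision” can look like: fold the uncertainty into the decision problem as a probability distribution over states and pick the action with the highest expected utility. The state names, actions, and numbers are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not from the thread):
# uncertainty is represented as a probability distribution over states,
# and the "best decision given the uncertainty you're dealt" is the
# action with the highest expected utility.

# Hypothetical probabilities over how physics might turn out.
p_states = {"physics_A": 0.7, "physics_B": 0.3}

# Hypothetical utilities of each action in each state.
utility = {
    ("act_now", "physics_A"): 10, ("act_now", "physics_B"): -5,
    ("wait",    "physics_A"):  2, ("wait",    "physics_B"):  2,
}

def expected_utility(action):
    """Average the action's utility over the uncertain states."""
    return sum(p * utility[(action, state)] for state, p in p_states.items())

best = max(["act_now", "wait"], key=expected_utility)
print(best, expected_utility(best))  # act_now 5.5
```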
Also, an interesting thing happens if, by the whim of its creator, the computer is given the goal of tiling the universe with the most common still life in it, and the universe is possibly infinite. It can be expected that the computer will send out a slower-than-light “investigation front” to count the still lifes it encounters. Meanwhile, it will have more and more space to devote to predicting possible threats to its mission. If it is sufficiently advanced, it will notice the possibility that other agents exist, which will naturally lead it to simulate possible interactions with non-still life, and to the idea that it could be deceived into believing that its “investigation front” has reached the borders of the universe. Etc...
Too smart to optimize.
One year and one level-up (thanks to ai-class.com) after this comment, I’m still in the dark about why my comment above was downvoted.
I’m sorry for whining, but my curiosity got the better of me. Any comments?
It wasn’t me, but I suspect the poor grammar didn’t help. It makes it hard to understand what you were getting at.
Thank you. It is something I can use for improvement.
Can you point out the flaws? I can see that my sentence structure is overcomplicated, but I don’t know how it comes across to native English speakers. Foreigner? Dork? Grammar Illiterate? I appreciate any feedback. Thanks.
Actually, a bit of all three. The one you can control the most is probably “dork”, which unpacks as “someone with complex ideas who is too impatient/show-offy to explain their idiosyncratic jargon”.
I’m a native English speaker, and I know that I still frequently sound “dorky” in that sense when I try to be too succinct.
That’s valuable information, thanks. I underestimated the relative weight of communication style in the feedback I got.
Respectfully, I don’t know what the first sentence of that comment means. In particular, I don’t know what “most common still life” means. That made it difficult to decipher the rest of the comment.
ETA: Thanks to the comment below, I understand a little better, but now I’m not sure what motivates invoking the possibility of other agents, given that the discussion was about proving Friendliness.
In a cellular automaton, a still life is a pattern of cells which stays unchanged after each iteration.
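To make the definition concrete, here is a minimal sketch (not part of the original thread) that checks the property in Conway’s Game of Life, the usual example of such an automaton: a pattern, padded with a ring of dead cells, is a still life exactly when one update step leaves the padded grid unchanged. The function names and the choice of the B3/S23 rule are assumptions for illustration.

```python
# Minimal sketch (illustrative, assumes Conway's Game of Life rules, B3/S23):
# a still life is a pattern that one update step leaves unchanged.

def step(grid):
    """Apply one Game of Life step to a 2D list of 0/1 cells (dead outside)."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count live neighbours; cells beyond the grid edge count as dead.
            n = sum(
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ) - grid[r][c]
            # Birth on exactly 3 neighbours, survival on 2 or 3.
            new[r][c] = 1 if n == 3 or (grid[r][c] == 1 and n == 2) else 0
    return new

def is_still_life(pattern):
    """True if the pattern, surrounded by dead cells, is unchanged by one step."""
    rows, cols = len(pattern), len(pattern[0])
    padded = [[0] * (cols + 2) for _ in range(rows + 2)]
    for r in range(rows):
        for c in range(cols):
            padded[r + 1][c + 1] = pattern[r][c]
    return step(padded) == padded

# The 2x2 block, reportedly the most common still life in random Life runs.
print(is_still_life([[1, 1],
                     [1, 1]]))    # True
# A blinker oscillates rather than staying fixed, so it is not a still life.
print(is_still_life([[1, 1, 1]]))  # False
```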
Since you asked: your downvoted comment seems like word salad to me; I can’t see what sensible reasoning would motivate it.