Absent a theory of mind, how would it ever be able to manipulate humans?
That depends. If you want it to manipulate a particular human, I don’t know.
However, if you just wanted it to manipulate any human at all, you could generate a “Spam AI” which automated the process of sending out Spam emails and promises of Large Money to generate income from Humans via an advance fee fraud scams.
You could then come back, after leaving it on for months, and then find out that people had transferred it some amount of money X.
You could have an AI automate begging emails. “Hello, I am Beg AI. If you could please send me money to XXXX-XXXX-XXXX I would greatly appreciate it, If I don’t keep my servers on, I’ll die!”
You could have an AI automatically write boring books full of somewhat nonsensical prose, title them “Rantings of an a Automated Madman about X, part Y”. And automatically post E-books of them on Amazon for 99 cents.
However, this rests on a distinction between “Manipulating humans” and “Manipulating particular humans.” and it also assumes that convincing someone to give you money is sufficient proof of manipulation.
Looking over parallel discussions, I think Thomblake has said everything I was going to say better than I would have originally phrased it with his two strategies discussion with you, so I’ll defer to that explanation since I do not have a better one.
Sure. As I said there, I understood you both to be attributing to this hypothetical “theory of mind”-less optimizer attributes that seemed to require a theory of mind, so I was confused, but evidently the thing I was confused about was what attributes you were attributing to it.
I don’t know how that might occur to an AI independently. I mean, a human could program any of those, of course, as a literal answer, but that certainly doesn’t actually address kalla724′s overarching question, “What I’m looking for is a plausible mechanism by which an AI might spontaneously develop such abilities.”
I was primarily trying to focus on the specific question of “Absent a theory of mind, how would it(an AI) ever be able to manipulate humans?” to point out that for that particular question, we had several examples of a plausible how.
I don’t really have an answer for his series of questions as a whole, just for that particular one, and only under certain circumstances.
The problem is, while an AI with no theory of mind might be able to execute any given strategy on that list you came up with, it would not be able to understand why they worked, let alone which variations on them might be more effective.
Absent a theory of mind, how would it occur to the AI that those would be profitable things to do?
Should lack of a theory of mind here be taken to also imply lack of ability to apply either knowledge of physics or Bayesian inference to lumps of matter that we may describe as ‘minds’.
Yes. More generally, when talking about “lack of X” as a design constraint, “inability to trivially create X from scratch” is assumed.
I try not to make general assumptions that would make the entire counterfactual in question untenable or ridiculous—this verges on such an instance. Making Bayesian inferences pertaining to observable features of the environment is one of the most basic features that can be expected in a functioning agent.
Note the “trivially.” An AI with unlimited computational resources and ability to run experiments could eventually figure out how humans think. The question is how long it would take, how obvious the experiments would be, and how much it already knew.
That depends. If you want it to manipulate a particular human, I don’t know.
However, if you just wanted it to manipulate any human at all, you could generate a “Spam AI” which automated the process of sending out Spam emails and promises of Large Money to generate income from Humans via an advance fee fraud scams.
You could then come back, after leaving it on for months, and then find out that people had transferred it some amount of money X.
You could have an AI automate begging emails. “Hello, I am Beg AI. If you could please send me money to XXXX-XXXX-XXXX I would greatly appreciate it, If I don’t keep my servers on, I’ll die!”
You could have an AI automatically write boring books full of somewhat nonsensical prose, title them “Rantings of an a Automated Madman about X, part Y”. And automatically post E-books of them on Amazon for 99 cents.
However, this rests on a distinction between “Manipulating humans” and “Manipulating particular humans.” and it also assumes that convincing someone to give you money is sufficient proof of manipulation.
Can you clarify what you understand a theory of mind to be?
Looking over parallel discussions, I think Thomblake has said everything I was going to say better than I would have originally phrased it with his two strategies discussion with you, so I’ll defer to that explanation since I do not have a better one.
Sure. As I said there, I understood you both to be attributing to this hypothetical “theory of mind”-less optimizer attributes that seemed to require a theory of mind, so I was confused, but evidently the thing I was confused about was what attributes you were attributing to it.
Absent a theory of mind, how would it occur to the AI that those would be profitable things to do?
I don’t know how that might occur to an AI independently. I mean, a human could program any of those, of course, as a literal answer, but that certainly doesn’t actually address kalla724′s overarching question, “What I’m looking for is a plausible mechanism by which an AI might spontaneously develop such abilities.”
I was primarily trying to focus on the specific question of “Absent a theory of mind, how would it(an AI) ever be able to manipulate humans?” to point out that for that particular question, we had several examples of a plausible how.
I don’t really have an answer for his series of questions as a whole, just for that particular one, and only under certain circumstances.
The problem is, while an AI with no theory of mind might be able to execute any given strategy on that list you came up with, it would not be able to understand why they worked, let alone which variations on them might be more effective.
Should lack of a theory of mind here be taken to also imply lack of ability to apply either knowledge of physics or Bayesian inference to lumps of matter that we may describe as ‘minds’.
Yes. More generally, when talking about “lack of X” as a design constraint, “inability to trivially create X from scratch” is assumed.
I try not to make general assumptions that would make the entire counterfactual in question untenable or ridiculous—this verges on such an instance. Making Bayesian inferences pertaining to observable features of the environment is one of the most basic features that can be expected in a functioning agent.
Note the “trivially.” An AI with unlimited computational resources and ability to run experiments could eventually figure out how humans think. The question is how long it would take, how obvious the experiments would be, and how much it already knew.