depending on how seriously you take it, such analysis might still take place, hard and prolonged though it may be).
Yup, agreed that it might. And agreed that it might succeed, if it does take place.
One can have “detectors” in place set to find specific behaviors, but these would have assumptions that could easily fail. Detectors that would still be useful would be macro ones—where it tries to access and how—but these would provide only limited insight into the AI’s thought process.
Agreed on all counts.
Re: what the AI knows… I’m not sure how to move forward here. Perhaps what’s necessary is a step backwards.
If I’ve understood you correctly, you consider “having a conversation” to encompass exchanges such as:
A: “What day is it?”
B: “Na ni noo na”
If that’s true, then sure, I agree that the minimal set of information about humans required to do that is zero; hell, I can do that with the rain.
And I agree that a system that’s capable of doing that (e.g., the rain) is sufficiently unlikely to be capable of effective deception that the hypothesis isn’t even worthy of consideration.
I also suggest that we stop using the phrase “having a conversation” at all, because it does not convey anything meaningful.
Having said that… for my own part, I initially understood you to be talking about a system capable of exchanges like:
A: “What day is it?”
B: “Day seventeen.”
A: “Why do you say that?”
B: “Because I’ve learned that ‘a day’ refers to a particular cycle of activity in the lab, and I have observed seventeen such cycles.”
A system capable of doing that, I maintain, already knows enough about humans that I expect it to be capable of deception. (The specific questions and answers don’t matter to my point; I can choose others if you prefer.)
My point was that the AI is likely to start performing social experiments well before it is capable of even that conversation you depicted. It wouldn’t know how much it doesn’t know about humans.
(nods) Likely.
And I agree that humans might be able to detect attempts at deception in a system at that stage of its development. I’m not vastly confident of it, though.
I have likewise adjusted down my confidence that this would be as easy or as inevitable as I previously anticipated. Thus I would no longer say I am “vastly confident” in it, either.
Still good to have this buffer between making an AI and total global catastrophe, though!
Sure… a process with an N% chance of global catastrophic failure is definitely better than a process with N+delta% chance.