I AM curious if you have any modeling more than “could be anything at all!” for the idea of an unknown goal.
No.
I could say—Christian God or aliens. And you would say—bullshit. And I would say—argument from ignorance. And you would say—I don’t have time for that.
So I won’t say.
We can approach this from a different angle. Imagine an unknown goal that, according to your beliefs, AGI would really care about. And accept the fact that there is a possibility that it exists. Absence of evidence is not evidence of absence.
I think this may be our crux. Absence of evidence, in many cases, is evidence (not proof, but updateable Bayesian evidence) of absence. I think we agree that true goals are not fully introspectable by the agent. I think we disagree about whether some distributions of goals fit better than others, and about whether there's any evidence that can be used to understand goals, even if not fully understanding them at the source-code level.
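(A minimal sketch of the Bayesian point, with hypothetical symbols: let $H$ be the goal-hypothesis and $E$ an observation the hypothesis predicts. If $H$ makes $E$ more likely, then failing to observe $E$ must lower the probability of $H$:

$$P(H \mid \neg E) \;=\; \frac{P(\neg E \mid H)\,P(H)}{P(\neg E)} \;<\; P(H) \quad \text{whenever } P(E \mid H) > P(E),$$

since $P(\neg E \mid H) = 1 - P(E \mid H) < 1 - P(E) = P(\neg E)$. The update can be small, but it is nonzero unless the hypothesis predicts nothing observable.)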
Thanks for the discussion!
This conflicts with Gödel's incompleteness theorems, Fitch's paradox of knowability, and black swan theory.
The concept of an experiment relies on this principle.
And this is exactly what scares me: people who work with AI hold unscientific beliefs. I consider this to be an existential risk.
You may believe so, but an AGI would not.
Thanks to you too!