I think it is probably rare that a person’s intuitions on something, in the absence of clear evidence, are very reliable. I was trying to think of ways to resolve these dilemmas given that, and came up with two ideas.
The first is to try to think of some test, preferably one that is easy and simple given your current capacities. If a test isn’t possible, you could try comparing what you THINK would happen at the end of the test, just to make sure you weren’t having a disagreement over words, or holding a belief within a belief, or something along those lines. That latter part could get sketchy, but if you found yourself defending your view against falsification, that’d be an indicator something is wrong.
The second would be to accept that neither of you is likely to be right in the absence of such a test. That isn’t enough, however: you should be more concerned about how close each of you is to the right answer. Something like the following might be good for working that out:
Are there any areas where both your intuitions predict the same thing? If so, what OTHER solutions would also predict it?
Is there another idea that could subsume both intuition spaces? It wouldn’t be exact, and exactness IS a virtue, but it could help with deanchoring and searching the probability space.
Are your ideas conditional? (“I believe A will happen if B, and you believe C will happen if D”) If so, is there a more general idea that could explain each under its own conditions?
Again, you’d be looking for exactness here, and I think that finding a test is far preferable to simply comparing your intuitions, all things being equal.
In the case of AGI, simple tests could come from extant AIs, humans, or animals. These tests wouldn’t be perfect; one could ALWAYS object that we aren’t talking about AGI with these tests, but they could at least serve to direct our intuitions: how often do “properly raised” people/animals become “benevolent”? How often does “tool AI” currently engage in “non-benevolent” behavior? How successful are attempts to influence and/or encode values across species or systems? Obviously some disambiguation is necessary, but it seems like letting empirical tests guide one’s intuitions creates the best-case scenario for forming accurate beliefs short of actually having the answer in front of you.