I’m always intrigued by these experiments. If the box AI is not confirmed to be friendly, everything it says and promises is absolutely unreliable. I don’t see how the arguments of such an entity could be at all convincing.
Good point.
But if you knew anything about the process leading up to the development of successful AI, you’d have some beliefs about how likely the AI is to perpetrate a ruse for the purpose of escaping.
But I get the difficulty: how well do you have to understand a being’s nature before you feel confident in predicting its motivations/values?
So the key to containing an AI is to have a technologically-ignorant rationalist babysit it?
No more unreliable than the things humans say, which convince you all the same.
Important difference: we can assume that other humans are probably like us.