Yeah, right. Anyone can attempt muggings this way. You don’t need bounded utility to defeat it, just a sense for bullshit.
Aye, but the interesting question is: how do you teach an AI to discount this without it then concluding that there is no threat from uFAI, asteroid collisions, nanotechnology, and black swan catastrophes?
What went on in my head does not seem terribly complicated:
Hypothesis: he’s just saying that so that I will send him my money!
What other reason is there to believe him?
No reason.
Hypothesis promotion.
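To make that dialogue concrete: a minimal Bayesian sketch in Python, with every number invented purely for illustration; nothing in the thread fixes these priors.

```python
# A sketch of the hypothesis comparison above. All numbers are
# invented for illustration; nothing here is calibrated.

# Prior that a random stranger genuinely commands 3^^^^3 lives.
p_honest = 1e-20

# Prior that someone would say this just to extract money.
p_just_saying = 1e-4

# The spoken threat is equally likely under either hypothesis, so the
# likelihoods cancel and the posterior odds equal the prior odds.
posterior_odds = p_just_saying / p_honest

print(f"Odds favoring 'just saying that': {posterior_odds:.1e}")
# -> 1.0e+16, i.e. the mundane hypothesis gets promoted.
```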
Upvoted this and the parent because I think this is an issue that is useful to address. What goes on in your head NEVER seems terribly complicated. See: http://lesswrong.com/lw/no/how_an_algorithm_feels_from_inside/
The interestingness of the issue comes not from how a random person would commonsensically be able to resist the mugging, but in how you ACTUALLY define the mental traits that allow you to resist it. What you call a sense for bullshit is, more strictly, a sense for claims that have extremely low probability. How does this sense work? How would you go about defining it, and how would you go about making an AI with a sense for bullshit? This is CRUCIAL, and just saying that you won’t get fooled by this because of a sense for bullshit that seems internally simple doesn’t cut it.
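One candidate formalization, offered as my own illustrative sketch rather than anything this thread establishes: let the prior probability of a claim fall at least as fast as the stakes it asserts grow, so that expected payoff stays bounded even though the utility function is not. The functional form and the baseline constant below are assumptions.

```python
# Illustrative sketch of a stake-penalized prior. The 1/stakes form
# and the baseline constant are assumptions, not an established rule.

def prior(claimed_stakes: float, baseline: float = 1e-10) -> float:
    """Prior on a claim, penalized in proportion to its claimed stakes.

    If the prior falls at least as fast as 1/stakes, no mugger can
    inflate the expected payoff just by naming a bigger number.
    """
    return baseline / claimed_stakes

def expected_loss(claimed_stakes: float) -> float:
    # Probability the threat is real, times the harm if it is.
    return prior(claimed_stakes) * claimed_stakes

for stakes in (1e6, 1e100, 1e300):
    print(f"stakes {stakes:.0e} -> expected loss {expected_loss(stakes):.1e}")
# Every threat, however grandiose, carries the same tiny expected loss.
```

On this scheme the mugger gains nothing by inflating the threat, while evidence-backed risks such as asteroid strikes escape the penalty through ordinary updating, which speaks to the earlier worry about an AI discounting real catastrophes.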
“What goes on in your head NEVER seems terribly complicated.”

Speaking personally, that just isn’t true.
“How would you go about defining [bullshit], and how would you go about making an AI with a sense for bullshit?”

You do that by solving the general-purpose inductive inference problem. Once you have a solution to that, the rest of the problem will gradually unravel.
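For what a solution might look like in the Solomonoff idealization: each hypothesis h gets prior weight 2^-K(h), where K(h) is its shortest description length. True Kolmogorov complexity is uncomputable, so the toy below uses raw string length as a crude stand-in, and the “programs” are invented for illustration.

```python
# Toy Solomonoff-style complexity prior. Kolmogorov complexity is
# uncomputable; string length below is a crude, invented stand-in.

hypotheses = {
    # plain-language hypothesis -> stand-in "program" describing it
    "he's just saying that": "lie_for_money",
    "he runs a Matrix and will torture 3^^^^3 people":
        "simulator_overlord_tortures_3^^^^3_people_if_unpaid",
}

def complexity_prior(program: str) -> float:
    """Weight 2**-len(program): shorter descriptions get more mass."""
    return 2.0 ** -len(program)

total = sum(complexity_prior(p) for p in hypotheses.values())
for description, program in hypotheses.items():
    share = complexity_prior(program) / total
    print(f"{description}: normalized prior {share:.3e}")
# The mundane hypothesis dominates on description length alone.
```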