In the same way people’s minds do: they are inconsistent, but they notice the setup very quickly and stop. (I don’t find Dutch book arguments very convincing, really.)
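To make the “running in circles” point concrete, here is a toy money-pump sketch (all names and numbers hypothetical, not anyone’s actual model of an agent): an agent with cyclic preferences A > B > C > A pays a small fee for each “upgrade” and can be walked in circles, unless it notices the repetition and refuses further trades, roughly the “notice the setup and stop” behaviour described above.

```python
# Toy money pump against an agent with cyclic preferences A > B > C > A.
# Purely illustrative; all names here are made up for the sketch.

FEE = 1.0
# For each item the agent holds, the item it strictly prefers (cyclically).
PREFERRED_SWAP = {"B": "A", "C": "B", "A": "C"}


def naive_agent(held, offered, history):
    """Accepts any trade to a strictly preferred item, ignoring history."""
    return PREFERRED_SWAP[held] == offered


def wary_agent(held, offered, history):
    """Accepts preferred trades, but stops once the same offer repeats."""
    if PREFERRED_SWAP[held] != offered:
        return False
    # "Notice the setup": refuse once this (held, offered) pair has been seen before.
    return (held, offered) not in history


def run_pump(agent, rounds=9):
    held, money, history = "A", 0.0, set()
    for _ in range(rounds):
        offered = PREFERRED_SWAP[held]
        if not agent(held, offered, history):
            break
        history.add((held, offered))
        held, money = offered, money - FEE  # pays a small fee for each "upgrade"
    return money


print(run_pump(naive_agent))  # -9.0: pumped around the cycle three times
print(run_pump(wary_agent))   # -3.0: loses once around the cycle, then stops
```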
Seems like a layer of inefficiency to have to resist the temptation to run in circles rather than just wanting to go uphill.
There are two issues:
(a) In what settings do you want an architecture like that, and
(b) Ethics dictate that we don’t just replace entities for the sake of efficiency even if they disagree; replacing them purely for efficiency is the reasoning that leads to KILL ALL HUMANS. So we might get an architecture like that because of how history played out, and then it’s just a brute fact.
I am guessing (a) has to do with “robustness” (I am not prepared to mathematise what I mean yet, but I am thinking about it).
People who think about UDT/blackmail are thinking precisely about how to win in the settings I am talking about.
Pick a side of this fence: will AI resist running in circles trivially, or is its running in circles all that is saving us from KILL ALL HUMANS objectives, as you say in (b)?
If the latter, we are so utterly screwed.