All such proposals work according to this scheme:
1. Humans are confused about anthropic reasoning.
2. In our confusion, we assume that some course of action is reasonable to take.
3. We conclude that an AI will, by default, be confused about anthropic reasoning in exactly the same way and will therefore reach the same conclusion.
Trying to speculate from your own ignorance and confusion is not a systematic way of building accurate map-territory relations. We should in fact stop doing it, no matter how pleasant the wishful thinking is.
My default hypothesis is that an AI won't even be bothered by the simulation arguments that are mind-boggling to us. We would have to specifically design the AI to be muggable in this way, which would also introduce a huge flaw in the AI's reasoning ability, exploitable in other ways, most of which would lead to horrible consequences.
I have similar thoughts, though perhaps for a different reason. There are all these ideas about acausal trade, acausal blackmail, multiverse superintelligences shaping the “universal prior”, and so on, which have a lot of currency here. They have some speculative value; they would have even more value as reminders of the unknown, and the conceptual novelties that might be part of a transhuman intelligence’s worldview; but instead they are elaborated in greatly varied (and yet, IMO, ill-founded) ways, by people for whom this is the way to think about superintelligence and the larger reality.
It reminds me of the pre-2012 situation in particle physics, in which it was correctly anticipated that the Higgs boson exists, but it was also incorrectly expected to be accompanied by other new particles and a new symmetry involved in stabilizing its mass. Thousands, maybe tens of thousands, of papers were produced proposing specific detectable new symmetries and particles that could provide this mechanism. Instead, only the Higgs has shown up, and people are mostly in search of a different mechanism.
The analogy for AI would be: important but more straightforward topics have been neglected in favor of these fashionable possibilities, and, when reality does reveal a genuinely new aspect, it may be something quite different from what is being anticipated here.
This proposal doesn't depend on mugging the AI. The proposal actually gets the AI more resources in expectation due to a trade.
I agree the post is a bit confusing and unclear about this. (And the proposal under “Can we get more than this” is wrong. At a minimum, such AIs will also be mugged by everyone else, meaning you get huge amounts of extra money basically for free.)
This doesn't seem like a fair trade proposal to me. It is a bet where one side has a disproportionate amount of information and uses it to its own benefit.
Suppose I tossed a fair coin, looked at the outcome, and then proposed that you bet on Heads at 99:1 odds. Would it be reasonable for you to agree?
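To make the information asymmetry concrete, here is a minimal sketch in Python (my own numbers follow the example above; the key assumption is that the proposer, having seen the coin, only offers the bet after it lands Tails):

```python
import random

def simulate(trials=100_000, seed=0):
    """Compare the naive expected value of a 99:1 bet on Heads with the
    expected value conditional on the informed proposer actually offering it."""
    rng = random.Random(seed)
    naive_total = 0      # payoff if the bet were offered unconditionally
    offered_total = 0    # payoff across bets the proposer actually offers
    offered_count = 0
    for _ in range(trials):
        heads = rng.random() < 0.5
        payoff = 99 if heads else -1   # taker stakes 1 unit on Heads at 99:1
        naive_total += payoff
        if not heads:                  # assumed strategy: offer only after Tails
            offered_total += payoff
            offered_count += 1
    print(f"naive EV per bet:          {naive_total / trials:+.2f}")          # roughly +49
    print(f"EV per bet when offered:   {offered_total / offered_count:+.2f}")  # exactly -1.00

simulate()
```

Unconditionally the bet looks like free money (about +49 per unit staked), but conditional on the informed party choosing to offer it, the taker's expected payoff is simply −1.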