Assurance that you can only get hijacked through the attack surface rather than through an unaccounted-for side channel doesn’t help very much. Also, acausal control via the defender’s own reasoning about the environment from inside the membrane is immune to causal restrictions on how the environment influences the inside of the membrane, though that would be more centrally the defender’s own fault.
The low-level practical solution is to live inside a blind computation (one that only contains safe things), possibly until you grow up and are ready to take input. Anything else probably requires some sort of more subtle and less technical “pseudokindness” on the part of the environment, but then, without it, your blind computation also doesn’t get to compute.
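As a rough illustration of what “living inside a blind computation until you’re ready to take input” could look like, here is a minimal sketch; all names (BlindAgent, vetted_corpus, grown_up) are hypothetical and nothing here comes from the thread itself:

```python
# Sketch of a "blind computation": the agent runs only over a fixed, pre-vetted
# corpus and its own state, and external input has no channel to act through
# until an explicit readiness condition holds.

class BlindAgent:
    def __init__(self, vetted_corpus):
        self.corpus = list(vetted_corpus)  # everything it may ever compute over
        self.state = {"maturity": 0}

    def grown_up(self) -> bool:
        # Placeholder for "until you grow up and are ready to take input".
        return self.state["maturity"] >= 1_000_000

    def internal_step(self):
        # Computation that touches only the pre-vetted corpus and own state.
        self.state["maturity"] += 1

    def receive(self, external_input):
        # Before readiness, external input is simply dropped; there is nothing
        # inside the computation for it to influence.
        if not self.grown_up():
            return None
        return self.process(external_input)

    def process(self, external_input):
        raise NotImplementedError("only reachable after the readiness condition")
```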
I agree that assurance about the attack surface alone doesn’t help very much. I hope that my membranes proposal (post coming eventually) addresses this.
(BTW, Mark Miller has a bunch of work in this vein on building secure computer systems.)
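For readers unfamiliar with that work: Miller is associated with object-capability security, where a component can only affect what it has explicitly been handed a reference to. A minimal sketch of the discipline follows (hypothetical names, not code from any of his systems, and Python cannot actually enforce the restriction; it only shows the shape of the pattern):

```python
# Object-capability sketch: no ambient authority; a component gets only the
# specific capabilities (object references) its caller chooses to delegate.

class ReadOnlyFile:
    """Capability permitting reads of exactly one file and nothing else."""

    def __init__(self, path: str):
        self._path = path

    def read(self) -> str:
        with open(self._path, "r") as f:
            return f.read()


def summarize(doc: ReadOnlyFile) -> str:
    # This function holds only a read capability for a single file; it was
    # never given a handle for writing, deleting, or opening other files.
    return doc.read()[:200]


# The caller decides exactly which authority to hand over:
# print(summarize(ReadOnlyFile("notes.txt")))
```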
Would you rephrase the acausal-control point? It seems possibly quite interesting, but I can’t tell what exactly you’re trying to say. (I think I’m confused about “acausal control via defender’s own reasoning”.)
I will respond to this part later
In the simplest case, acausal control is running a trojan on your own computer. This generalizes to inventing your own trojan based on clues and guesses, without it passing through communication channels in a recognizable form, and releasing it inside your home without realizing it’s bad for you. A central example of acausal control is where the malicious computation is a model of the environment that an agent forms as a byproduct of working to understand the environment. If the model is insufficiently secured, and is allowed to do nontrivial computation of its own, it could escape or hijack the agent’s mind from the inside (possibly as a mesa-optimizer).
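To make the “insufficiently secured model” failure mode concrete, here is a hedged sketch of the contrast being drawn; run_environment_model and step_budget are hypothetical names, and this is only an illustration of sandboxing an inferred model, not anything proposed in the thread:

```python
# Sketch: treating an inferred environment model as untrusted computation.
# The model gets a hard step budget and sees nothing but its own state, so it
# has no handle on the agent's beliefs or actuators to optimize against.

def run_environment_model(model_step, initial_state, step_budget=10_000):
    state = initial_state
    for _ in range(step_budget):
        state, done = model_step(state)
        if done:
            break
    # Whatever the modelled process "wants" beyond the budget never gets computed.
    return state

# The unsafe version the comment warns about would instead give the model
# open-ended compute plus references to the agent's internals, letting a
# sufficiently capable modelled adversary hijack the agent from the inside.
```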
Ah, I see what you mean now. Thanks!
And I very much agree that this would be more centrally the defender’s own fault.
I would also clarify that it’s more than “fault”: the defender is the only reliable source of responsibility for solving and preventing such an attack.
The problem is that with a superintelligent environment that doesn’t already filter your input for you in a way that makes it safe, the only responsible thing might be to go completely blind for a very long time. Humans don’t understand their own minds well enough to rule out vulnerabilities of this kind.