I think the relative difficulty of hacking AI(x-1) and AI(x-2) will be sensitive to how much emphasis you put on the "distribute AI(x-1) quickly" part. I.e., if you rush it, you might make it worse, even if AI(x-1) has the potential to be more secure.
(Also, there is the "single point of failure" effect, though it's unclear how large it is.)
Right, and AI(x-1) would be easier to hack, since it shares the same adversarial examples as its predecessor, right?
Oh, wait, I see what you're saying. No, I think hacking AI(x-1) and AI(x-2) will both be trivial. AIs have basically zero security right now.
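(To make the shared-vulnerability point concrete, here is a minimal sketch, assuming PyTorch, with toy models as hypothetical stand-ins for the two AI generations; every size and constant is illustrative. It shows the standard FGSM attack: a perturbation crafted against one model will often transfer to a closely related successor, which is why both could be "hacked" the same way.)

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# "AI(x-2)": a toy classifier standing in for the older model.
model_prev = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

# "AI(x-1)": the successor, here just a lightly perturbed copy of its
# predecessor (a crude stand-in for fine-tuning from the same base).
model_next = copy.deepcopy(model_prev)
with torch.no_grad():
    for p in model_next.parameters():
        p.add_(0.01 * torch.randn_like(p))

x = torch.randn(1, 20)
label = model_prev(x).argmax(dim=1)  # whatever the old model predicts

# FGSM: one signed-gradient step on the input, crafted against the
# OLD model only.
x_adv = x.clone().requires_grad_(True)
loss = nn.functional.cross_entropy(model_prev(x_adv), label)
loss.backward()
x_adv = (x_adv + 0.5 * x_adv.grad.sign()).detach()

# The attack targeted model_prev, yet it frequently flips
# model_next's prediction too: the shared-weakness point above.
print("old model fooled:", bool(model_prev(x_adv).argmax(dim=1) != label))
print("new model fooled:", bool(model_next(x_adv).argmax(dim=1) != label))
```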