I assume “If you’ve somehow figured out how to do a pivotal act” is intended to limit scope, but doesn’t that smuggle the hardness of the Hard Task™ out of the equation?
Every time I ask myself how this approach would address a given issue, I find myself having to defer to the definition of the pivotal act, which is the thing that’s been defined as out of scope.
You need at least a certain amount of transfer in order to actually do your pivotal act. An “AI” with literally zero transfer is just a lookup table. The point of this principle is that you want as little transfer as possible while still managing a pivotal act. I used a theorem-proving AI as an example where it’s really easy to see what would count as unnecessary transfer. But even with something whose pivotal act would require a lot more transfer than a theorem prover (say, a nanosystem-builder AI), you’d still want to avoid transfer to domains such as deceiving humans or training other AIs.