Nitpick: the article seems to suggest that if RSI is possible, then strong takeoff is inevitable, and boxing would not work—but isn’t boxing a potential approach for slowing down the RSI (e.g. each iteration of RSI is only executed once unboxed by a human—at least until/unless boxing fails), and therefore might still work?
Yes, this is the few-shot alignment world described in the post. I agree that if boxing could completely halt RSI, that would be fantastic in principle, but with each iteration of RSI there is some probability that the box will fail, at which point we would get unbounded RSI. This means we would effectively get only a few ‘shots’ to align our boxed AGI before we die.