Do you mean as a checkpoint/breaker? Or more in the sense of RLHF?
The problem with either is that the limiting factor is human attention. You can’t have your best and brightest focusing full-time on what the AI is outputting, because quality attention is a scarce resource. So you do some combination of the following:
- slow down the AI to a level that is manageable (which will greatly limit its usefulness),
- discard ideas that are too strange (which will greatly limit its usefulness),
- limit its intelligence (which will greatly limit its usefulness),
- don’t check so thoroughly (which will make it more dangerous),
- use less clever checkers (which will greatly limit its usefulness and/or make it more dangerous), or
- check it reeeaaaaaaallllly carefully during testing and then hope for the best (which is reeeeaaaallly dangerous).
You also need to be sure that it can’t outsmart the humans in the loop, which pretty much comes back to boxing it in.
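The attention bottleneck above can be made concrete with a toy back-of-the-envelope model. Everything here (the function name, the numbers, the linear review-capacity assumption) is my own illustration, not anything from the comment; it just shows how the listed knobs trade coverage against usefulness:

```python
# Toy model (illustrative assumptions only): an AI emits some number of
# outputs per day, and each human reviewer can carefully check a fixed
# number of them. Coverage is the fraction of sampled outputs that
# actually receive a careful review.

def oversight_coverage(outputs_per_day: float,
                       reviewers: int,
                       reviews_per_reviewer_per_day: float,
                       sample_fraction: float = 1.0) -> float:
    """Fraction of sampled outputs that get a careful human review."""
    sampled = outputs_per_day * sample_fraction
    capacity = reviewers * reviews_per_reviewer_per_day
    if sampled == 0:
        return 1.0
    return min(1.0, capacity / sampled)

# 10,000 outputs/day, 5 reviewers doing 40 careful reviews each:
full = oversight_coverage(10_000, 5, 40)       # 2% coverage
# "Slow down the AI" until everything can be checked:
slowed = oversight_coverage(200, 5, 40)        # 100% coverage, 50x less output
# "Don't check so thoroughly": 400 shallow reviews per reviewer/day:
shallow = oversight_coverage(10_000, 5, 400)   # 20% coverage, weaker checks
```

The point of the sketch is that every lever that raises coverage does so by shrinking throughput or review depth, matching the trade-offs in the list.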