trevor comments on [$20K in Prizes] AI Safety Arguments Competition

trevor 27 May 2022 3:32 UTC
1 point
“If you have an untrustworthy general superintelligence generating [sentences] meant to [prove something], then I would not only expect the superintelligence to be [smart enough] to fool humans in the sense of arguing for things that were [actually lies]… I’d expect the superintelligence to be able to covertly hack the human [mind] in ways that I wouldn’t understand, even after having been told what happened[, because a superintelligence is, by definition, at least as much smarter than humans as humans are than chimpanzees]. So you must have some belief about the superintelligence being aligned before you dared to look at [any sentences it generates].”
EY
- trevor 27 May 2022 3:34 UTC
  1 point
  Techxecutives