Vladimir_Nesov comments on A Data limited future

Vladimir_Nesov 6 Aug 2022 18:00 UTC
LW: 3 AF: 2
An upload (an exact imitation of a human) is the most straightforward way of securing time for alignment research, except it’s not plausible in our world for uploads to be developed before AGIs. The plausible similar thing is more capable language/multimodal models, steeped in human culture, where alignment guarantees at least a priori look very dubious. And an upload probably needs to be value-laden to be efficient enough to give an advantage, while remaining exact in morally relevant ways, though there’s a glimmer of hope that generalization can capture this without a need to explicitly set up a fixpoint through extrapolated values. Doing the same with Tool AIs or something similar is only slightly less speculative than directly developing aligned AGIs without that miracle, so the advantage of an upload is massive.
What links here?
- Vladimir_Nesov's comment on the Insulated Goal-Program idea by Tamsin Leake (13 Aug 2022 14:04 UTC; 6 points)
- Donald Hobson 9 Aug 2022 19:13 UTC
  LW: 2 AF: 1
  Assuming, of course, that the first upload (or sufficiently humanlike model) is developed by someone actually trying to do this.