I share habryka’s concerns re: “unaligned with yourself”, but I think I was missing (or had forgotten) that part of the idea here was you’re using… an uploaded clone of yourself, so you’re at least more likely to be aligned with yourself, even if, when scaled up, you’re not aligned with anyone else.
Not sure if you were just being poetic, but FWIW I believe the idea (in HCH, for example) is to use an ML system trained to produce the same answers that a human would produce, which is not strictly speaking an upload (unless the only way to imitate is actually to simulate in detail, such that the ML system ends up growing an upload inside it?).
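To make the distinction concrete: the proposal is ordinary supervised imitation, not emulation of a brain. A toy sketch, with entirely hypothetical data and where “training” is reduced to memorising the most common human answer per question (a stand-in for fitting a real predictive model):

```python
# Toy sketch of the imitation idea: fit a model to reproduce
# human-provided (question, answer) pairs. All data here is made up;
# an HCH-style system would use a large learned model, not a lookup.
from collections import Counter

# Hypothetical dataset of answers a human actually gave.
human_answers = [
    ("capital of France?", "Paris"),
    ("capital of France?", "Paris"),
    ("2 + 2?", "4"),
]

def train_imitator(data):
    """Return a question -> answer map picking the most common human
    answer per question - a crude stand-in for supervised training."""
    by_question = {}
    for question, answer in data:
        by_question.setdefault(question, Counter())[answer] += 1
    return {q: counts.most_common(1)[0][0] for q, counts in by_question.items()}

imitator = train_imitator(human_answers)
print(imitator["capital of France?"])  # reproduces the human's answer
```

The point of the sketch is just that the system learns to predict the human’s outputs; nothing in it contains or simulates the human, which is why it isn’t strictly an upload.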
Is it “a human” or “you specifically”?
If it’s “a human”, I’m back to “humans are unfriendly by default” territory.
[Edit: But I had in fact also not been tracking that it’s not a strict upload; it’s trained on human actions. I think I recall reading that earlier but had forgotten. I did leave the “…” in my summary because I wasn’t quite sure whether upload was the right word, though. That all said, being merely trained on human actions, whether mine or someone else’s, I think makes it even more likely to be unfriendly than an upload.]
To get sufficient training data, it must surely be “a human” (in the generic, smushed-together, ‘modelling an ensemble of humans’ sense).
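A toy illustration of that “smushed together” point, again with made-up data: pooling answers from several people and imitating the majority yields a generic aggregate human, not any one particular person.

```python
# Hypothetical answers from three different people to the same question.
from collections import Counter

answers_by_person = {
    "alice": [("favourite colour?", "blue")],
    "bob":   [("favourite colour?", "green")],
    "carol": [("favourite colour?", "blue")],
}

# Pool everyone's answers together and take the majority, as a crude
# stand-in for a model trained on all of them at once.
pooled = Counter(a for answers in answers_by_person.values() for _, a in answers)
generic_answer = pooled.most_common(1)[0][0]
print(generic_answer)  # the pooled model overrides Bob's answer entirely
```

So the learned behaviour tracks the ensemble, which is why the “is it you specifically?” question matters: a model fit to pooled data is aligned with the aggregate, if anyone.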