jacob_cannell comments on Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment

jacob_cannell 14 Dec 2021 17:53 UTC
2 points
Maybe you mean something different by “marginally better than humans”?
No, I meant “merely as aligned as a human”, which is why I used “approximately/weakly” aligned: the system that mostly aligns humans to humans is imperfect, and not what I would have assumed you meant by a full Problem #2 type solution.
I’m taking for granted that AGI won’t be anywhere near as aligned as a human until long after either the world has been destroyed, or a pivotal act has occurred.
I think this is a purely Problem #2 sort of research direction (‘we have subjective centuries to really nail down the full alignment problem’),
Alright, so now I’m guessing the crux is that you believe the DL-based, reverse-engineered human empathy/altruism type of solution I was alluding to (let’s just call that DLA) may take subjective centuries, which suggests that you believe:
- That DLA is significantly more difficult than DL AGI in general
- That uploading is likewise significantly more difficult
or perhaps
- That DLA isn’t necessarily super hard, but is irrelevant because non-DL AGI (for which DLA isn’t effective) comes first
Is any of that right?
- Rob Bensinger 16 Dec 2021 7:02 UTC
  2 points
  Parent
  Sounds right, yeah!