Maybe you mean something different by “marginally better than humans”?
No I meant “merely as aligned as a human”. Which is why I used “approximately/weakly” aligned—as the system which mostly aligns humans to humans is imperfect and not what I would have assumed you meant as a full Problem #2 type solution.
I’m taking for granted that AGI won’t be anywhere near as aligned as a human until long after either the world has been destroyed, or a pivotal act has occurred.
I think this is a purely Problem #2 sort of research direction (‘we have subjective centuries to really nail down the full alignment problem’),
Alright so now I’m guessing the crux is that you believe the DL based reverse engineered human empathy/altruism type solution I was alluding to—let’s just call that DLA—may take subjective centuries, which thus suggests that you believe:
That DLA is significantly more difficult than DL AGI in general
That uploading is likewise significantly more difficult
or perhaps
DLA isn’t necessarily super hard, but irrelevant because non-DL AGI (for which DLA isn’t effective) comes first
No I meant “merely as aligned as a human”. Which is why I used “approximately/weakly” aligned—as the system which mostly aligns humans to humans is imperfect and not what I would have assumed you meant as a full Problem #2 type solution.
Alright so now I’m guessing the crux is that you believe the DL based reverse engineered human empathy/altruism type solution I was alluding to—let’s just call that DLA—may take subjective centuries, which thus suggests that you believe:
That DLA is significantly more difficult than DL AGI in general
That uploading is likewise significantly more difficult
or perhaps
DLA isn’t necessarily super hard, but irrelevant because non-DL AGI (for which DLA isn’t effective) comes first
Is any of that right?
Sounds right, yeah!