The arguments Eliezer put forward do not clearly apply to Deep Learning
Yes, but
and therefore we don’t have any positive reason to believe that alignment will be an issue in ML
does not follow.
The arguments Eliezer put forward never made sense in the first place
Yes (for some of the arguments), but again:
and therefore we do not have to worry about the alignment problem
does not follow.
The arguments Eliezer put forward captured a bunch of important things about the alignment problem, but due to some differences in how we get to build ML systems, we actually know of a promising route to aligning the systems
Yes, such as the various more neuroscience/DL-inspired approaches (Byrnes, simboxes, shard theory, etc.), or others a bit harder to categorize, like davidad's approach or external empowerment.
But I should also point out that RLHF may work better for longer than most here anticipate, simply because if you distill the (curated) thoughts of mostly aligned humans, you may just get mostly aligned agents.
Thanks! I’m not sure if it’s worth us having more back-and-forth, so I’ll say my general feelings right now:
I think it’s of course healthy and fine to have a bunch of major disagreements with Eliezer.
I would avoid building “hate” toward him or building resentment, as those things are generally not healthy for people to cultivate in themselves toward people who have not done evil things, and I think doing so will probably cause them to make worse choices by their own judgment.
By default, do not count on anyone doing the hard work of making another forum for serious discussion of this subject, especially one that’s so open to harsh criticism and has high standards for comments (I know LessWrong could be better in lots of ways, but c’mon, have you seen Reddit/Facebook/Twitter?).
There is definitely a bunch of space on this forum for people like yourself to develop different research proposals, find thoughtful collaborators, and get input from smart people who care about the problem you’re trying to solve (I think Shard Theory is such an example here).
I wish you every luck in doing so, and I’m happy to hear if there are ways to further support you in trying to solve the alignment problem (of course I have limits on my time/resources and on how much I can help out different people).
I would avoid building “hate” toward him or building resentment, as those things are generally not healthy for people to cultivate in themselves toward people who have not done evil things, and I think doing so will probably cause them to make worse choices by their own judgment.
Of course; my use of the word “hate” here is merely reporting impressions from other ML/DL forums and the schism between the communities.
I obviously generally agree with EY on many things, and to the extent I critique his positions here, it’s simply a straightforward result of some people here assuming their correctness a priori.
Okay! Good to know we concur on this. Was a bit worried, so thought I’d mention it.