I think most humans achieving what they currently consider their goals would end up being catastrophic for humanity, even if they succeed. (For example, I think an eternal authoritarian regime is pretty catastrophic.)
I agree that an eternal authoritarian regime is pretty catastrophic.
I don’t think that a human in this scenario would be pursuing what they currently consider their goals. I think they would think more, learn more, and eventually settle on a different set of goals. (Maybe initially they pursue their current goals, but those change over time.) But it’s an open question to me whether the final set of goals they settle upon is actually reasonably aligned with “humanity’s goals”; it may or may not be. So it could be catastrophic to amplify a current human in this way, from the perspective of humanity. But it would not be catastrophic for the human that you amplified. (I think you disagree with the last statement, but maybe I’m wrong about that.)
I’d say that it wouldn’t appear catastrophic to the amplified human, but might be catastrophic for that human anyway (e.g. if their values-on-reflection actually look a lot like humanity’s values-on-reflection, but they fail to achieve their values-on-reflection).
Yeah, I think that’s where we disagree. I think that humans are likely to achieve their values-on-reflection; I just don’t know what a human’s “values-on-reflection” would actually be (e.g., it could be that they want an authoritarian regime with them in charge).
It’s also possible that we have different concepts of values-on-reflection. E.g., maybe you mean that I have found my values-on-reflection only if I’ve cleared out all epistemic pits somehow and then thought for a long time with the explicit goal of figuring out what I value, whereas I would use a looser criterion. (I’m not sure what exactly.)
Yeah, what you described indeed matches my notion of “values-on-reflection” pretty well. So for example, I think a religious person’s values-on-reflection should include valuing logical consistency and coherent logical arguments (because they do implicitly care about those in their everyday lives, even if they explicitly deny it). This means their values-on-reflection should include having true beliefs, and thus be atheistic. But I also wouldn’t generally trust religious people to update away from religion if they reflected a bunch.