To be honest, the fact that Eliezer is being his blunt unfiltered self is why I’d like to go to him first if he offered to evaluate my impact plan re AI. Because he’s so obviously not optimising for professionalism, impressiveness, status, etc., he’s deconfounding his signal, and I’m much better able to evaluate what he’s optimising for.[1] That’s why I’m much more confident that he’s actually just optimising for roughly the thing I’m also optimising for. I don’t trust anyone who isn’t optimising purely to be able to look at my plan and think “oh ok, despite being a nobody this guy has some good ideas”, even if that were true.
And then there’s the Graham’s Design Paradox thing. I think I’m unusually good at optimising purely, and I don’t think people who aren’t around my level or above would be able to recognise that. Obviously, Eliezer isn’t the only one who could, but I’ve read his output the most, so I’m more confident that he’s at least one of them.
Yes, perhaps a consequentialist would be instrumentally motivated to try to optimise more for these things, but the fact that Eliezer doesn’t do that (as much) just makes it easier to understand and evaluate him.