Lots of people’s work:
Paul’s work (ELK more than RLHF, though it was useful to see what happens when you throw RL at LLMs, in a way that’s kind of similar to how I do get some value out of Chris’s work)
Eliezer’s work
Nate’s work
Holden’s writing on Cold Takes
Ajeya’s work
Wentworth’s work
The debate stuff
Redwood’s work
Bostrom’s work
Evan’s work
Scott and Abram’s work
There is of course still huge variance in how relevant these different people’s work is and how directly it goes for the throat, but all of it seems more relevant to AI Alignment/AI-not-kill-everyonism than Chris’s work (which, again, I found interesting, but not, like, super interesting).
Do you mean Evan Hubinger, Evan R. Murphy, or a different Evan? (I would be surprised and humbled if it was me, though my priors on that are low.)
Hubinger