I’ve been thinking hard about what my next step should be, after having my job applications turned down again by various safety orgs and Anthropic.
Now it seems clear to me. I have a vision of how I expect an RSI (recursive self-improvement) process to start, using LLMs to mine testable hypotheses from existing published papers.
I should just put my money where my mouth is, and try to build the scaffolding for this. I can then share my attempts with someone at Anthropic. If I’m wrong, I will be wasting my time and savings. If I’m right, I might be substantially helping the world. Seems like a reasonable bet.
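To make the scaffolding idea slightly more concrete, here is a minimal, hypothetical sketch of just the first stage (extracting candidate hypotheses from abstracts), assuming the `anthropic` Python SDK; the model name, prompt wording, and JSON schema are placeholders rather than a settled design:

```python
# Minimal sketch: mine candidate testable hypotheses from paper abstracts with an LLM.
# Assumes the `anthropic` Python SDK and an ANTHROPIC_API_KEY in the environment;
# model name, prompt, and output schema are illustrative placeholders only.
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

PROMPT = (
    "From the following paper abstract, list up to three concrete, testable ML "
    "hypotheses it suggests but does not fully verify. Return a JSON list of "
    "objects with keys 'hypothesis' and 'proposed_experiment'.\n\nAbstract:\n{abstract}"
)

def mine_hypotheses(abstract: str) -> list[dict]:
    """Ask the model for structured hypothesis candidates from one abstract."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # placeholder model name
        max_tokens=1024,
        messages=[{"role": "user", "content": PROMPT.format(abstract=abstract)}],
    )
    try:
        return json.loads(response.content[0].text)
    except json.JSONDecodeError:
        return []  # skip abstracts where the model didn't return clean JSON

if __name__ == "__main__":
    abstracts = ["<paste an abstract here>"]
    for a in abstracts:
        for h in mine_hypotheses(a):
            print(h["hypothesis"], "->", h["proposed_experiment"])
```

The interesting (and harder) part would be the later stages: filtering the candidate hypotheses and actually running cheap automated experiments against them, which this sketch doesn’t attempt.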
Alternatively, collaborating/sharing with e.g. METR or the UK AISI auto-ML evals teams might be interesting. Maybe even Palisade or similar orgs from a ‘scary demo’ perspective? @jacquesthibs might also be interested. I might also get to work on this or something related, depending on how some applications go.
I also expect Sakana, Jeff Clune’s group, and parts of the open-source ML community to push in this direction, but in some of these cases I’m more uncertain about the differential-acceleration tradeoffs.
Related: https://www.lesswrong.com/posts/fdCaCDfstHxyPmB9h/vladimir_nesov-s-shortform?commentId=2ZRSnZEQDbWzsZA3M
https://www.lesswrong.com/posts/MEBcfgjPN2WZ84rFL/o-o-s-shortform?commentId=QDEvi8vQkbTANCw2k