We are putting together Orthogonal, a non-profit alignment research organization focused on agent foundations, based in Europe.
We are pursuing the formal alignment flavor of agent foundations, aiming to solve alignment in a manner that scales to superintelligence and robustly overcomes AI risk. If we can afford to, we also intend to hire agent foundations researchers whose work, while not directly aimed at such an agenda, is likely to be instrumental to it, such as finding useful “true names”.
Within this framework, our foremost agenda for the moment is QACI. We expect to make significant progress on ambitious alignment within short timelines (months to years) and to produce a good amount of dignity in the face of high existential risk.
Our goal is to produce the kind of object-level research that cyborgism would want to accelerate. When other AI organizations attempt to “buy time” by restraining their AI systems, we intend to be the research that this time is being bought for.
We intend to exercise significant caution with regard to AI capability exfohazards: Conjecture’s policy document offers a sensible precedent for handling matters of internal sharing, and locked posts are a reasonable default for publishing our content to the outside. Furthermore, we would like to communicate about research and strategy with MIRI, whose model of AI risk we largely share and whom we perceive to have the most experience with non-independent agent foundations research.
Including myself (Tamsin Leake, founder of Orthogonal and LTFF-funded AI alignment researcher), we have several promising researchers intending to work full-time, and several more who are considering that option. I expect that we will find more researchers excited to join our efforts in solving ambitious alignment.
If you are interested in such a position, we encourage you to get acquainted with our research agenda. Provided we get adequate funding, we hope to run a fellowship where people who have demonstrated interest in this research can work alongside us to test their fit as fellow researchers at Orthogonal.
We might also be interested in people who could help us with engineering, management, and operations. To make all of this happen, we are also looking for funding. For these matters or any other inquiries, you can get in touch with us at contact@orxl.org.
It saddens me a bit to see so little LW commenter engagement (both with this post and with Orthogonal’s research agenda in general). I think this is because these are the sort of posts where it feels like you’re supposed to engage with the object-level content in a sophisticated way, and the barrier to entry for that is quite high. Without having much to add about the object-level content of the research agenda, I would like to say: this flavor of research is something the world desperately needs more of. Carado’s earlier posts and a few conversations I’ve had left me with a positive impression, and I believe those posts are a sufficient work product to prove they’re worth funding.
I initially dismissed Orthogonal due to a guess that their worldview was too similar to MIRI’s, and that they would give up or reach a dead end for reasons similar to why MIRI hasn’t made much progress.
Then the gears to ascension prodded me to take a closer look.
Now that I’ve read their more important posts, I’m more confused.
I still think Orthogonal has a pretty low chance of making a difference, but there’s enough that’s unique about their ideas to be worth pursuing. I’ve donated $15k to Orthogonal.
Hell yes. It feels great to hear someone say what I’ve been implicitly thinking.