This is the LessWrong account of Anonymous Omega (Forum profile). We are writing a series of posts critiquing prominent AI safety labs (Forum sequence here).
Critiques of prominent AI safety organizations: Introduction
(cross-posted from EAF)
We appreciate you sharing your impression of the post. It’s definitely valuable for us to understand how the post was received, and we’ll be reflecting on it for future write-ups.
1) We agree it’s worth taking into account aspects of an organization other than their output. Part of our skepticism towards Conjecture – and we should have made this more explicit in our original post (and will be updating it) – is the limited research track record of their staff, including their leadership. By contrast, even if we accept for the sake of argument that ARC has produced limited output, Paul Christiano has a clear track record of producing useful conceptual insights (e.g. Iterated Distillation and Amplification) as well as practical advances (e.g. Deep RL From Human Preferences) prior to starting work at ARC. We’re not aware of any equally significant advances from Connor or other key staff members at Conjecture; we’d be interested to hear if you have examples of their pre-Conjecture output you find impressive.
We’re not particularly impressed by Conjecture’s process, although it’s possible we’d change our mind if we knew more about it. Maintaining high velocity in research is certainly a useful component, but hardly sufficient. The Builder/Breaker method proposed by ARC feels closer to a complete methodology. But this doesn’t feel like the crux for us: if Conjecture copied ARC’s process entirely, we’d still be much more excited about ARC (per-capita). Research productivity is a product of a large number of factors, and explicit process is an important but far from decisive one.
In terms of the explicit comparison with ARC, we would like to note that ARC Theory’s team size is an order of magnitude smaller than Conjecture’s. Based on ARC’s recent hiring post, our understanding is that the theory team consists of just three individuals: Paul Christiano, Mark Xu and Jacob Hilton. If ARC had a team ten times larger and had spent close to $10 million, then we would indeed be disappointed if there were not more concrete wins.
2) Thanks for the concrete examples; this really helps tease apart our disagreement.
We are overall glad that the Simulators post was written. Our view is that it could have been much stronger had it been clearer which claims were empirically supported versus hypotheses. Continuing the comparison with ARC, we found ELK to be substantially clearer and to offer a deeper insight. Admittedly, ELK is one of the outputs people in the TAIS community are most excited by, so this is a high bar.
The stuff on SVDs and sparse coding [...] was a valuable contribution. I’d still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.
This sounds similar to our internal evaluation. We’re a bit confused about why “3 people in two weeks” is the relevant reference class. We’d argue the costs of Conjecture’s “misses” need to be accounted for, not just their “hits”. Redwood’s team size and budget are comparable to Conjecture’s, so if you think that causal scrubbing is more impressive than Conjecture’s other outputs, then it sounds like you agree with us that Redwood was more impressive than Conjecture (unless you think the Simulators post is head and shoulders above Redwood’s other output)?
Thanks for sharing the data point that this influenced independent researchers. That’s useful to know, and updates us positively. Are you excited by those independent researchers’ new directions? Is there any output from those researchers you’d suggest we review?
3) We remain confident in our sources regarding Conjecture’s discussions with VCs, although it’s certainly conceivable that Conjecture was more open with some VCs than others. To clarify, we are not claiming that Connor or others at Conjecture did not mention anything about their alignment plans or interest in x-risk to VCs (indeed, this would be a barely tenable position for them given their public discussion of these plans), simply that their pitch gave the impression that Conjecture was primarily focused on developing products. It’s reasonable for you to be skeptical of this if your sources at Conjecture disagree; we would be interested to know how close to the negotiations those staff were, although we understand this may not be something you can share.
4) We think your point is reasonable. We plan to reflect on this recommendation and will reply here when we have an update.
5) This certainly depends on what “general industry” refers to: a research engineer at Conjecture might well be better for ML skill-building than, say, being a software engineer at Walmart. But we would expect ML teams at top tech companies, or working with relevant professors, to be significantly better for skill-building. Generally we expect quality of mentorship to be one of the most important components of individuals developing as researchers and engineers. The Conjecture team is stretched thin as a result of rapid scaling, and had few experienced researchers or engineers on staff in the first place. By contrast, ML teams at top tech companies will typically have a much higher fraction of senior researchers and engineers, and professors at leading universities comprise some of the best researchers in the field. We’d be curious to hear your case for Conjecture as skill building; without that it’s hard to identify where our main disagreement lies.
(cross-posted from the EA Forum)
Regarding your specific concerns about our recommendations:
1) We address this point in our response to Marius (5th paragraph).
2) As we note in the relevant section: “We think there is a reasonable risk that Connor and Conjecture’s outreach to policymakers and media is alarmist and may decrease the credibility of x-risk.” This kind of relationship-building is unilateralist when it can decrease goodwill amongst policymakers.
3) To be clear, we do not expect Conjecture to have the same level of “organizational responsibility” or “organizational competence” (we aren’t sure what you mean by those phrases and don’t use them ourselves) as OpenAI or Anthropic. Our recommendation was for Conjecture to have a robust corporate governance structure. For example, they could change their corporate charter to implement a “springing governance” structure such that voting equity (but not political equity) shifts to an independent board once they cross a certain valuation threshold. As we note in another reply, Conjecture’s infohazard policy has no legal force, and is therefore not as strong as either OpenAI’s or Anthropic’s corporate governance models. As we’ve noted already, we have concerns about both OpenAI and Anthropic despite their having these models in place; Conjecture doesn’t even have those, which makes us more concerned.
Hi Erik, thanks for your points. We meant to say “at the same level of expertise as alignment leaders and researchers at other organizations such as...”. This was a typo on our part.
(crossposted from the EA Forum)
We appreciate your detailed reply outlining your concerns with the post.
Our understanding is that your key concern is that we are judging Conjecture based on their current output, whereas since they are pursuing a hits-based strategy we should expect in the median case for them to not have impressive output. In general, we are excited by hits-based approaches, but we echo Rohin’s point: how are we meant to evaluate organizations if not by their output? It seems healthy to give promising researchers sufficient runway to explore, but $10 million and a team of twenty seems on the higher end of what we would want to see supported purely on the basis of speculation. What would you suggest as the threshold where we should start to expect to see results from organizations?
We are unsure where else you disagree with our evaluation of their output. If we understand correctly, you agree that their existing output has not been that impressive, but think that it is positive they were willing to share preliminary findings and that we have too high a bar for evaluating such output. We’ve generally not found their preliminary findings to significantly update our views, whereas we would for example be excited by rigorous negative results that save future researchers from going down dead-ends. However, if you’ve found engaging with their output to be useful to your research then we’d certainly take that as a positive update.
Your second key concern is that we provide limited evidence for our claims regarding the VCs investing in Conjecture. Unfortunately for confidentiality reasons we are limited in what information we can disclose: it’s reasonable if you wish to consequently discount this view. As Rohin said, it is normal for VCs to be profit-seeking. We do not mean to imply these VCs are unusually bad for VCs, just that their primary focus will be the profitability of Conjecture, not safety impact. For example, Nat Friedman has expressed skepticism of safety (e.g. this Tweet) and is a strong open-source advocate, which seems at odds with Conjecture’s info-hazard policy.
We have heard from multiple sources that Conjecture has pitched VCs on a significantly more product-focused vision than they are pitching EAs. These sources have either spoken directly to VCs, or have spoken to Conjecture leadership who were part of the negotiations with VCs. Given this, we are fairly confident on the point that Conjecture is representing themselves differently to different groups.
We believe your third key concern is that our recommendations are over-confident. We agree there is some uncertainty, but think it is important to make actionable recommendations, and based on the information we have, our sincerely held belief is that most individuals should not work at Conjecture. We would certainly encourage individuals to consider alternative perspectives (including those expressed in this comment) and to ultimately make up their own mind rather than deferring, especially to an anonymous group of individuals!
Separately, we may consider the opportunity cost of working at Conjecture to be higher than you do. In particular, we’d generally evaluate skill-building routes fairly highly: for example, being a research assistant or PhD student in academia, or working in an ML engineering position in an applied team at a major tech company. These are generally close to capabilities-neutral, and can make individuals vastly more productive. Given the limited information on CogEm, it’s hard to assess whether it will or won’t work, but we think there’s ample evidence that there are better places to develop skills than Conjecture.
We wholeheartedly agree that it is important to maintain high epistemic standards during the critique. We have tried hard to differentiate between well-established facts, our observations from sources, and our opinions formed from those. For example, the About Conjecture section focuses on facts; the Criticisms and Suggestions section includes our observations and opinions; and the Our Views on Conjecture section is more strongly focused on our opinions. We’d welcome feedback on any areas where you feel we over-claimed.
Thanks for commenting and sharing your reactions Mishka.
Some quick notes on what you’ve shared:
Although one has to note that their https://www.conjecture.dev/a-standing-offer-for-public-discussions-on-ai/ is returning a 404 at the moment. Is that offer still standing?
In their response to us they told us this offer was still standing.
A lot of upvotes on such a post without substantial comments seems… unfair?
As of the time of your comment, we believe there were about 8 votes and 30 karma, and the post had been up a few hours. We are not sure what typical voting frequency is on LW (i.e. whether this is higher or lower than average), but if it is higher, here are some hypotheses (we’d love to hear from folks who upvoted without commenting):
Some people are supportive of criticism in general, and may have upvoted to support more critical discussion (even though they may disagree with object level comments)
Some people who upvoted may already agree with the views of this post (e.g. some of the upvoters could be our reviewers)
Some people may have upvoted so this post gets more attention / discussion so they could see what others think of it
Some folks may have upvoted for now and might come back to the post to leave more substantive comments when they have time
I think what makes writing comments on posts like this one difficult is that the post is really structured and phrased in such a way as to make this a situation of personal conflict, internal to the relatively narrow AI safety community.
I have not downvoted the post, but I don’t like this aspect, I am not sure this is the right way to approach things...
If we understand correctly, you’re saying that because this post contains many claims, it seems suboptimal that people can only express agreement or disagreement through post-level voting rather than on specific claims.
We think this is a great point. We’d love to see an option for people to agree/disagree with specific claims on posts to provide a more nuanced understanding of where consensus lies. We think it’s very plausible that some of our points will end up being much more controversial than others. (If you wanted to add separate comments for specific claims that people could vote on, we’d love to see that and would be happy to add a note to the top-level post encouraging folks to do so.)
Our hope is that folks can comment with areas of disagreement to start a discussion on those points.
Hi TurnTrout, thanks for asking this question. We’re happy to clarify:
‘experts’: We do not consider Conjecture to be at the same level of expertise as [edit] alignment leaders and researchers at other organizations such as Redwood, ARC, researchers at academic labs like CHAI, and the alignment teams at Anthropic, OpenAI and DeepMind. This is primarily because we believe their research quality is low.
‘with stature in the AIS community’: Based on our impression (from conversations with many senior TAIS researchers at a range of organizations, including a handful who reviewed this post and didn’t disagree with this point) of the TAIS community, Conjecture is not considered a top alignment research organization within the community.
Critiques of prominent AI safety labs: Conjecture
Quick updates:
Our next critique (on Conjecture) will be published in 10 days.
The critique after that will be on Anthropic. If you’d like to be a reviewer, or have critiques you’d like to share, please message us or email anonymouseaomega@gmail.com.
If you’d like to help edit our posts (incl. copy-editing—basic grammar etc, but also tone & structure suggestions and fact-checking/steel-manning), please email us!
We’d like to improve the pace of our publishing and think this is an area where external perspectives could help us
Make sure our content & tone is neutral & fair
Save us time so we can focus more on research and data gathering
Omega.’s Shortform
We’ve crossposted the full text on LessWrong here: https://www.lesswrong.com/posts/SuZ6Guuos7CjfwRQb/critiques-of-prominent-ai-safety-labs-redwood-research
Note that we don’t criticize Connor specifically, but rather the lack of a senior technical expert on the team in general (including Connor). Our primary criticisms of Connor don’t have to do with his leadership skills (which we don’t comment on at any point in the post).