I’m glad to hear that! I often don’t hear much response to my essays so it’s good to know you’ve read some of them :)
I don’t have a mistakes page but last year I wrote a one-off post of things I’ve changed my mind on.
I have a few potential criticisms of this paper. I think my criticisms are probably wrong and the paper’s conclusion is right, but I’ll just put them out there:
Nearly half the tasks in the benchmark take 1 to 30 seconds (the ones from the SWAA set). According to the fitted task time <> P(success) curve, most tested LLMs should be able to complete those with high probability, so they don’t provide much independent signal.
However, I expect the task time <> P(success) curve would look largely the same if you excluded the SWAA tasks (the sketch after these criticisms shows the check I have in mind).
SWAA tasks take humans 1 to 30 seconds and HCAST tasks take 1 minute to 30 hours, so the two sets cover non-overlapping time ranges. If HCAST tasks are harder than SWAA tasks for LLMs, then a regression will indicate that LLMs are getting better at longer tasks when really they’re just getting better at HCAST tasks.
I think this criticism is wrong—if it were true, the across-dataset correlation between time and LLM-difficulty should be higher than the within-dataset correlation, but from eyeballing Figure 4 (page 10), it looks like it’s not higher (or at least not much).
The benchmark tasks could have a bias where longer tasks are more difficult in general (not just because they’re longer). I haven’t looked through all the HCAST tasks (in fact I couldn’t find where they were listed) but Figure 16 on page 29 shows that humans had lower success rates on longer tasks. As example tasks, the paper gives, among others, “Research simple factual information from Wikipedia” = 1 minute and “Write a Python script to transform JSON data” = 56 minutes (page 6). I think a more comparable 56-minute task would be something like “find some factual information that’s buried in a long article”, which I believe even a GPT-3-era LLM would perform well on.
I don’t know enough about the tasks to know whether this criticism is correct. My uneducated guess is that there’s a true positive relationship between task length and (non-length-related) task difficulty, but that if you adjusted for this, you’d still see an exponential trend in task time <> P(success), and the curve would just be dampened a bit (the sketch below includes a crude version of this adjustment).
The authors also suspect that longer tasks might be more difficult, and “[i]f this is the case, we may be underestimating the pace of model improvement.” I think it would mean we’re underestimating the pace of improvement on hard tasks, while simultaneously overestimating the pace of improvement on long tasks.
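To make the first and third criticisms concrete, here’s a rough sketch of the checks I have in mind. The file name and column names are hypothetical (I don’t have the paper’s run-level data); the idea is just to refit the time <> P(success) curve with the SWAA tasks dropped, and again with each task’s human success rate added as a crude proxy for non-length-related difficulty.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical layout: one row per (model, task) attempt, with the task's
# human completion time in minutes, which suite it came from (SWAA/HCAST),
# the human success rate on that task, and whether the model succeeded.
runs = pd.read_csv("model_task_runs.csv")
runs["log_minutes"] = np.log(runs["human_minutes"])

def fit(df, covariates):
    X = np.column_stack([df[c] for c in covariates])
    return LogisticRegression().fit(X, df["success"])

full = fit(runs, ["log_minutes"])                                  # baseline curve
hcast_only = fit(runs[runs["suite"] == "HCAST"], ["log_minutes"])  # drop the 1-30s SWAA tasks
adjusted = fit(runs, ["log_minutes", "human_success_rate"])        # control for intrinsic difficulty

for name, m in [("all tasks", full), ("HCAST only", hcast_only), ("difficulty-adjusted", adjusted)]:
    print(name, "coefficient on log(time):", round(m.coef_[0][0], 3))
# If the log(time) coefficient stays clearly negative in all three fits, the
# exponential time trend probably isn't an artifact of SWAA-vs-HCAST
# differences or of longer tasks being intrinsically harder.
```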
Why do you think this narrows the distribution?
I can see an argument for why; tell me if this is what you’re thinking:
The biggest reason why the LLM paradigm might never reach AI takeoff is that LLMs can only complete short-term tasks, and can’t maintain coherence over longer time scales (e.g. if an LLM writes something long, it will often start contradicting itself). And intuitively it seems that scaling up LLMs hasn’t fixed this problem. However, this paper shows that LLMs have been getting better at longer-term tasks, so LLMs probably will scale to AGI.
A few miscellaneous thoughts:
I agree with Dagon that the most straightforward solution is simply to sell your equity as soon as it vests. If you don’t do anything else then I think at least you should do that—it’s a good idea just on the basis of diversification, not even considering conflicts of interest.
I think you should be willing to take quite a large loss to divest. In a blog post, I estimated that for an investor with normal-ish risk aversion, it’s worth paying ~4% per year to avoid the concentration risk of holding a single mega-cap stock (so you’re willing to pay more to get rid of stock that vests later). Then add the conflict-of-interest factor on top of that. How much COI matters depends on how much influence you have over Google’s AI policy and how much you think you’ll be swayed by monetary incentives. My guess is the COI factor matters less than 4% per year but that’s not based on anything concrete.
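For a sense of where a number like that comes from, here’s a minimal mean-variance sketch (this is not the calculation from my blog post; all the inputs are illustrative guesses):

```python
# Rough certainty-equivalent cost of holding a single mega-cap stock instead
# of a diversified index, using the standard mean-variance approximation
#   CE drag ~= 0.5 * gamma * (sigma_stock^2 - sigma_market^2),
# which assumes the stock and the index have similar expected returns.
gamma = 1.0          # relative risk aversion ("normal-ish")
sigma_stock = 0.35   # annualized volatility of one mega-cap stock (a guess)
sigma_market = 0.16  # annualized volatility of a broad index (a guess)
ce_drag = 0.5 * gamma * (sigma_stock**2 - sigma_market**2)
print(f"certainty-equivalent cost ~ {ce_drag:.1%} per year")  # ~4.8% with these inputs
```

The reason vesting date matters is that later-vesting stock locks in this drag for more years, which is why you should be willing to pay more to get rid of it.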
“[Google’s Insider Trading Policy] describes company-wide policies that address the risks of insider trading, such as a prohibition on any Google employee hedging Google stock”: I read this as saying you can’t hedge Google stock based on insider information, not that you can’t hedge it at all. But I don’t know what the law says about hedging stock in your employer.
This is the belief of basically everyone running a major AGI lab. Obviously all but one of them must be mistaken, but it’s natural that they would all share the same delusion.
I agree with this description and I don’t think this is sane behavior.
Actions speak louder than words, and their actions are far less sane than these words.
For example, if Demis regularly lies awake at night worrying about how the thing he’s building could kill everyone, why is he still putting so much more effort into building it than into making it safe?
I was familiar enough to recognize that it was an edit of something I had seen before, but not familiar enough to remember what the original was.
I’m really not convinced that public markets do reliably move in the predictable (downward) direction in response to “bad news” (wars, coups, pandemics, etc.).
Also, market movements are hard to detect. How much would Trump violating a court order decrease the total (time-discounted) future value of the US economy? Probably less than 5%? And what is the probability that he violates a court order? Maybe 40%? So the market should move <2%, and evidence about this potential event so far has come in slowly instead of at a single dramatic moment so this <2% drop could have been spread over multiple weeks.
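Spelling out that arithmetic (both numbers are just my guesses above):

$$ \mathbb{E}[\text{market move}] \approx P(\text{violation}) \times \mathbb{E}[\text{drop} \mid \text{violation}] \lesssim 0.40 \times 0.05 = 0.02 $$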
If I’m allowed to psychoanalyze funders rather than discussing anything at the object level, I’d speculate that funders like evals because:
If you funded the creation of an eval, you can point to a concrete thing you did. Compare to funding theoretical technical research, which has a high chance of producing no tangible outputs; or funding policy work, which has a high chance of not resulting in any policy change. (Streetlight Effect.)
AI companies like evals, and funders seem to like doing things AI companies like, for various reasons including (a) the thing you funded will get used (by the AI companies) and (b) you get to stay friends with the AI companies.
I don’t know what the regulatory plan is; I was just referring to this poll, which I didn’t read in full (I only read the title). Reading it now, it’s not so much a plan as a vision, and it’s not so much “Musk’s vision” as a viewpoint (that the poll claims is associated with Musk) in favor of regulating the risks of AI. Which is very different from JD Vance’s position; Vance’s position is closer to the one that does not poll well.
I guess I’m expressing doubt about the viability of wise or cautious AI strategies, given our new e/acc world order, in which everyone who can, is sprinting uninhibitedly towards superintelligence.
e/acc does not poll well and there is widespread popular support for regulating AI (see AIPI polls). If the current government favors minimal regulations, that’s evidence that an AI safety candidate is more likely to succeed, not less.
(Although I’m not sure that follows because I think the non-notkilleveryonism variety of AI safety is more popular. Also Musk’s regulatory plan is polling well and I’m not sure if it differs from e.g. Vance’s plan.)
If you publicly commit to something, taking down the written text does not constitute a de-commitment. Violating a prior commitment is unethical regardless of whether the text of the commitment is still on your website.
(Not that there’s any mechanism to hold Google to its commitments, or that these commitments ever meant anything—Google was always going to do whatever it wanted anyway.)
Claude 3.5 Sonnet is a mid-sized model that cost a few $10M’s to train
I don’t get this: if frontier(ish) models cost $10M–$100M, why is Nvidia’s projected revenue more like $1T–$10T? Is the market projecting 100,000x growth in spending on frontier models within the next few years? I would have guessed more like 100x–1000x growth, but at least one of my numbers must be wrong. (Or maybe they’re all wrong by ~1 OOM?)
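For reference, the 100,000x is just the ratio of those two ranges at their geometric midpoints:

$$ \frac{\sqrt{\$1\text{T} \times \$10\text{T}}}{\sqrt{\$10\text{M} \times \$100\text{M}}} \approx \frac{\$3.2\text{T}}{\$32\text{M}} = 10^5, $$

with the endpoints spanning $10^4$ ($\$1\text{T} / \$100\text{M}$) to $10^6$ ($\$10\text{T} / \$10\text{M}$).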
Retroactive If-Then Commitments
This is actually a crazy big effect size? Preventing ~10–50% of a cold for taking a few pills a day seems like a great deal to me.
Don’t push the frontier of capabilities. Obviously this is basically saying that Anthropic should stop making money and therefore stop existing. The more nuanced version is that for Anthropic to justify its existence, each push of the capabilities frontier should be earned by substantial progress on the other three points.
I think I have a stronger position on this than you do. I don’t think Anthropic should push the frontier of capabilities, even given the tradeoff it faces.
If their argument is “we know arms races are bad, but we have to accelerate arms races or else we can’t do alignment research,” they should be really really sure that they do, actually, have to do the bad thing to get the good thing. But I don’t think you can be that sure and I think the claim is actually less than 50% likely to be true.
I don’t take it for granted that Anthropic wouldn’t exist if it didn’t push the frontier. It could operate by intentionally lagging a bit behind other AI companies while still staying roughly competitive, and/or it could compete by investing harder in good UX. I suspect a (say) 25% worse model is not going to be much less profitable.
(This is a weaker argument but) If it does turn out that Anthropic really can’t exist without pushing the frontier and it has to close down, that’s probably a good thing. At the current level of investment in AI alignment research, I believe reducing arms race dynamics + reducing alignment research probably net decreases x-risk, and it would be better for this version of Anthropic not to exist. People at Anthropic probably disagree, but they should be very concerned that they have a strong personal incentive to disagree, and should be wary of their own bias. And they should be especially especially wary given that they hold the fate of humanity in their hands.
If lysine is your problem but you don’t want to eat beans, you can also buy lysine supplements.
I primarily use a weird ergonomic keyboard (the Kinesis Advantage 2) with custom key bindings. But my laptop keyboard has normal key bindings, so my “normal keyboard” muscle memory still works.
How does this strategy compare to shorting bonds? Both have the same payoff structure (they make money if the discount rate goes up) but it’s not clear to me which is a better deal. I suppose it depends on whether you expect Polymarket investors to have especially high demand for cash.