Yeah, to be more clear, it’s not so much that she got that wrong. It’s that she didn’t accept the reasoning behind that number enough to really believe it. She added a discount factor based on fallacious reasoning around “if it were that easy, it’d be here already”.
Some hypotheses predict that we should have already been able to afford to train a transformative model with reasonable probability; I think this is unlikely, so I execute an update against low levels of FLOP...
In my opinion, the correct takeaway from an estimate showing that “a low number of FLOPs should be enough if the right algorithm is used,” combined with “we don’t seem close, and are using more FLOPs than that already,” is “therefore, our algorithms must be pretty crap and there must be lots of room for algorithmic improvement.” In other words, we are in a compute hardware overhang, and algorithmic improvement will potentially result in huge capability gains, speed gains, and parallel inference instances. Of course, if our algorithms are really that far from optimal, why should we not expect to continue to be bottlenecked by algorithms? The conclusion I come to is that if we can compensate for inefficient algorithms with huge over-expenditure of compute, then we can get a just-barely-good-enough ML research assistant who can speed up algorithmic progress. So we should expect training costs to drop rapidly after automating ML R&D.
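To make that last step concrete, here's a toy sketch of the overhang-plus-automation argument. Every number in it (the ideal training FLOP, the size of the algorithmic-inefficiency gap, the rates of algorithmic progress, the year automation kicks in) is a placeholder chosen for illustration, not a figure from Ajeya's report or from this thread:

```python
# Toy sketch of the "compute overhang" argument. Every number is an illustrative
# placeholder, not a figure from Ajeya Cotra's report or from this thread.

ideal_training_flop = 1e24   # assumed: FLOP that would suffice with near-optimal algorithms
inefficiency_ooms = 6.0      # assumed: how many orders of magnitude worse current algorithms are
pre_rate = 0.5               # assumed: OOMs of algorithmic efficiency gained per year, pre-automation
post_rate = 2.0              # assumed: OOMs per year once ML R&D is largely automated
automation_year = 4          # assumed: year a just-barely-good-enough ML research assistant arrives

remaining_gap = inefficiency_ooms
for year in range(9):
    effective_cost = ideal_training_flop * 10 ** remaining_gap
    print(f"year {year}: effective training cost ~ {effective_cost:.1e} FLOP")
    rate = pre_rate if year < automation_year else post_rate
    remaining_gap = max(0.0, remaining_gap - rate)
```

The point is just the shape of the curve: the effective cost falls slowly until the assistant arrives, then collapses by several orders of magnitude over a couple of years.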
Then she also blended the human brain compute estimate with three other estimates which were multiple orders of magnitude larger. These other estimates were based on, in my opinion, faulty premises.
I first lay out four hypotheses about 2020 training computation requirements, each of which anchors on a key quantity estimated from biology: total computation done over evolution, total computation done over a human lifetime, the computational power of the human brain, and the amount of information in the human genome...
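To illustrate what that blending does, here's a toy version with the four anchors treated as log-normal distributions over training FLOP and mixed with equal weights. The medians, spreads, and weights below are made up for the sketch (they are not the report's values); the point is just that the larger anchors drag the blended median up relative to the lowest one:

```python
# Toy "blend" of biological anchors into one training-FLOP estimate.
# Medians, spreads, and weights are placeholders, not the report's numbers.
import numpy as np

rng = np.random.default_rng(0)

# anchor name: (median log10 FLOP, spread in OOMs, mixture weight) -- all assumed
anchors = {
    "lifetime":           (24.0, 1.5, 0.25),
    "brain-FLOP-derived": (30.0, 2.0, 0.25),
    "genome":             (33.0, 2.0, 0.25),
    "evolution":          (41.0, 1.5, 0.25),
}

samples = []
for median, spread, weight in anchors.values():
    n = int(weight * 100_000)                       # samples proportional to weight
    samples.append(rng.normal(median, spread, n))   # sample in log10(FLOP) space
log10_flop = np.concatenate(samples)

print(f"lowest anchor median:  1e{min(m for m, _, _ in anchors.values()):.0f} FLOP")
print(f"blended median:        1e{np.median(log10_flop):.1f} FLOP")
```

With anything like equal weights, the blended median lands several orders of magnitude above the lowest anchor, which is the mechanical version of the complaint above.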
Since the time of her report she has had to repeatedly revise her timelines down. Now she basically agrees with me. Props to her for correcting.
It’s that she didn’t accept the reasoning behind that number enough to really believe it. She added a discount factor based on fallacious reasoning around “if it were that easy, it’d be here already”.
Just to clarify: There was no such discount factor that changed the median estimate of “human brain compute”. Instead, this discount factor was applied to go from “human brain compute estimate” to “human-brain-compute-informed estimate of the compute-cost of training TAI with current algorithms”, adjusting for how our current algorithms seem to be worse than those used to run the human brain. (As you mention and agree with, although I infer that you expect algorithmic progress to be faster than Ajeya did at the time.) The most relevant section is here.
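Schematically, the two steps look like the toy calculation below. The specific numbers (brain FLOP/s, the effective training duration, the size of the algorithmic penalty) are placeholders for illustration, not the report's actual medians:

```python
# Toy version of the two-step structure described above: (1) an estimate of human
# brain compute, (2) a separate adjustment for current algorithms being worse than
# the brain's, giving a training-compute estimate. All numbers are placeholders.

brain_flop_per_s = 1e15          # step 1: assumed median for human brain compute

seconds_of_experience = 1e9      # assumed effective "training" duration (~30 years)
algorithmic_penalty_ooms = 3     # assumed: current algorithms ~1,000x less compute-efficient

brain_anchor_flop = brain_flop_per_s * seconds_of_experience                  # unadjusted anchor
training_flop_estimate = brain_anchor_flop * 10 ** algorithmic_penalty_ooms   # step 2: adjusted

print(f"unadjusted brain-compute anchor: ~{brain_anchor_flop:.0e} FLOP")
print(f"with algorithmic adjustment:     ~{training_flop_estimate:.0e} FLOP")
```

The clarification is that the adjustment lives in step 2, not in the brain-compute estimate itself.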
Ok, gotcha.