Fwiw, in 2016 I would have put something like 20% probability on what became known as ‘the scaling hypothesis’. I still had past-2035 median timelines, though.
What did you mean exactly in 2016 by the scaling hypothesis?
Having past-2035 timelines and believing in the pure scaling maximalist hypothesis (which, fwiw, I don’t believe in, for reasons I have explained elsewhere) are in direct conflict, so I’d be curious if you could detail your beliefs back then more exactly.
What did you mean exactly in 2016 by the scaling hypothesis?
Something like ‘we could have AGI just by scaling up deep learning / deep RL, without any need for major algorithmic breakthroughs’.
Having past-2035 timelines and believing in the pure scaling maximalist hypothesis (which, fwiw, I don’t believe in, for reasons I have explained elsewhere) are in direct conflict, so I’d be curious if you could detail your beliefs back then more exactly.
I’m not sure this is strictly true, though I agree with the ‘vibe’. I think there were probably a couple of things in play:
I still only had something like 20% on scaling, and I expected much more compute would likely be needed, especially in that scenario, but also more broadly (e.g. maybe something like the median in ‘bioanchors’, around 35 OOMs of pretraining-equivalent compute, if I don’t misremember; though I definitely hadn’t thought very explicitly about how many OOMs of compute at that time). So I thought it would probably take decades to get to the required amount of compute (see the rough calculation after these points).
I very likely hadn’t thought hard and long enough to integrate my various beliefs and make them coherent.
Probably at least partly because there seemed to be a lot of social pressure from academic peers against even something like ‘20% on scaling’, and even against taking AGI and AGI safety seriously at all. This likely made it harder to ‘viscerally feel’ what some of my beliefs might imply, and especially that it might happen very soon (which also contributed to delaying when I’d go full-time into working on AI safety, along with thinking I’d have more time to prepare before going all in).
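To make the ‘decades’ intuition in the first point concrete, here is a minimal back-of-envelope sketch; all the numbers in it (the ~35 OOM requirement, a ~1e24 FLOP largest training run as a starting point, and ~0.5 OOM/year growth in the largest runs) are illustrative assumptions rather than figures I’d stand behind precisely:

```python
import math

# Illustrative assumptions only:
required_flop = 1e35          # assumed requirement, ~35 OOMs of pretraining-equivalent compute
current_flop = 1e24           # assumed size of the largest training run at the starting point
growth_ooms_per_year = 0.5    # assumed growth rate of the largest training runs, in OOMs/year

ooms_remaining = math.log10(required_flop / current_flop)
years_needed = ooms_remaining / growth_ooms_per_year
print(f"~{ooms_remaining:.0f} OOMs remaining -> ~{years_needed:.0f} years "
      f"at {growth_ooms_per_year} OOM/year")
# Under these assumptions: ~11 OOMs remaining -> ~22 years, i.e. 'decades'.
```

Faster assumed growth or a lower assumed requirement shortens this considerably, which is part of why the conclusion was so sensitive to beliefs I hadn’t pinned down at the time.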