2022-03-19: someone talked to me for many hours about the scaling hypothesis, and I've now updated to shorter timelines; I haven't thought about quantifying the update yet, but I can see a path to AGI by 2040 now
(as usual, conditional on understanding the question and the question making sense)
where human-level AGI means an AI better than any human living in 2020 at every task; biologically improved and machine-augmented humans don't count as AIs for the purpose of this question, but uploaded humans do
said AGI needs to be created by our civilisation
the probability density is the sum of my credences over different frequencies of future worlds containing AGIs
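(spelled out a bit, under my own loose interpretation; C(w) and f_w(t) are just labels I'm introducing here:)

```latex
% rough sketch of the mixture I have in mind
p(\text{AGI arrives in year } t) \approx \sum_{w} C(w)\, f_w(t)
% C(w): my credence in hypothesis w about how the future goes
% f_w(t): the frequency of future worlds under w in which AGI arrives around year t
```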
https://elicit.ought.org/builder/ELNjdTVj-
1% it already happened
52% it won’t happen (most likely because we’ll go extinct or stop being simulated)
26% after 2100
ETA: moved 2% from >2100 to <2050
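(a minimal sanity-check sketch in Python; the 21% "before 2100" figure comes from the inner-sim estimate quoted later in this post, and the exact bucketing is my own reconstruction:)

```python
# rough reconstruction of the buckets above (before the ETA shift);
# the 21% "before 2100" mass comes from the inner-sim estimate quoted later in the post
forecast = {
    "already happened": 0.01,
    "between now and 2100": 0.21,
    "after 2100": 0.26,
    "won't happen (extinction / simulation shutdown)": 0.52,
}

# the buckets are meant to partition the outcomes, so they should sum to 1
assert abs(sum(forecast.values()) - 1.0) < 1e-9

p_ever = 1.0 - forecast["won't happen (extinction / simulation shutdown)"]
print(f"P(AGI ever) = {p_ever:.2f}")  # 0.48
```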
Someone asked me:
partly just wide priors and I didn't update much, partly the Hansonian view, partly Drexler's view, partly it seems like a hard problem
Note to self; potentially to read:
https://aiimpacts.org/interviews-on-plausibility-of-ai-safety-by-default/
Your percentiles:
5th: 2040-10-01
25th: above 2100-01-01
50th: above 2100-01-01
75th: above 2100-01-01
95th: above 2100-01-01
XD
Update: 18% <2033, 18% 2033-2043, 18% 2043-2053, 18% 2050-2070, 28% 2070+ or won't happen
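(another rough sketch, reading the buckets above as a partition and copying the range labels as written, just to see where the 50th percentile lands:)

```python
# turn the 2022 update's buckets into a cumulative distribution and find
# which bucket contains the median; labels are copied as written above
buckets = [
    ("<2033", 0.18),
    ("2033-2043", 0.18),
    ("2043-2053", 0.18),
    ("2050-2070", 0.18),
    ("2070+ or won't happen", 0.28),
]

cumulative = 0.0
median_bucket = None
for label, p in buckets:
    cumulative += p
    print(f"P(AGI by end of {label}) = {cumulative:.2f}")
    if median_bucket is None and cumulative >= 0.5:
        median_bucket = label

print(f"the 50th percentile falls in the {median_bucket} bucket")  # 2043-2053
```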
see more details on my shortform: https://www.lesswrong.com/posts/DLepxRkACCu8SGqmT/mati_roy-s-shortform?commentId=KjxnsyB7EqdZAuLri
Without consulting my old prediction here, I answered someone asking me:
What is your probability mass for the date with >50% chance of AGI?
with:
I used to use the AGI definition "better and cheaper than humans at all economic tasks", but now I think even if we're dumber, we might still be better at some economic tasks simply because we know human values better. Maybe the definition could be "better and cheaper at any well-defined task". In that case, I'd say maybe 2080, taking into account some probability of economic stagnation and some probability that sub-AGI AIs cause an existential catastrophe (and so we don't develop AGI)
will start tracking some of the things I read on this here:
2020-09-28: finished the summary of Conversation with Robin Hanson
note to self—read:
Draft report on AI timelines—comment summary
topic: AI timelines
probably nothing here is new, but these are some insights I had
summary: alignment will likely become the bottleneck; we'll have human-capable AIs, but they won't do every task because we won't know how to specify them
epistemic status: stated more confidently than I am, but seems like a good consideration to add to my portfolio of plausible models of AI development
when I was asking my inner sim "when will we have an AI better than a human at any task", it was returning 21% before 2100 (52% we won't) (see: https://www.lesswrong.com/posts/hQysqfSEzciRazx8k/forecasting-thread-ai-timelines?commentId=AhA3JsvwaZ7h6JbJj), which is low compared to typical estimates from AI researchers and longtermist forecasters.
but then I asked my inner sim "when will we have an AI better than a human at any game", and the timeline for this seemed much shorter.
but a game is just a task that has been operationalized.
so what my inner sim was saying is not that human-level capable AI was far away, but that human-level capable AND aligned AI was far away. I was imagining AIs wouldn't clean up my place anytime soon, not because it's hard to do (well, not for an AI XD), but because it's hard to specify what we mean by "don't cause any harm in the process".
in other words, I think alignment is likely to be the bottleneck
the main problem won’t be to create an AI that can solve a problem, it will be to operationalize the problem in a way that properly captures what we care about. it won’t be about winning games, but creating them.
I should have known; I've been familiar with the orthogonality thesis for a long time
also see David’s comment about alignment vs capabilities: https://www.lesswrong.com/posts/DmLg3Q4ZywCj6jHBL/capybaralet-s-shortform?commentId=rdGAv6S6W3SbK6eta
I discussed the above with Matthew Barnett and David Krueger
the Turing Test might be hard to pass because even if you're as smart as a human, if you don't already know what humans want, it could be hard for a human-level AI to learn it (as well as a human does) (?) (side note: learning what humans want =/= wanting what humans want; that's a classic confusion). so maybe a better test for human-level intelligence would be: when an AI can beat a human at any game (where a game is a well-operationalized task, and doesn't include figuring out what humans want)
(2021-10-10 update: I'm not at all confident about the above paragraph, and it's not central to this thesis. The Turing Test can be a well-defined game, and we could have AIs that pass it while not having AIs doing other tasks humans can do, simply because we haven't been able to operationalize those other tasks)
I want to update my AI timelines. I'm now (re)reading some stuff on https://aiimpacts.org/ (I think they have a lot of great writing!). I just read this, which was kind of related ^^
> Hanson thinks we shouldn’t believe it when AI researchers give 50-year timescales:
> Rephrasing the question in different ways, e.g. “When will most people lose their jobs?” causes people to give different timescales.
> People consistently give overconfident estimates when they’re estimating things that are abstract and far away.
(https://aiimpacts.org/conversation-with-robin-hanson/)
I feel like for me it was the other way around. Initially I was just thinking more abstractly about "AIs better than humans at everything", but thinking in terms of games seemed somewhat more concrete.
x-post: https://www.facebook.com/mati.roy.09/posts/10158870283394579
Related predictions:
When will economic growth accelerate?
See Vanessa Kosoy's post:
> The key observation is, imitation learning algorithms[1] might produce close-to-human-level intelligence even if they are missing important ingredients of general intelligence that humans have.