Oh, to clarify, we’re not predicting AGI will be achieved by brain simulation. We’re using the human brain as a starting point for guessing how much compute AGI will need, and then applying a giant confidence interval (to account for cases where AGI is way more efficient, as well as way less efficient). It’s the most uncertain part of our analysis and we’re open to updating.
For posterity, by 2030, I predict we will not have:
AI drivers that work in any country
AI swim instructors
AI that can do all of my current job at OpenAI in 2023
AI that can get into a 2017 Toyota Prius and drive it
AI that cleans my home (e.g., laundry, dishwashing, vacuuming, and/or wiping)
AI retail workers
AI managers
AI CEOs running their own companies
Self-replicating AIs running around the internet acquiring resources
Here are some of my predictions from the past:
Predictions about the year 2050, written 7ish years ago: https://www.tedsanders.com/predictions-about-the-year-2050/
Predictions on self-driving from 5 years ago: https://www.tedsanders.com/on-self-driving-cars/
Thanks! AI managers, CEOs, self-replicators, and your-job-doers (what is your job anyway? I never asked!) seem like things that could happen before it’s too late (albeit only very shortly before), so they are potential sources of bets between us. (The other stuff requires lots of progress in robotics, which I don’t expect to happen until after the singularity, though I could be wrong.)
Yes, I understand that you don’t think AGI will be achieved by brain simulation. I like that you have a giant confidence interval to account for cases where AGI is way more efficient and way less efficient. I’m saying something has gone wrong with your confidence interval if the median is 8-10 OOMs more inference cost than GPT-4, given how powerful GPT-4 is. Subjectively GPT-4 seems pretty close to AGI, in the sense of being able to automate all strategically relevant tasks that can be done by human remote worker professionals. It’s not quite there yet, but looking at the progress from GPT-2 to GPT-3 to GPT-4, it seems like maybe GPT-5 or GPT-6 would do it. But the middle of your confidence interval says that we’ll need something like GPT-8, 9, or 10. This might be justified a priori, if all we had to go on was comparisons to the human brain, but I think a posteriori we should update on observing the GPT series so far and shift our probability mass significantly downwards.
To put it another way, imagine if you had done this analysis in 2018, immediately after seeing GPT-1. Had someone at that time asked you to predict when e.g. a single AI system could pass the Bar, the LSAT, the SAT, and also do lots of coding interview problems and so forth, you probably would have said something like “Hmmm, the human brain size anchor suggests we need another 12-15 OOMs from GPT-1 size to achieve AGI. Those skills seem pretty close to AGI, but idk, maybe they are easier than I expect. I’ll say… 10 OOMs from GPT-1. Could be more, could be less.” Well, surprise! It was much less than that!
(Of course I can’t speak for you, but I can speak for myself, and that’s what I would have said if I had been thinking about the human brain anchor in 2018 and had been convinced that 1e20-1e21 FLOP was the median of my distribution.)
Great points.
I think you’ve identified a good crux between us: I think GPT-4 is far from automating remote workers and you think it’s close. If GPT-5/6 automate most remote work, that will be a point in favor of your view, and if it takes until GPT-8/9/10+, that will be a point in favor of mine. And if GPT gradually provides increasingly powerful tools that wildly transform jobs before they are eventually automated away by GPT-7, then we can call it a tie. :)
I also agree that the magic of GPT should update one into believing in shorter AGI timelines with lower compute requirements. And you’re right, this framework anchored on the human brain can’t cleanly adjust for such updates. We didn’t want to overcomplicate our model, but perhaps we oversimplified here. (One defense is that the hugeness of our error bars means that relatively large updates are needed to make a substantial difference in the CDF.)
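(To make that defense a little more concrete, here is a minimal sketch with made-up numbers, not our actual distribution, of how a very wide spread over required compute responds to a multi-OOM downward update.)

```python
# Illustrative sketch only: hypothetical numbers, not the distribution from our report.
# Point: when the error bars span many OOMs of compute, even a 2-OOM downward update
# to the median moves the CDF at a fixed compute threshold noticeably, but not drastically.
from scipy.stats import norm

sigma = 4.0            # hypothetical spread: ~4 OOMs (std dev of log10 FLOP required)
median_before = 28.0   # hypothetical median log10(FLOP) required for AGI
median_after = 26.0    # median after a 2-OOM downward update
threshold = 25.0       # hypothetical log10(FLOP) available by some future date

p_before = norm.cdf(threshold, loc=median_before, scale=sigma)
p_after = norm.cdf(threshold, loc=median_after, scale=sigma)
print(f"P(requirement below threshold): {p_before:.2f} -> {p_after:.2f}")
# ~0.23 -> ~0.40 with these placeholder values: a real shift, but the wide error bars
# keep a single update from flipping the bottom line to near-certainty.
```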
Lastly, I think when we see GPT unexpectedly pass the Bar, LSAT, SAT, etc. but continue to fail at basic reasoning, it should update us into thinking AGI is sooner (vs. a no-pass scenario), but also update us into realizing these metrics might be further from AGI than we originally assumed based on human analogues.
Excellent! Yeah, I think GPT-4 is close to automating remote workers. GPT-5 or 6, with suitable extensions (e.g., multimodal, LangChain, etc.), will succeed, I think. Of course, there’ll be a lag between “technically, existing AI systems can be made to ~fully automate job X” and “most people with job X are now unemployed,” because things take time to percolate through the economy. But I think by the time of GPT-6 it’ll be clear that this percolation is beginning to happen & the sorts of things that employ remote workers in 2023 (especially the strategically relevant ones, the stuff that goes into AI R&D) are doable by the latest AIs.
It sounds like you think GPT will continue to fail at basic reasoning for some time? And that it currently fails at basic reasoning to a significantly greater extent than humans do? I’d be interested to hear more about this, what sort of examples do you have in mind? This might be another great crux between us.
I’m wondering if we could make this into a bet. If by remote workers we include programmers, then I’d be willing to bet that GPT-5/6, depending upon what that means (it might be easier to say the top LLMs or other models trained by anyone by 2026?), will not be able to replace them.
I’ve made several bets like this in the past, but it’s a bit frustrating since I don’t stand to gain anything by winning—by the time I win the bet, we are well into the singularity & there isn’t much for me to do with the money anymore. What are the terms you have in mind? We could do the thing where you give me money now, and I give it back with interest later.
Understandable. How about this?
Bet
Andy will donate $50 to a charity of Daniel’s choice now.
If, by January 2027, there is not a report from a reputable source confirming that at least three companies that would previously have relied upon programmers, and that meet a defined level of success, are being run without the need for human programmers, due to the independent capabilities of an AI developed by OpenAI or another AI organization, then Daniel will donate $100, adjusted for inflation as of June 2023, to a charity of Andy’s choice.
Terms
Reputable Source: For the purpose of this bet, reputable sources include MIT Technology Review, Nature News, The Wall Street Journal, The New York Times, Wired, The Guardian, or TechCrunch, or similar publications of recognized journalistic professionalism. Personal blogs, social media sites, or tweets are excluded.
AI’s Capabilities: The AI must be capable of independently performing the full range of tasks typically carried out by a programmer, including but not limited to writing, debugging, maintaining code, and designing system architecture.
Equivalent Roles: Roles that involve tasks requiring comparable technical skills and knowledge to a programmer, such as maintaining codebases, approving code produced by AI, or prompting the AI with specific instructions about what code to write.
Level of Success: The companies must be generating a minimum annual revenue of $10 million (or likely generating this amount of revenue if it is not public knowledge).
Report: A single, substantive article or claim in one of the defined reputable sources that verifies the defined conditions.
AI Organization: An institution or entity recognized for conducting research in AI or developing AI technologies. This could include academic institutions, commercial entities, or government agencies.
Inflation Adjustment: The donation will be an amount of money equivalent to $100 as of June 2023, adjusted for inflation based on https://www.bls.gov/data/inflation_calculator.htm.
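(For concreteness, the inflation adjustment is just a CPI ratio. Here is a rough sketch with placeholder index values; the real numbers would come from the BLS calculator linked above.)

```python
# Rough sketch of the inflation adjustment using placeholder CPI values.
# The actual adjustment would use https://www.bls.gov/data/inflation_calculator.htm.
cpi_june_2023 = 305.0   # placeholder CPI-U index for June 2023
cpi_jan_2027 = 330.0    # placeholder CPI-U index for January 2027 (unknown today)
donation = 100 * (cpi_jan_2027 / cpi_june_2023)
print(f"Inflation-adjusted donation: ${donation:.2f}")  # ~$108.20 with these placeholders
```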
I guess that there might be some disagreements in these terms, so I’d be curious to hear your suggested improvements.
Caveat: I don’t have much disposable money right now, so it’s not much money, but perhaps this is still interesting as a marker of our beliefs. Totally ok if it’s not enough money to be worth it to you.
Given your lack of disposable money I think this would be a bad deal for you, and as for me, it is sorta borderline (my credence that the bet will resolve in your favor is something like 40%?) but sure, let’s do it. As for what charity to donate to, how about the Animal Welfare Fund (Effective Altruism Funds)? Thanks for working out all these details!
Here are some grey area cases we should work out:
--What if there is a human programmer managing the whole setup, but they are basically a formality? Like, the company does technically have programmers on staff but the programmers basically just form an interface between the company and ChatGPT and theoretically if the managers of the company were willing to spend a month learning how to talk to ChatGPT effectively they could fire the human programmers?
--What if it’s clear that the reason you are winning the bet is that the government has stepped in to ban the relevant sorts of AI?
Sounds good, I’m happy with that arrangement once we get these details figured out.
Regarding the human programmer formality, it seems like business owners would have to be really incompetent for this to be a factor. Plenty of managers have coding experience. If the programmers aren’t doing anything useful, then they will be let go, or new companies will start that don’t have them. They are a huge expense. I’m inclined not to include this, since it’s an ambiguity that seems implausible to me.
Regarding the potential ban by the government, I wasn’t really thinking of that as a possible option. What kind of ban do you have in mind? I imagine that regulation of AI is very likely by then, so if the automation of all programmers hasn’t happened by Jan 2027, it seems very easy to argue that it would have happened in the absence of the regulation.
Regarding these and a few of the other ambiguous things, one way we could do this is that you and I could just agree on it in Jan 2027. Otherwise, the bet resolves N/A and you don’t donate anything. This could make it an interesting Manifold question because it’s a bit adversarial. This way, we could also get rid of the requirement for it to be reported by a reputable source, which is going to be tricky to determine.
How about this:
--Re the first grey area: We rule in your favor here.
--Re the second grey area: You decide, in 2027, based on your own best judgment, whether or not it would have happened absent regulation. I can disagree with your judgment, but I still have to agree that you won the bet (if you rule in your favor).
Those sound good to me! I donated to your charity (the Animal Welfare Fund) to finalize it. Lmk if you want me to email you the receipt. Here’s the manifold market:
Bet
Andy will donate $50 to a charity of Daniel’s choice now.
If, by January 2027, there is not a report from a reputable source confirming that at least three companies that would previously have relied upon programmers, and that meet a defined level of success, are being run without the need for human programmers, due to the independent capabilities of an AI developed by OpenAI or another AI organization, then Daniel will donate $100, adjusted for inflation as of June 2023, to a charity of Andy’s choice.
Terms
Reputable Source: For the purpose of this bet, reputable sources include MIT Technology Review, Nature News, The Wall Street Journal, The New York Times, Wired, The Guardian, or TechCrunch, or similar publications of recognized journalistic professionalism. Personal blogs, social media sites, or tweets are excluded.
AI’s Capabilities: The AI must be capable of independently performing the full range of tasks typically carried out by a programmer, including but not limited to writing, debugging, maintaining code, and designing system architecture.
Equivalent Roles: Roles that involve tasks requiring comparable technical skills and knowledge to a programmer, such as maintaining codebases, approving code produced by AI, or prompting the AI with specific instructions about what code to write.
Level of Success: The companies must be generating a minimum annual revenue of $10 million (or likely generating this amount of revenue if it is not public knowledge).
Report: A single, substantive article or claim in one of the defined reputable sources that verifies the defined conditions.
AI Organization: An institution or entity recognized for conducting research in AI or developing AI technologies. This could include academic institutions, commercial entities, or government agencies.
Inflation Adjustment: The donation will be an amount of money equivalent to $100 as of June 2023, adjusted for inflation based on https://www.bls.gov/data/inflation_calculator.htm.
Regulatory Impact: In January 2027, Andy will use his best judgment to decide whether the conditions of the bet would have been met in the absence of any government regulation restricting or banning the types of AI that would have otherwise replaced programmers.
Sounds good, thank you! Emailing the receipt would be nice.
Sounds good, can’t find your email address, DM’d you.
But a huge, huge portion of human labor doesn’t require basic reasoning. It’s rote enough to use flowcharts; I don’t need my calculator to “understand” math, I need it to give me the correct answer.
And for the “hallucinating” behavior, you can just have it learn not to do that by rote. Even if you still need 10% of a certain “discipline” (job) to double-check that the AI isn’t making things up, you’ve still increased productivity insanely.
And what does that profit and freed-up capital do other than chase more profit and invest in things that draw down all the conditionals vastly?
5% increased productivity here, 3% over here, and it all starts to multiply.