I’m a Research Associate at MIRI. I became a supporter in late 2005, then contributed to research and publication in various ways. Please, AMA.
Opinions I express here and elsewhere are mine alone, not MIRI’s.
To be clear, as a Research Associate I am an outsider to the MIRI team, one who collaborates with them in various ways.
When do you estimate that MIRI will start writing the code for a friendly AI?
Median estimate for when they’ll start working on a serious code project (i.e., not just toy code to illustrate theorems) is 2017.
This will not necessarily be development of friendly AI—maybe a component of friendly AI, maybe something else. (I have no strong estimates for what that other thing would be, but just as an example—a simulated-world sandbox).
Everything I say above (and elsewhere) is my opinion, not MIRI’s. Median estimate for when they’ll start working on friendly AI, if they get started with that before the Singularity, and if their direction doesn’t shift away from their apparent current long-term plans to do so: 2025.
This is not an official MIRI estimate, and you really should have disclaimed that.
OK, I will edit this one as well to say that.
We’re so screwed, aren’t we?
Yes, but not because of MIRI. Along with FHI, they are doing more than anyone to improve our odds. As to whether writing code or any other strategy is the right one—I don’t know, but I trust MIRI more than anyone to get that right.
Oh yes, I know that. It just says a lot that our best shot is still decades away from achieving its goal.
Which, to be fair, isn’t saying much.
Seeing as we are talking about speculative dangers coming from a speculative technology that has yet to be developed, it seems pretty understandable.
I am pretty sure that as soon as the first AGIs arrive on the market, people will start to take the possible dangers more seriously.
And at that point it will quite likely be the case that we are much closer to having an AGI that will foom than to having an AI that won’t kill us, and that it is already too late.
I know it is a local trope that death and destruction are the apparent and necessary logical conclusion of creating an intelligent machine capable of self-improvement and goal modification, but I certainly don’t share those sentiments.
How do you estimate the probability that AGIs won’t take over the world (the people who constructed them may use them for that purpose, but that is a different story), and would instead be used as simple tools and advisors, in the same boring, old-fashioned, safe way that 100% of our current technology is used?
I am not saying that MIRI or FAI is pointless, or anything like that. I just want to point out that they posture as if they were saving the world from imminent destruction, while it is nowhere near certain whether said danger is real.
1%? I believe that it is nearly impossible to use a foomed AI in a safe manner without explicitly trying to do so. That’s kind of why I am worried about the threat of any uFAI developed before it is proven that we can develop a Friendly one and without using whatever the proof entails.
Anyway, I wasn’t aware that we use 100% of our current technology in a safe way.
You may have a different picture of current technology than I do, or you may be extrapolating different aspects. We’re already letting software optimize the external world directly, with slightly worrying results. You don’t get from here to strictly and consistently limited Oracle AI without someone screaming loudly about risks. In addition, Oracle AI has its own problems (tell me if the LW search function doesn’t make this clear).
Some critics appear to argue that the direction of current tech will automatically produce CEV. But today’s programs aim to maximize a behavior, such as disgorging money. I don’t know in detail how Google filters its search results, but I suspect they want to make you feel more comfortable with the links they show you, thus increasing clicks or purchases from sometimes unusually dishonest ads. They don’t try to give you whatever information a smarter, better-informed you would want your current self to have. Extrapolating today’s Google far enough doesn’t give you a Friendly AI; it gives you the makings of a textbook dystopia.
What are the error bars around these estimates?
The first estimate: 50% probability between 2015 and 2020.
The second estimate: 50% probability between 2020 and 2035. (again, taking into account all the conditioning factors).
Um.
The distribution is asymmetric for obvious reasons. The probability for 2014 is pretty close to zero. This means that there is a 50% probability that a serious code project will start after 2020.
This is inconsistent with 2017 being a median estimate.
Unless he thinks it’s very unlikely the project will start between 2017 and 2020 for some reason.
Good point. I’ll have to re-think that estimate and improve it.
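For reference, here is the arithmetic behind the inconsistency noted above, as a quick sketch (writing $T$ for the start date of the serious code project and using the 50% interval given earlier):

$$P(T \le 2020) = P(T < 2015) + P(2015 \le T \le 2020) \approx 0 + 0.5 = 0.5,$$

so the median of $T$ sits near 2020. For 2017 to be the median we would need $P(T \le 2017) = 0.5$, which forces $P(2017 < T \le 2020) \approx 0$, i.e., almost no probability mass on a start between 2018 and 2020, which is exactly the caveat raised in the preceding comment.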
If some rich individual were to donate 100 million USD to MIRI today, how would you revise your estimate (if at all)?
Can you elaborate on the types of toy code that you (or others) have tried in terms of illustrating theorems?
I have not tried any.
Over the years, I have seen a few online comments about toy programs written by MIRI people, e.g., this, search for “Haskell”. But I don’t know anything more about these programs than those brief reports.
I’ve talked to a former grad student (fiddlemath, AKA Matt Elder) who worked on formal verification, and he said current methods are not anywhere near up to the task of formally verifying an FAI. Does MIRI have a formal verification research program? Do they have any plans to build programming processes like this or this?
I don’t know anything about MIRI’s research strategy beyond what is publicly available, but if you look at what they are working on, it is all in the direction of formal verification.
I have spoken to experts in formal verification of chips and of other systems, and they have confirmed what you learned from fiddlemath. Formal verification is limited in its capabilities: often, you can only verify some very low-level or very specific assertions. And you have to be able to specify the assertion that you are verifying.
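As a concrete illustration of how narrow such verified assertions tend to be, here is a minimal sketch in Lean (my own toy example, not MIRI code; it assumes a recent Lean 4 toolchain where the `omega` tactic is available):

```lean
-- A toy definition: double a natural number.
def double (n : Nat) : Nat := n + n

-- The assertion has to be stated precisely before it can be verified, and even
-- then it is a very low-level claim about one small function, nothing at all
-- like "this system is safe."
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double  -- goal becomes: n + n = 2 * n
  omega          -- linear arithmetic over Nat closes the goal
```

Scaling this kind of guarantee up to whole systems, let alone to a property as hard even to state as friendliness, is the part that current methods don’t come close to handling.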
So, it seems that they are taking on a very difficult challenge.
Your published dissertation sounds fascinating, but I swore off paper books. Can you share it in digital form?
Sure, I’ll send it to you. If anyone else wants it, please contact me. I always knew that Semitic Noun Patterns would be a best seller :-)
(Problem solved, comment deleted.)
Meta: I think this was an important thing to say, and to say forcefully, but it might have been worth expending a sentence or so to say it more nicely (but still as forcefully). (I don’t want to derail the thread and will say no more than this unless specifically asked.)
Will do.
Thx!
What do you think is the likelihood of AI boxing being successful, and why? (Interested in reasons, not numbers.)
I don’t think I have anything to say that hasn’t been said better by others in MIRI and FHI, but I think that AI boxing is impossible because (1) it can convince any gatekeepers to let it out, (2) any AI is “embodied” and not separate from the outside world, if only in that its circuits pass electrons, and (3) I doubt you could convince all AGI researchers to keep their projects isolated.
Still, I think that AI boxing could be a good stopgap measure, one of a number of techniques that are ultimately ineffectual, but could still be used to slightly hold back the danger.
My question is similar to the one that Apprentice posed below. Here are my probability estimates of unfriendly and friendly AI; what are yours? And more importantly, where do you draw the line: what probability estimate would be low enough for you to drop the AI business from your consideration?
Even a fairly low probability estimate would justify effort on an existential risk.
And I have to admit, a secondary, personal, reason for being involved is that the topic is fascinating and there are smart people here, though that of course does not shift the estimates of risk and of the possibilities of mitigating it.
What probability would you assign to this statement: “UFAI will be relatively easy to create within the next 100 years. FAI is so difficult that it will be nearly impossible to create within the next 200 years.”
I think that the estimates cannot be undertaken independently. FAI and UFAI would each pre-empt the other. So I’ll rephrase a little.
I estimate the chances that some AGI (in the sense of “roughly human-level AI”) will be built within the next 100 years as 85%, which is shorthand for “very high, but I know that probability estimates near 100% are often overconfident; and something unexpected can come up.”
And “100 years” here is shorthand for “as far off as we can make reasonable estimates/guesses about the future of humanity”; perhaps “50 years” should be used instead.
Conditional on some AGI being built, I estimate the chances that it will be unfriendly as 80%, which is shorthand for “by default it will be unfriendly, but people are working on avoiding that and they have some small chance of succeeding; or there might be some other unexpected reason that it will turn out friendly.”
Thank you. I didn’t phrase my question very well but what I was trying to get at was whether making a friendly AGI might be, by some measurement, orders of magnitude more difficult than making a non-friendly one.
Yes, it is orders of magnitude more difficult. If we took a hypothetical FAI-capable team, how much less time would it take them to make a UFAI than a FAI, assuming similar levels of effort, and starting at today’s knowledge levels?
One-tenth the time seems like a good estimate.