The relevant question is not “will an AGI automatically undergo recursive self-improvement”, but “how likely is it that at least one of the early AGIs undergoes recursive self-improvement”.
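To make the quoted "at least one" framing concrete, here is a minimal sketch; the per-project probability, the project count, and the independence assumption are all illustrative choices of mine, not figures from the discussion:

```python
# Illustrative only: if each early AGI project independently has a small
# chance p of undergoing recursive self-improvement, the chance that at
# least one out of n projects does is 1 - (1 - p)^n.

def p_at_least_one(p_single: float, n_projects: int) -> float:
    return 1.0 - (1.0 - p_single) ** n_projects

for p in (0.01, 0.05, 0.10):
    print(f"p per project = {p:.2f}, "
          f"P(at least one of 20) = {p_at_least_one(p, 20):.2f}")

# Even at p = 0.05, twenty independent projects give roughly a 64% chance
# that at least one undergoes recursive self-improvement.
```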
That’s true, and we need an organisation like the SIAI to take care of that issue. But I still perceive harsh overconfidence around here when it comes to issues related to risks from AI. It is not clear to me that dangerous recursive self-improvement is easier to achieve than friendliness.
To destroy is easier than to create. But destroying human values by means of unbounded recursive self-improvement seems to me to be one of the most complex existential risks.
The difference usually highlighted around here is how easy it is to specify simple goals versus complex goals, e.g. creating paperclips versus protecting human values. But recursive self-improvement is a goal in and of itself. An artificial agent does not discern between a destination and the route to reach it; the route has to be defined in terms of the AI’s optimization parameters. Recursive self-improvement doesn’t just happen; it is something very complex that needs to be explicitly defined.
So how likely is it? You need an AGI that is, in my opinion, explicitly defined to be capable of unbounded and uncontrollable recursive self-improvement. There need to be internal mechanisms that prompt it to keep going in the face of countless undefined challenges.
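As a toy illustration of that "explicitly defined" point (a sketch under my own simplifying assumptions, not anyone’s actual architecture): a greedy agent only ever selects actions that its design can represent and score, so self-modification cannot be chosen unless it is put into the frame.

```python
# Toy sketch: the agent picks the highest-scoring action from an explicitly
# defined action set. If "rewrite_own_code" is representable nowhere, no
# amount of optimisation within this frame stumbles onto it.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ToyAgent:
    actions: List[str]                    # what the agent can even consider
    utility: Callable[[str], float]       # how it scores each action

    def choose(self) -> str:
        return max(self.actions, key=self.utility)

# Designers never represented self-modification, so it can never be chosen:
paperclipper = ToyAgent(
    actions=["bend_wire", "buy_wire", "idle"],
    utility=lambda a: {"bend_wire": 1.0, "buy_wire": 0.5, "idle": 0.0}[a],
)
print(paperclipper.choose())  # -> "bend_wire"

# Only when self-modification is explicitly in the frame can it win out:
improver = ToyAgent(
    actions=["bend_wire", "rewrite_own_code"],
    utility=lambda a: {"bend_wire": 1.0, "rewrite_own_code": 2.0}[a],
)
print(improver.choose())      # -> "rewrite_own_code"
```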
Something that could take over the world seems to me to be the endpoint of a very long and slow route towards a thorough understanding of many different fields, nothing that one could stumble upon early on and by accident.
The conservative assumption is that AGI is easy, and FAI is hard.
I don’t know if this is actually true. I think FAI is harder than AGI, but I’m very much not a specialist in the area—either area. However, I do know that I’d very much rather overshoot the required safety margin by a mile than undershoot by a meter.
“FAI” here generally means “Friendly AGI”, which would make “FAI is harder than AGI” trivially true.
Perhaps you meant one of the following more interesting propositions:
“(The sub-problem of) AGI is harder than (the sub-problem of) Friendliness.”
“AGI is sufficiently hard relative to Friendliness, such that by the time AGI is solved, Friendliness is unlikely to have been solved.”
(Assuming even the sub-problem of Friendliness still has part or all of AGI as a prerequisite, the latter proposition implies “Friendliness isn’t so easy relative to AGI that progress on Friendliness will lag only insignificantly behind progress on AGI.”)
Something that could take over the world seems to me to be the endpoint of a very long and slow route towards a thorough understanding of many different fields, nothing that one could stumble upon early on and by accident.
Humans are gradually taking over the world. In IT, the nearest thing we have seen so far is probably operating systems like Microsoft’s Windows. Interestingly, Microsoft’s world domination plans seem to have been foiled by the US DoJ. A non-governmental intelligent machine seems likely to encounter the same issue.
Few seem to think about these anti-trust issues. If we are likely to face a corporate superintelligence, they seem kind of significant to me.