Well, for instance, certain approaches to AGI are more likely to lead to something friendly than others. If you believe that approach A is 1% less likely to lead to a bad outcome than approach B, then funding research in approach A is already compelling.
In my mind, a well-reasoned statistical approach with good software engineering methodologies is the mainstream approach that is least likely to lead to a bad outcome. It has the advantage that there is already a large amount of related research being done, hence there is actually a reasonable chance that such an AGI would be the first to be implemented. My personal estimate is that such an approach carries about 10% less risk than an alternative approach where the statistics and software are both hacked together.
In contrast, I estimate that SIAI’s FAI approach would carry about 90% less risk if implemented than a hacked-together AGI. However, I assign very low probability to SIAI’s current approach succeeding in time. I therefore consider the above-mentioned approach more effective.
Another alternative to SIAI that doesn’t require estimates about any specific research program would be to fund the creation of high-status AI researchers who care about Friendliness. Then they are free to steer the field as a whole towards whatever direction is determined to carry the least risk, after we have the chance to do further research to determine that direction.
I don’t understand what you mean by “10% less risk”. Do you think any given project using “a well-reasoned statistical approach with good software engineering methodologies” has at least a 10% chance of leading to a positive Singularity? Or that each such project has a P*0.9 probability of causing an existential disaster, where P is the probability of disaster for a “hacked together” project? Or something else?
Sorry for the ambiguity. I meant P*0.9.
You said “I therefore consider the above-mentioned approach more effective”, but if all you’re claiming is that the above-mentioned approach (“a well-reasoned statistical approach with good software engineering methodologies”) has a P*0.9 probability of causing an existential disaster, and not that it has a significant chance of causing a positive Singularity, then why do you think funding such projects is effective for reducing existential risk? Is the idea that each such project would displace a “hacked together” project that would otherwise be started?
EDIT: I originally misinterpreted your post slightly, and corrected my reply accordingly.
Not quite. The hope is that such a project will succeed before any hacked-together project does. More broadly, the hope is that partial successes using principled methodologies will lead to those methodologies being more widely adopted in the AI community as a whole, and, more to the point, that a contingent of highly successful AI researchers advocating Friendliness can change the overall mindset of the field.
The default is a hacked-together AI project. SIAI’s FAI research is trying to displace this, but I don’t think they will succeed (my information on this is purely outside-view, however).
An explicit instantiation of some of my calculations:
SIAI approach: 0.1% chance of replacing P with 0.1P
Approach that integrates with the rest of the AI community: 30% chance of replacing P with 0.9P
In the first case, P stays essentially constant (0.9991P in expectation); in the second case, it is replaced with 0.97P in expectation.
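For anyone who wants to check the arithmetic, here is a minimal sketch of the expected-value calculation. It assumes a simple two-outcome model: with some probability the approach succeeds and the baseline risk P is scaled by a factor, and otherwise P is unchanged. The function name `expected_risk_multiplier` and its parameters are illustrative and not part of the original comment.

```python
def expected_risk_multiplier(p_success: float, risk_factor: float) -> float:
    """Expected multiplier on the baseline risk P.

    Assumption: with probability p_success the risk becomes risk_factor * P;
    otherwise it stays at P (multiplier 1.0).
    """
    return (1.0 - p_success) * 1.0 + p_success * risk_factor


# SIAI approach: 0.1% chance of replacing P with 0.1P
siai = expected_risk_multiplier(p_success=0.001, risk_factor=0.1)

# Approach that integrates with the rest of the AI community:
# 30% chance of replacing P with 0.9P
integrated = expected_risk_multiplier(p_success=0.30, risk_factor=0.9)

print(f"SIAI approach:       {siai:.4f} * P")
print(f"Integrated approach: {integrated:.4f} * P")
```

Under these assumptions the first multiplier comes out to about 0.9991 and the second to about 0.97, which matches the figures above. The conclusion is only as good as the input probabilities, which are personal estimates.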
I noticed you didn’t name anybody. Did you have specific programs or people in mind?
We already seem to (roughly) agree on probabilities.
The only specific plan I have right now is to put myself in a position to hire smart people to work on this problem. I think the most robust way to do this is to get a faculty position somewhere, but I need to consider the higher relative efficiency of corporations over universities some more to figure out if it’s worthwhile to go with the higher-volatility route of industry.
Also, as Paul notes, I need to consider other approaches to x-risk reduction as well to see if I can do better than my current plan. The main argument in favor of my current plan is that there is a clear path to the goal, with only modest technical hurdles and no major social hurdles. I don’t particularly like plans that start to get fuzzier than that, but I am willing to be convinced that this is irrational.
EDIT: To be more explicit, my current goal is to become one of said high-status AI researchers. I am worried that this is slightly self-serving, although I think I have good reason to believe that I have a comparative advantage at this task.
You know, I think somebody already thought of this. What was their name again...?
That seems more of an alternative within SIAI than an alternative to SIAI. With more funding, their Associate Research Program can promote the importance of Friendliness and increase the status of researchers who care about it.