I think many of the same assumptions also lead to overestimates of the success odds of an SIAI team in creating safe AI. In general, some features that I would think conduce to safety and could differ across scenarios include:
Internal institutions and social epistemology of a project that make it possible to slow down, or even double back, upon discovering a powerful but overly risky design, rather than automatically barreling ahead because of social inertia or releasing the data so that others do the same
The relative role of different inputs (researchers of different ability levels, abundant computing hardware, neuroscience data, etc.) in designing AI, with some patterns of input favoring higher understanding by designers of the likely behavior of their systems
Dispersion of project success, i.e. the length of the period, after finding the basis of a design, during which one can expect other projects not to reach the same point; the history of nuclear weapons suggests that this can be modestly large under some development scenarios (the first five nuclear powers got the bomb in 1945, 1949, 1952, 1960, and 1964, i.e. gaps of 4, 3, 8, and 4 years), although near-simultaneous development is also common in science and technology
The type of AI technology: whole brain emulation looks like it could be relatively less difficult to control initially by solving social coordination problems, without developing new technology, while de novo AGI architectures may vary hugely in the difficulty of specifying decision algorithms with needed precision
Some shifts along these dimensions do seem plausible given sufficient resources and priority for safety (and suggest, to me, that there is a large spectrum of safety investments to be made beyond simply caring about safety).
Another factor to consider is the permeability of the team: how likely it is to leak information to the outside world.
However, if a team is completely impermeable, it becomes hard for external entities to assess the other factors relevant to evaluating the project.
Does SIAI have procedures/structures in place to shift funding between the internal team and more promising external teams if they happen to arise?
Most potential funding exists in the donor cloud, which can reallocate resources easily enough; SIAI does not have large reserves or an endowment that would be encumbered by its nonprofit status. Ensuring that the donor cloud is sophisticated and well-informed contributes to that flexibility, but I’m not sure what other procedures you were thinking about. Formal criteria to identify more promising outside work to recommend?
I think that might help. In this matter it all seems to be about trust.
People doing outside work have to trust that SIAI will look at their work and may be supportive. Without formal guidelines, they might suspect that their work will be judged subjectively and negatively because of a potential conflict of interest over funding.
SIAI also needs to be trusted not to leak information from other projects as it evaluates them; having a formal, vetted, well-known evaluation team might help with that.
The donor cloud needs to trust SIAI to look at work and make a good decision about it, not one based just on monkey instincts. Formal criteria might help instill that trust.
SIAI doesn’t need all this now, as there aren’t any projects that need evaluating. However, it is something to think about for the future.
I don’t think the SIAI has much experience writing code or programming machine learning applications.
Superficially, that makes them less likely to know what they are doing, and more likely to make mistakes and screw up.
Eliezer’s FAI team currently consists of 2 people: himself and Marcello Herreshoff. Whatever its probability of success, most of it would seem to come from actually recruiting enough high-powered folk for a team. Certainly he thinks so; hence his focus on Overcoming Bias and then the rationality book as tools to recruit a credible team.
Sure, ceteris paribus, although coding errors seem less likely than architectural screwups to result in catastrophic harm rather than the AI not working.