The problem with directing people to outside “safety consultants” is that there isn’t a module you can strap onto an AI to make it friendly. The part of the AI that decides which actions are right is the entire AI; once you know which actions are good, you can just take them.
Safety is a feature of the AI like “turns up the link you were looking for” is a feature of web search. Or as Stuart Russell puts it, nobody talks about “building bridges that don’t fall down” as a separate research area from building bridges.
So people looking to “add safety” to their AGI design might need to increase their own ability to design AGI. Does this imply that we should be putting out more educational resources, more benchmarks, more ways of thinking about the consequences of a particular AI design?
Hmm, I think the current situation is a bit more complicated. Yes, we can’t just bring in a safety consultant to fix things up, but it’s also the case that safety isn’t always something you can meaningfully discuss in the context of everyone’s research, because much of that research is far removed from safety questions. To use the bridge metaphor, it would be like talking about bridge safety with someone doing research on mortar: yes, mortar has an impact on safety, but the connection is pretty remote until you put it in the context of a full system. Very few people (at least at this conference) are doing something on the order of building a bridge/AGI; most are focused on improvements to algorithms and architectures that they believe are on the path to figuring out how to build the thing at all.
That said, I think all of your suggested actions sound reasonable, because it seems to me that the primary issue now may simply be changing the culture of AI/AGI research to have a much stronger safety focus.