Yup. As someone aiming to do their dissertation on issues of limited agency (low impact, mild optimization, corrigibility), it sure would be frustrating to essentially end up duplicating the insights that MIRI has on some new optimization paradigm.
I still understand why they’re doing this and think it’s possibly beneficial, but it would be nice to avoid having this happen.
You can find a number of interesting engineering practices at NASA. For example, they take three independent teams, give each of them the same engineering spec, and tell them to implement the same software system. The system that actually gets deployed consults all three implementations when making a choice, and if they disagree, the choice is made by majority vote. The idea is that any one implementation will have bugs, but it's unlikely that all three implementations will have a bug in the same place.
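A minimal sketch of that majority-vote arrangement, assuming three hypothetical implementations of the same spec (the function names and the thruster-angle example are made up for illustration, not NASA's actual code):

```python
from collections import Counter

def majority_vote(implementations, inputs):
    """Run independently developed implementations of the same spec on the
    same inputs and return the answer the majority agrees on."""
    results = [impl(*inputs) for impl in implementations]
    winner, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError(f"No majority among results: {results}")
    return winner

# Three hypothetical, independently written implementations of one spec.
def team_a_thruster_angle(sensor): return round(sensor * 0.5, 3)
def team_b_thruster_angle(sensor): return round(sensor / 2, 3)
def team_c_thruster_angle(sensor): return round(sensor * 0.5, 3)  # imagine a subtle bug hiding in one of these

angle = majority_vote(
    [team_a_thruster_angle, team_b_thruster_angle, team_c_thruster_angle],
    (10.0,),
)
print(angle)  # 5.0 -- a bug in any single implementation gets outvoted
```

The point of the design is that failures have to be correlated across independently written codebases before they can affect the deployed behavior.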
One could make an argument for multiple independent AI safety teams on similar grounds:
“any one optimization paradigm may have weaknesses, but it’s unlikely that multiple optimization paradigms will have weaknesses in the same place”
“any one team may not consider a particular fatal flaw, but it’s unlikely that multiple teams will all neglect the same fatal flaw”
In the best case, you can merge multiple paradigms & preserve the strengths of all with the weaknesses of none. In the worst case, having competing paradigms still gives you the opportunity to select the best one.
This works best if individual researchers/research teams are able to set aside their egos and overcome not-invented-here syndrome to create the best overall system… which is a big if.