For the purpose of the present discussion, I note that if your plan relies on interpretability, then that would be a cause for concern, and a reason for slowing down AGI. The state of interpretability is currently very bad, but there seem to be lots of concrete ways to make progress right now.
Separately, I don’t think your plan (as I understand it) has any hope of addressing the hardest and most important AGI safety problems. But I don’t want to spend the (considerable) time to get into a discussion about that, so I’ll duck out of that conversation, sorry. (At least for now.)
That is unfortunately not a helpful response. If this simple plan, which is what is already in use in actual real-world AI systems today, won't work, that is critical information!
What is the main flaw? It costs you little to mention the biggest problem.