Yes, agreed. And: the inevitable dual-use nature of alignment research shouldn’t be used as a reason or excuse to not do alignment research. Remaining pure by not contributing to AGI is going to be cold comfort if it arrives without good alignment plans.
So there is a huge challenge: we need to do alignment research that contributes more to alignment than to capabilities, within the current social arrangement. It's very tough to guess which research meets that bar. Optimistic individuals will be biased toward doing too much; cautious individuals will tend to do too little.
One possible direction is to put more effort toward AGI alignment strategy. My perception is that we have far less of it than would be optimal, and that this is typical of science in general: people are more apt to work on object-level projects than to strategize about which projects will best advance the field's goals. AGI safety has much more strategy work than the average field, but the large disagreements among well-informed and conscientious thinkers indicate that we could still use more. And such strategic thinking seems necessary for deciding which object-level work will probably advance alignment faster than capabilities.