On the surface, CoEm feels like an approach with a low probability of success. Simply put, the reason is that building a CoEm seems harder than building AGI by any other route.
I consider it harder not only because it departs from what everyone is already doing, but also because it resembles the kind of AI people tried to build before deep learning, an effort that did not work at all until the field switched to Magic, which [comparatively] worked amazingly well.
Some people are still trying something along these lines (e.g. Ben Goertzel), but I haven't yet seen anything that works even remotely as well as deep learning.
I think the gap between (1) “having some AGI that is very helpful for solving alignment” and (2) “having a very dangerous AGI” is probably quite small.
It seems very unlikely that CoEm will be the first system to reach (1), so the first such system will probably be something else. At that point, we can either try to solve alignment using that system or wait until CoEm is improved enough to reach (1) itself. Intuitively, it feels like we will go from (1) to (2) much faster than we will be able to improve CoEm to that level.
So overall I am quite sceptical, but CoEm could still be the best idea if all the other ideas are even worse. More obvious approaches, like “trying to understand how Magic works” (interpretability) and “trying to control Magic without understanding it” (things like Constitutional AI, etc.), seem somewhat more promising to me, but a lot of effort is already going in those directions, so maybe somebody should try something else. Unfortunately, it is extremely hard to judge whether that is actually the case.