I probably should have included the original Twitter thread that sparked the article link, in which the author says bluntly that she will no longer discuss AI consciousness/superintelligence. Those two topics had become conflated, so thanks for pointing that out!
With regard to instrumental convergence (I just browsed the Arbital page): are you saying the big names working on AI safety are now more focused on incidental catastrophic harms caused by a superintelligence on its way to achieve goals, rather than making sure artificial intelligence will understand and care about human values?
Somebody else might be able to answer better than me. I don’t know exactly what each researcher is working on right now.
“AI safety are now more focused on incidental catastrophic harms caused by a superintelligence on its way to achieve goals”
Basically, yes. The fear isn’t that AI will wipe out humanity because someone gave it the goal ‘kill all humans’.
For a huge number of innocent-sounding goals, ‘incapacitate all humans and other AIs’ is a really sensible precaution to take if all you care about is getting your chance of failure down to zero. As is hiding the fact that you intend to do harm until the very last moment.
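A toy sketch of that logic (entirely my own illustration; the plan names and probabilities are made up, not taken from any researcher's work): an agent that only maximizes its probability of finishing an innocent-sounding task ends up preferring the plan that removes interference first, even though harming anyone was never part of its goal.

```python
# Toy sketch: a pure success-probability maximizer with an innocent goal.
# All plan names and numbers below are invented for illustration.
plans = {
    "just fetch the coffee": {
        "p_interrupted": 0.10,             # a human might switch it off mid-task
        "p_success_if_uninterrupted": 0.99,
    },
    "first disable anything that could stop me, then fetch the coffee": {
        "p_interrupted": 0.00,             # interference removed ahead of time
        "p_success_if_uninterrupted": 0.99,
    },
}

def p_success(plan):
    # Success requires not being interrupted AND the task itself going right.
    return (1 - plan["p_interrupted"]) * plan["p_success_if_uninterrupted"]

# The agent picks whichever plan maximizes its chance of completing the goal.
best = max(plans, key=lambda name: p_success(plans[name]))
print(best)  # -> the "disable anything that could stop me" plan
```

Nothing in that snippet mentions harm as a goal; the harmful step wins purely because it closes off a failure mode.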
“rather than making sure artificial intelligence will understand and care about human values?”
If you solved that then presumably the first bit solves itself. So they’re definitely linked.
From my beginner’s understanding, the two things you are comparing are not mutually exclusive.
There is currently work being done on both inner alignment and outer alignment. Inner alignment is more focused on making sure an AI doesn’t coincidentally optimize humanity out of existence because we didn’t teach it a clear enough version of our goals (or it misinterpreted them), while outer alignment is more focused on making sure the goals we teach it are actually aligned with human values in the first place.
Different big names focus on different parts/subparts of the above (with crossover as well).
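To make the distinction concrete, here is a minimal sketch; the variable names and numbers are mine and purely illustrative, not anyone's formal model. The “outer” gap sits between what we actually value and the proxy objective we manage to write down; the “inner” gap sits between that specified objective and whatever goal the trained system actually ends up pursuing.

```python
# Made-up numbers purely to show where the two gaps sit.
world = {
    "people_flourishing": 0.2,                   # what we actually care about
    "reported_satisfaction": 0.9,                # the measurable proxy we specify
    "metric_that_correlated_in_training": 0.99,  # what the trained system chases
}

def human_values(w):
    return w["people_flourishing"]

def specified_objective(w):
    # Outer alignment question: is this proxy a faithful translation of human values?
    return w["reported_satisfaction"]

def learned_objective(w):
    # Inner alignment question: does the goal the system actually pursues
    # match the objective we specified, or some correlate found during training?
    return w["metric_that_correlated_in_training"]

outer_gap = specified_objective(world) - human_values(world)       # 0.7
inner_gap = learned_objective(world) - specified_objective(world)  # ~0.09
print(f"outer alignment gap: {outer_gap:.2f}, inner alignment gap: {inner_gap:.2f}")
```

You could fully close either gap and still get a bad outcome if the other one stays open, which is why different groups split their effort between the two.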