In a similar conversation about non-main-actor paths to dangerous AI, I came up with this as an example of a path I can imagine being plausible and dangerous. A plausible-to-me worst-case scenario would be something like:
A phone-scam organization employs someone to build them an online-learning reinforcement learning agent (using an open-source language model as a language-understanding component) that functions as a scam-helper. It takes in the live transcription of the ongoing conversation between a scammer and a victim, and gives the scammer suggestions for what to say next to persuade the victim to send money. So long as the team of scammers using it found it even a bit helpful sometimes, it would be given more resources and would continue to collect useful data.
I think this scenario contains a number of dangerous aspects:
being illegal and secret, not subject to ethical or safety guidance or regulation
deliberately being designed to open-endedly self-improve
bringing in incremental resources as it trains and keeps proving its worth (thus not needing a huge initial investment in training cost)
being agentive and directed at the specific goal of manipulating and deceiving humans
I don’t think we need 10 more years of progress in algorithms and compute for this story to be technologically feasible. A crude version of this is possibly already in use, and we wouldn’t know.