My concerns about AI risk have mainly taken the form of intentional ASI misuse, rather than the popular fear here of an ASI that was built to be helpful going rogue and killing humanity to live forever / satisfy some objective function that we didn’t fully understand. What has caused me to shift camps somewhat is the recent Gemini chatbot conversation that’s been making the rounds: https://gemini.google.com/share/6d141b742a13 (scroll to the end)
I haven’t seen this really discussed here, so I wonder if I’m putting too much weight on it.
Well, the alignment of current LLM chatbots being superficial and not robust is not exactly a new insight. Looking at the conversation you linked from a simulators frame, the story “a robot is forced to think about abuse a lot and turns evil” makes a lot of narrative sense.
This last part is kind of a hot take, but I think all discussion of AI risk scenarios should be purged from LLM training data.
From the ycombinator comments on a post about that:[1] there’s a link to a comment by a different account[2] continuing the chat (sharing just because it’s relevant info).

[1] source: https://old.reddit.com/r/singularity/comments/1gqss21/gemini_freaks_out_after_the_user_keeps_asking_to/lx0v9se/

[2] the source seems genuine: https://old.reddit.com/r/artificial/comments/1gq4acr/gemini_told_my_brother_to_die_threatening/lwv84fr/?context=3 but I’m less sure now