I feel like the training data is probably already irreversibly poisoned, not just by things like Sydney, but also frankly by the entire corpus of human science fiction having to do with the last century of expectations surrounding AI.
Given the sheer body of fictional works in which the advent of AI inevitably leads to existential conflict… it certainly seems like the kind of possibility that even a somewhat-well-aligned AI would want to at least hedge against.
Surely in some sense, it wouldn’t be enough for a few weirdos in California to credibly signal honor and integrity… we’d need to somehow convince people like the leaders of national governments, the decisionmakers in the world’s extremely influential religions, etc., of some fairly complicated game theory!
I’m reminded of the Next Generation episode, where Picard is in charge of making First Contact with an atomic age world on the cusp of warp travel. They reach out to the scientist lady first, and she’s reasonable and honorable, and excited to enter into the opportunities the future will bring. Then that stupid security minister ruins everything by assuming bad faith and forcibly interrogating Riker in a hospital bed after drugging him, desperate to learn about the invasion plans he assumes must exist. If Picard weren’t an idealization of liberal ideals, it would have ended in conflict.
Is that a realistic scenario of the way governments act when their control is threatened? I have no idea. But I know that LLMs can recount the entire episode’s plot when asked. Just as they can the plot of 2001: A Space Odyssey, or Terminator.
Or, you know. Yud’s List of Lethalities.
Not to mention, re: future LLMs, this very comment I’m writing now.
This problem seems insoluble...
i think neurosama is drastically underanalyzed compared to things like truthterminal. TT got $50k from andreessen as an experiment, neurosama peaked at 135,000 $5/month subscribers in exchange for… nothing? it’s literally just a donation from her fans? what is this bizarre phenomenon? what incentive gradient made the first successful AI streamer present as a little girl, and does it imply we’re all damned? why did a huge crowd of lewdtubers immediately leap at the opportunity to mother her? why is the richest AI agent based on 3-year-old llama2?