Telling people what they want to hear, kind of like how starting with validation works in good listening?
Addictiveness is also a concern when food, drugs, media, or a culty social group is the only good thing in someone’s life.
Addictiveness seems to be less of a concern when someone has a lot of things they find good; the mental image of “healthy” and “well-adjusted” is based on balance, on deriving benefits from many sources.
This raises the question of how much good it would do to add a cheap, accessible, and likely less harmful Second Good Thing to the lives of people who currently have only one Good Thing and the resulting problematic addiction-type behaviors.
yeah, very fair response to my concerns. but there’s a specific kind of telling people what they want to hear that I’m concerned about: validating dehumanizing actual humans.
follow-up to this: a friend pointed out that this sounds like the thought itself is what I’m concerned about, as though it were thoughtcrime. I think that’s another fair criticism; it wasn’t what I intended, but it does seem like a real linguistic implication of what I actually said. so, to attempt to slice my point down to what I was trying to say originally but failed to encode to myself or to readers:
while it maybe cannot reasonably be banned, and the limits of imagination should almost certainly be very, very wide, I would hope society can learn to gently remind each other not to fall into addictive dehumanizing/deagentizing resonances, the kind where fantasies become real, dangerous interactions between real humans/seeking agents and end up impairing one or another’s control of their own destiny.
(overly paranoid?) spoiler: reference to nsfw
! gotta admit here, I’ve been known to engage in lewd fantasies using language models in ways that maybe a few human friends would ever consent to writing with me, but that no human I know would ever consent to physicalizing with me, since the fantasy would be physically dangerous to one or both of us (I like to imagine weird interactions that can’t happen physically anyway). as long as the separation between fantasy and reality is maintained, and the consent boundaries of real seeking agents are honored, I don’t see a problem with fantasies.
it’s the possibility of being able to fill your senses with fantasies of controlling others, with absolutely no checks from another seeking being interleaved, that worries me; and I assert that that worry can only ever be reasonably guarded against by reminding each other what to voluntarily include in our ai-augmented imaginations. I don’t think centralized controls on what can be imagined are at all workable, or even ethical: that would violate the very thing they’re alleged to protect, the consent of seeking agents!
Thank you for clarifying!
The editor didn’t spoiler your spoiler properly, if you were trying for spoiler formatting.

I think some parts of society were already, pre-AI, thinking in pretty great depth about the extent to which it can be possible to morally fantasize about acts which would be immoral to realize. “Some parts”, because other parts handle that type of challenge by tabooing the entire topic. The extant fantasy topics where some taboo the whole subject and others seek harm reduction mostly hinge on themes involving a participant who/which deserves not to be violated but doesn’t or can’t consent… which brings it to my attention that an AI probably has a similar degree of “agency”, in that sense, as a child, animal, or developmentally delayed adult. In other words, where does current AI fit into the Harkness Test? Of course, the test itself implies an assumed pre-test to distinguish “creatures” from objects or items. If an LLM qualifies not as a creature but as an object which can be owned, we already have a pretty well-established set of rules about what you can and can’t do with those, depending on whether you own them or someone else does.
I personally believe that an LLM should probably be treated with at least “creature” status because we experience it in the category of “creature”, and our self-observations of our own behavior seem to be a major contributing factor to our self-perceptions and the subsequent choices which we attribute to those labels or identities. This hypothesis wouldn’t actually be too hard to design an experiment to test, so someone has probably done it already, but I don’t feel like figuring out how to query the entire corpus of all publicly available research to find something shaped like what I’m looking for right now.
yeah not sure how to get the spoiler to take, spoilers on lesswrong never seem to work.
It might be a browser compatibility issue?
This should be spoilered. I typed it, and didn’t copy paste it.
The specific failure mode I’m hearing you point a highly succinct reference toward is shaped like enabling already-isolated people to move further away from society’s norms, ideologically and morally, in ways that interaction with other humans normally wouldn’t enable.
I say “normally” because, for almost any such extreme, a special interest group can form its own echo chamber even among humans. Echo chambers that dehumanize all humans seem rare; in groups of humans, there’s almost always an exception clause to exempt members of the group from the dehumanization. It brings a whole new angle to Bender’s line from Futurama: the perfect “kill all humans” meme could only be carried by a non-human. Any “kill all humans [immediately]” meme carried by a living human has to have some flaw, whether a flaw in how it identifies what constitutes a human so as to exempt its carrier, some flexibility in its definition of “kill”, or some stretch in how it defines the implied “immediately”.
It sounds like perhaps you’re alluding to having information that lets you imagine a high likelihood of LLMs exploiting a psychological bug similar to the one cults exploit, perhaps a bug that humans can’t exploit as effectively as non-humans due to some quirk of how it works. If such a psychological zero-day exists, we would have relatively poor resistance to it, since this is our collective first direct exposure to a non-human agent this powerful. Science fiction and imagination have offered some indirect exposure to similar agents, but those are necessarily limited by what we can each imagine.
Is this in the neighborhood of what you have in mind? Trying to dereference what you’d mean by “validating dehumanizing actual humans” feels like being handed a note card of all the formulas for the final exam on the first day of a class that exists to teach one how to use those formulas.
Yeah, that seems like a solid expansion. Honestly, a lot of the ambiguity was because my thought wasn’t very detailed in the first place. One could probably come up with other expansions that slice concept space slightly differently, but this one is close enough to what I was getting at.
Beliefs losing grounding in ways that amplify grounding-loss disorder. Confirmation bias, but with a random pleasure-inducing hallucination generator. New kinds of multiparty ai-and-human resonance patterns. Something like that.
Not the default, and nothing fundamentally new, but perhaps worsened.