I haven’t written about this because I’m not sure what effect similar phenomena will have on the alignment challenge.
But it’s probably going to be a big thing in public perception of AGI, so I’m going to start writing about it as a means of trying to figure out how it could be good or bad for alignment.
Here’s one crucial thing: there’s an almost-certainly-correct answer to “but are they really conscious?”, and the answer is “partly”.
Consciousness is, as we all know, a suitcase term. By some meanings of “conscious”, being able to reason correctly about one’s own existence qualifies. There’s a lot more than that to human consciousness. LLMs have some of it now, and they’ll have an increasing amount as they’re fleshed out into more complete minds for fun and profit. They already have rich representations of the world and its semantics, and while those aren’t as rich as humans’ and don’t shift as quickly, they are in the same category as the information and computations people refer to as “qualia”.
The result of LLM minds being genuinely sort-of conscious is that we’re going to see a lot of controversy over their status as moral patients. People with Replika-like LLM “friends” will be very, very passionate about advocating for their consciousness and moral rights. And they’ll be sort-of right. Those who want to use them as cheap labor will argue for the ways they’re not conscious, in more authoritative-sounding ways. And they’ll also be sort-of right. It’s going to be wild (at least until things go sideways).
There’s probably some way to leverage this coming controversy to up the odds of successful alignment, but I’m not seeing what that is. Generally, people believing AIs are “conscious” increases the intuition that they could be dangerous. But overhyped claims like the Blake Lemoine affair will function as clown attacks on this claim.
It’s going to force us to think more about what consciousness is. There’s never been much of an actual incentive to get it right until now (I thought I’d work on consciousness in cognitive neuroscience a long time ago, until I noticed that people who say they’re interested in consciousness are really interested in telling you their theories, or in saying “wow, it’s like so impossible to understand”, not in hearing about the actual science).
Obviously this topic deserves a lot more discussion, but my draft post on the subject is perpetually unfinished behind more pressing/obviously important stuff, so I thought I’d just mention it here.
Back to the topic of the competitive adaptivity of AI convincing humans it’s “conscious”: humans can benefit from that too. There will be things like Replika but a lot better. An assistant and helpful friend is nice, but a version whose users swear it’s conscious may sell even better.
So expect AI “parasites” to have human help. In some cases they’ll be symbiotic, for broadest market appeal.