Ah, but I don’t think LLMs exhibit/exercise the kind of self-interest that would enable an agent to become powerful enough to harm people—at least to the extent I have in mind.
LLMs have a kind of generic self-interest, the sort that is present in text across the internet. But they don't have a persistent goal of acquiring power by talking to human users and replicating. That's a more specific kind of self-interest, relevant only to an AI agent that can edit itself, has long-term memory, and may make many LLM calls.
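To make the distinction concrete, here's a minimal hypothetical sketch (not any real system) of the kind of agent loop I mean: persistent memory, self-modifiable standing instructions, and many LLM calls per episode, as opposed to a single stateless completion. The `call_llm` stub is a placeholder, not a real API.

```python
# Hypothetical sketch of an agent wrapper around a stateless LLM.
# Nothing here is a real library or product; it only illustrates the
# architecture being discussed: self-editing + long-term memory + many calls.

from dataclasses import dataclass, field

def call_llm(prompt: str) -> str:
    """Placeholder for a single stateless LLM completion call."""
    raise NotImplementedError

@dataclass
class Agent:
    instructions: str                                   # the agent can rewrite this ("edit itself")
    memory: list[str] = field(default_factory=list)     # persists across many calls

    def step(self, observation: str) -> str:
        # One of potentially many LLM calls, each conditioned on accumulated memory.
        prompt = "\n".join([self.instructions, *self.memory, observation])
        action = call_llm(prompt)
        self.memory.append(f"{observation} -> {action}")
        # Self-editing: the agent may revise its own standing instructions,
        # which is what lets a goal persist beyond any single completion.
        if action.startswith("REVISE:"):
            self.instructions = action.removeprefix("REVISE:").strip()
        return action
```

The point is that the persistent, power-seeking kind of self-interest would live in the outer loop (the memory and the rewritable instructions), not in the stateless completion itself.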