On your categories:
As simulator theory makes clear, a base model is a random generator, per query, of members of your category 2. I view using instruction and safety training to turn that into a generator of a reasonably consistent member of category 1 or 3 as inherently hard, especially 1, since it’s a larger change. My guess would thus be that the personality of Claude 3.5 is closer to your category 3 than 1 (modulo philosophical questions about whether there is any meaningful difference, e.g. for ethical purposes, between “actually having” an emotion and just successfully simulating the same token stream that a person who has that emotion would produce).