What about infants who haven’t formed such representations or patients with severe impairment in minimally conscious states who can no longer form such representations?
Wait, that doesn’t compute. Why would they become moral patients because other people care about them? That fails on two counts at once:
The family members do not feel pain about their loved ones. I agree that they suffer, but that is not related to pain stimuli. You can have aversive feelings toward all kinds of things unrelated to nociception. Just think about salty water. You only crave it if you have too little salt, but otherwise it is yuck. Although, maybe, you mean nociception in a non-standard way.
Even if the family member’s aversiveness were sufficient, it would prove too much: it would make basically any object that people care about, and whose damage or loss causes them suffering, a moral patient.
But I like the self-representation aspect of your criterion and I think it could be fixed by reducing it to just that:
Any system that represents its own aversive responses deserves moral patienthood.
It would require making “represent response” very precise, but I think that would be possible.
Do I understand correctly that, according to your criteria, people with pain asymbolia would not count as moral patients (assuming they literally never experience suffering or aversiveness to nociceptive stimuli)?
After some more thought, I agree even more. A large part of management is an ad-hoc solution to human alignment. And as I predict agents to be unreliable as long as technical alignment is unsolved, more management by humans will be needed. Still, productivity may increase a lot.
LLMs are even more bottlenecked on management than human organizations are, and therefore LLMs will be less useful than human organizations in practice for most use cases.
People will instead mostly continue to rely on human employees, because human employees need less management.
These seem like great predictions worth checking. Can you make them more specific (time, likelihood)?
Did you mean these Levels of Consciousness? I think these are descriptive (and to some degree prescriptive) but not explanatory. They don’t say how these layers arise except as a developmental process, but that just pushes the explanation elsewhere.
I predict that
- consciousness will turn out to be explainable with manageable complexity (by 2040, 90%)
- the explanation will match observed behavior and allow decent predictions (like which diseases or drugs will have an effect on it)
- many people will dispute this and come up with edge cases and/or put extra demands on what consciousness is supposed to be (but that wouldn’t change the predictivity of the theory, of course)
- the theory will allow engineers to build systems that are conscious in a recognizable way (by 2050, 85%)
- many people will dispute this and claim those are zombies
- some of the big systems will be ruled moral persons by at least a few courts (60%)
- the engineers will optimize the systems, allowing smaller and smaller systems to be conscious in this sense, to the point where they do little else besides being conscious (70%)
- people will do all kinds of crazy stuff with this, maybe embed minimal such systems in devices to prevent them from being turned off.
Of course, at least in the context of startups, their success will be correlated for multiple reasons: partly selection effects (they are selected by the same funders), partly network effects (if they are together in a batch, they will benefit (or harm) each other).
I still owe you a response to this. I’m esp. thinking about predictions.
the model’s internal representations of behavior and linguistic self-description are already aligned.
But that is arguably also the case for humans. Human behaviors are more complex and embedded, though. And the embedding seems crucial as it allows self-observation.
Sure, there are differences between countries and people. Not all things that cells can do can be done by people in a corresponding way in a country, and vice versa. The individual units of a country—people—are much more mobile than the cells in a person. They can even change hosts. I think this is related to the coherence that Dogan mentioned. The coherence of countries is lower than that of persons. On the other hand, countries exist for longer (but “think” slower).
I love Hofstadter, but remember his anthill is fiction, and one shouldn’t use it as evidence for anything.
Yeah, I’m not happy that the anthill is fictional. I considered putting it into a footnote, but then I would have to put all the table entries there too, and the comparison in a table would be lost, and I think it helps drive the intuition that the elements of computation could be distributed.
Though I suspect my main sticking point is “entity” rather than “conscious”. There’s a missing element of coherence that I think matters quite a bit. LLMs are missing coherence-over-time and coherence-across-executions.
I agree with that. In fact, it is one reason I don’t see LLMs currently as conscious. An earlier version of this post had a combined system of an LLM and a human interacting with it as another example, but I felt that was too difficult and not core to the thesis. A human, by continuously interacting, can provide the coherence-over-time. Stable awareness patterns and self-perception might still be missing or weak, though.
Countries are missing coherence-between-subsets.
Yes—and I think that’s the most fragile part of the analogy. There is coherence, but it’s definitely not as robust as that of a nervous system. Still, we do see subsets (e.g., ministries, branches of government, political blocs) coordinating through shared norms, procedures, and mutual modelling. They’re noisy, error-prone, often adversarial, but they’re not completely incoherent. At times, especially under external threat or during major events, countries do behave in surprisingly unified ways. These aren’t mere aggregations of individual actions; they require and ensure a degree of coordination that maintains a whole over time.
When you say “countries do X”, it’s always the case that actually, some numbers of individual humans do it, and other numbers either don’t participate or don’t stop it
If we take that critique seriously, we have to stop saying that corporations launch products, or that teams win matches. There’s always an underlying substrate of individual action. But we regularly model higher-level entities as agents when doing so improves prediction or explanation. From a functionalist perspective, if “Country X believes Y” helps us model diplomatic behaviour more accurately than tracking all individuals, that’s meaningful—even if we know that it is an abstraction.
Countries do NOT state their right to exist. Humans state their right to be collectively recognized as a country.
Yes, but I think this is too strict a reading. The same could be said about any distributed system. When a program outputs “Hello world,” it’s really just electrons doing things. When a person speaks, it’s really just muscles and neural impulses. The distinction is in the coordination and interpretation. When a state department issues a formal diplomatic communication, it’s acting as the voice of an institution that maintains internal models, makes predictions, and responds to feedback. That is, in all the functional ways that matter, it is the country speaking.
There are almost no muscle groups that act coherently without a brain to help coordinate.
Exactly, and we can extend the analogy to institutions that are the coordinating organs of a country’s body. They can fail, conflict, or contradict each other, which is comparable to a neurological disorder. But that doesn’t mean there is no coherence. It just means the coherence is partial and susceptible to breakdown. One could say that is also true of human consciousness in pathological states.
So yes, I take the point that coherence is crucial. But I don’t think the lack of perfect coherence disqualifies countries from being modelled as agents or even from being on some continuum toward consciousness. The better question might be: Under what conditions does it become useful or predictive to model a system as being conscious?
Unexpected Conscious Entities
The last thing may result from a hard-coded genetic heuristic learning rate. We can’t update in a fully Bayesian way, and a learning rate is an approximation given computational constraints. There is an optimal learning rate, but it depends on context, such as the trust in prior information and especially the volatility of the environment. Thus it may happen that your genetic prior for your learning rate doesn’t match the dynamics of your current environment. I guess our modern environment changes faster than the ancestral environment, and most people update too slowly on new information. Updating much faster is probably adaptive. I also have that.
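A minimal sketch of what I mean (my own illustration, with made-up numbers, not something from the post): an agent tracks a drifting quantity with a fixed exponential-moving-average learning rate, and the rate that works best under low volatility is too small once the environment drifts faster.

```python
import numpy as np

rng = np.random.default_rng(0)

def tracking_error(alpha, volatility, obs_noise=1.0, steps=20_000):
    """Mean squared error of a fixed-learning-rate (EMA) estimate tracking a random walk."""
    truth, estimate, errors = 0.0, 0.0, []
    for _ in range(steps):
        truth += rng.normal(0.0, volatility)      # the environment drifts
        obs = truth + rng.normal(0.0, obs_noise)  # noisy observation
        estimate += alpha * (obs - estimate)      # fixed learning-rate update
        errors.append((estimate - truth) ** 2)
    return np.mean(errors)

alphas = np.linspace(0.02, 0.9, 30)
for volatility in (0.05, 0.5):  # slow-changing vs. fast-changing environment
    best = min(alphas, key=lambda a: tracking_error(a, volatility))
    print(f"volatility {volatility}: best fixed learning rate ~ {best:.2f}")
```

The faster-changing environment favors a noticeably larger learning rate, so a rate “hard-coded” for the slow regime systematically under-updates in the fast one.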
Hm, yes, seems plausible. Very inconsistent though. And they should remove the second paragraph, which seems to imply that it is still possible to apply anyway.
Can somebody get me in touch with somebody from the Center for AI Safety (safe.ai)? Their page for applying for compute resources seems broken. I have used their contact form to report the issue on April 7th, but received no reply.
This is what the application page has looked like at least since then (linked from their Compute Cluster page):
As you can see, there is no form field to fill in, only a lone “Absenden” button (“Absenden” is German for “submit”, which is strange because my system and browser are set to English). If I click that button, I get this message:
Looks like this form is empty. Try filling it out before submitting.
My guess is that there is a problem with their Airtable integration.
If you wonder what I’m trying to apply for:
The project Reducing LLM deception at scale with self-other overlap fine-tuning (SOO) that I am working on at AE Studio is in urgent need of more compute to run SOO experiments with Mistral Large 2 (or even larger models).
The aintelope project (sorry, not many updates recently) needs compute to run more evaluations of our benchmark From homeostasis to resource sharing: Biologically and economically aligned multi-objective multi-agent AI safety benchmarks, and we wanted to apply at CAIS too (we have run out of funding; more on that later).
It is a great idea to test a hypothesis experimentally. I did your experiment too, and here are the results:
hours in a day: when I saw your post it was 1 AM, giving an estimate of 2 hours in a day. ❌
months in a year: I was born in June, so the estimate is twelve months. ✅ Though we could also have taken the current month as the base, and then it would have been 8 months.
Earth size: I don’t know my latitude, but it is probably like yours—I’m in Hamburg. ✅ But I do know that the longitude here is almost exactly 10°. If I go by that, the circumference should be 20° instead of 360°. ❌
human life expectancy: I’m 51. ✅
Several experiments show that I can extract useful information just by treating myself as a random sample, and thus a view that I can’t use myself as a random sample is false.
I think there are some problems here. A more accurate claim would be:
You can do experiments that extract useful information about whether you can treat yourself as a random sample (i.e., a representative or “typical” sample) by comparing the result of the experiment to the base rate.
Or at the very least, based on my experiments, the claim seems to be false for me. I’m not representative enough. But I can’t know that without comparing my results to a base rate. I can’t use the observations to establish a base rate or to make estimates such as expected lifetime.
From a statistical perspective, a random sample means:
Drawn randomly from the population of interest—but you are not randomly selected.
Having an equal and independent chance of being selected—but you are subject to bias.
The sample size is sufficient to capture variance—but you are n=1, thus variance is undefined.
You may not be representative in any observable or unobservable dimension relevant to your purpose. And to know whether you are representative, you have to look at other samples, and then you are back to some kind of base rate.
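To make that concrete, here is a minimal sketch (my own illustration, with assumed numbers): even when the “double your current position” estimator is well calibrated in aggregate, any single n=1 draw can be far off, and you only find that out by comparing against the known answer.

```python
import numpy as np

rng = np.random.default_rng(1)

true_total = 24.0                                  # e.g. hours in a day
positions = rng.uniform(0.0, true_total, 100_000)  # "at what point did you happen to look?"
estimates = 2.0 * positions                        # treat yourself as a random sample

print("median estimate:", np.median(estimates))    # close to the true 24 in aggregate
off_by_2x = (estimates < true_total / 2) | (estimates > true_total * 2)
print("fraction off by more than a factor of 2:", off_by_2x.mean())
```

Across many samplers the procedure looks fine, but an individual (say, one who happens to look at 1 AM) cannot tell from their single observation whether they landed in that badly-off quarter; that requires comparing to the base rate.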
Outside view, context, and details. I’d ask:
How big is the fish?
How much did the fish cost?
How big is the aquarium?
What is the natural habitat of the fish and what kind of species is it?
I don’t understand how the rewarding works. Can you explain again?
How far away is the TV? Is it in the water?
How long did it take to train?
How often was it wrong on the politicians?
Have you shown anybody else?
Is this a person I know or somebody I know knows?
See also Proper posture for mental arts, which also mentions the Unbendable Arm and explains how it works biomechanically, namely via the latissimus dorsi.
I understand this to mean that we care not only about current moral patients but also about potential ones, such as the infants and impaired patients. That would be consistent with people caring about embryos, but it would also cover seeds and eggs of all kinds, which matches intuitions less clearly.