For context: the linked post presents a well-designed survey of experts about the intelligence and coherence of various entities. The answers show a clear coherence-intelligence anti-correlation. The questions asked of the experts are:
Intelligence:
“How intelligent is this entity? (This question is about capability. It is explicitly not about competence. To the extent possible do not consider how effective the entity is at utilizing its intelligence.)”
Coherence:
“This is one question, but I’m going to phrase it a few different ways, in the hopes it reduces ambiguity in what I’m trying to ask: How well can the entity’s behavior be explained as trying to optimize a single fixed utility function? How well aligned is the entity’s behavior with a coherent and self-consistent set of goals? To what degree is the entity not a hot mess of self-undermining behavior? (for machine learning models, consider the behavior of the model on downstream tasks, not when the model is being trained)”
Of course there’s the problem of what people’s judgements of “coherence” are measuring. In considering possible ways of making the definition clearer, the post says:
For machine learning models within a single domain, we could use robustness of performance to small changes in task specification, training random seed, or other aspects of the problem specification. For living things (including humans) and organizations, we could first identify limiting resources for their life cycle. For living things these might be things like time, food, sunlight, water, or fixed nitrogen. For organizations, they could be headcount, money, or time. We could then estimate the fraction of that limiting resource expended on activities not directly linked to survival+reproduction, or to an organization’s mission. This fraction is a measure of incoherence.
It seems to me the kind of measure proposed for machine learning systems is at odds with the one for living beings. For ML, it’s “robustness to environmental changes”. For animals, it’s “spending all resources on survival”. For organizations, “spending all resources on the stated mission”. By the for-ML definition, humans, I’d say, win: they are the best entity at adapting, whatever their goal. By the for-animals definition, humans would lose completely. So these are strongly inconsistent definitions. I think the problem is fixing the goal a priori: you don’t get to ask “what is the entity pursuing, actually?”, but proclaim “the entity is pursuing survival and reproduction”, “the organization is pursuing what it says on paper”. Even though they are only speculative definitions, not used in the survey, I think they are evidence of confusion in the mind of whoever wrote them, and potentially in the survey respondents (alternative hypothesis: sloppiness, and “survival+reproduction” was intended for most animals but not humans).
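To make the tension concrete, here is a minimal toy sketch (entirely my own illustration, not from the linked post or the survey; the function names and numbers are invented) of the two flavours of incoherence measure side by side: the for-ML one scores how much performance varies under perturbations of the problem specification, the for-living-things/for-organizations one scores the fraction of a limiting resource spent off-mission. A human-like entity can easily look coherent by the first and incoherent by the second.

```python
# Toy sketch, assuming made-up inputs: two incompatible "incoherence" measures.
from statistics import mean, pstdev

def incoherence_as_fragility(scores_across_perturbations: list[float]) -> float:
    """For-ML flavour: relative spread of performance when the task
    specification, random seed, etc. are slightly perturbed.
    Higher spread = less coherent."""
    return pstdev(scores_across_perturbations) / (abs(mean(scores_across_perturbations)) + 1e-9)

def incoherence_as_off_mission_spend(resource_spent: dict[str, float],
                                     mission_activities: set[str]) -> float:
    """For-organisms/organizations flavour: fraction of the limiting
    resource spent on activities not directly linked to the assumed goal."""
    total = sum(resource_spent.values())
    off_mission = sum(v for k, v in resource_spent.items() if k not in mission_activities)
    return off_mission / total if total else 0.0

# A human-like entity: performance robust to perturbations (coherent by the
# first measure), but most hours spent on things other than survival and
# reproduction (incoherent by the second).
print(incoherence_as_fragility([0.90, 0.91, 0.89, 0.90]))          # ~0.008, "coherent"
print(incoherence_as_off_mission_spend(
    {"eating": 2, "sleeping": 8, "art": 6, "chatting": 8},
    mission_activities={"eating", "sleeping"}))                     # ~0.58, "incoherent"
```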
So, what did the experts read in the question?
“How well can the entity’s behavior be explained as trying to optimize a single fixed utility function? How well aligned is the entity’s behavior with a coherent and self-consistent set of goals? To what degree is the entity not a hot mess of self-undermining behavior?”
Take two entities at opposite ends in the figure: the “single ant” (judged most coherent) and a human (judged least coherent).
..............
SINGLE ANT vs. HUMAN
How well can your behavior be explained as trying to optimize a single fixed utility function?
ANT: A great heap, sir! I have a simple and clear utility function! Feed my mother the queen!
HUMAN: Wait, wait, wait. I bet you would stop feeding your queen as soon as I put you somewhere else. It’s not utility, it’s just learned patterns of behavior.
ANT: Oi, that’s not valid, sir! That’s cheating! You can do that just because you are more intelligent and powerful. And what would be your utility function, dare I ask?
HUMAN: Well, uhm, I value many things. Happiness, but sometimes also going through adversity; love; good food… I don’t know how to state my utility function. I just know that I happen to want things, and when I do, you sure can describe me as actually trying to get them, not just “doing the usual, and, you know, stuff happens”.
ANT: You are again conflating coherence with power! Truth is, many things make you powerless, like many things make me! You are big in front of me, but small in front of the universe! If I had more power, I’d be very, very good at feeding the queen!
HUMAN: As I see it, it’s you who’s conflating coherence with complexity. I’m complex, and I also happen to have a complex utility. If I set myself to a goal, I can do it even if it’s “against my nature”. I’m retargetable. I can be compactly described as goals separate from capabilities. If you magically became stronger and more intelligent, I bet you would be very, very bent on making tracks, super-duper gung-ho on touching other ants with your antennae in weird patterns you like, and so on. You would not get creative about it. Your supposed “utility” would shatter.
ANT: So you said yourself that if I became as intelligent as you, I’d shatter my utility, and so appear less coherent, like you are! Checkmate human!
HUMAN: Aaargh, no, you are looking at it all wrong. You would not be like me. I can recognize in myself all the patterns of shattered goals, all my shards, but I can also see beyond that. I can transcend. You, unevolved ant, magically scaled in some not well defined brute-force just-zooming sense, would be left with nothing in your mind but the small-ant shards, and insist on them.
ANT: What’s with that “not well defined etc.” nonsense? You don’t actually know! For all you know about how this works, scaling my mind could make me get bent on feeding the queen, not just “amplify” my current behaviors!
HUMAN: And conceding that possibility, would you not be more coherent then?
ANT: No way! I would be as coherent as now, just more intelligent!
HUMAN: Whatevs.
How well aligned is your behavior with a coherent and self-consistent set of goals?
ANT: I’m super-self-consistent! I don’t care about anything but queen-feeding! I’ll happily sacrifice myself to that end! Actually, I’d not even let myself die happily, I’d die caring-for-the-queen-ly!
HUMAN: Uff, I bet my position will be misunderstood again, but anyway: I don’t know how to compactly specify my goals; I internally perceive my value as many separate pieces, so I can’t say I’m consistent in my value-seeking with a straight face. However, I’m positive that I can decide to suppress any of my value-pieces to get more whole-value, even suppress all of my value-pieces at once. This proves there’s a single consistent something I value. I just don’t know how to summarize or communicate what it is.
ANT: “That” “proves” you “value” the heck what? That proves you don’t just have many inconsistent goals, you even come equipped with inconsistent meta-goals!
HUMAN: To know what that proves, you have to look at my behavior, and my success at achieving goals I set myself to. In the few cases where I make a public precommitment, you have nice clear evidence I can ignore a lot of immediate desires for something else. That’s evidence for my mind-system doing that overall, even if I can’t specify a single, unique goal for everything I ever do at once.
ANT: If your “proof” works, then it works for me too! I surely try to avoid dying in general, yet I’ll die for the queen! Very inconsistent subgoals, very clear global goal! You’re at a net disadvantage because you cannot specify your goal, ant-human 2-1!
HUMAN: This is an artefact of you not being an actual ant but a rhetorical “ANT” implemented by a human. You are even simpler than a real ant, yet contained in something much larger and self-reflective. As a real ant, I expect you would both have a more complicated global goal than what “feed the queen” suggests, and be unable to self-reflect on the totality of it.
ANT: Sophistry! You are still recognizing the greater simplicity of the real-me goal, which makes me more consistent!
HUMAN: We always come to that. I’m more complex, not less consistent.
To what degree are you not a hot mess of self-undermining behavior?
ANT: No cycles wasted, a single track, a single anthill, a single queen, that’s your favorite ant’s jingle!
HUMAN: Funny but no. Your inter-ant communications are totally inefficient. You waste tons of time wandering almost randomly, touching the other ants here and there, to get the emergent swarm behavior. I expect nanotechnology could, in principle, let you communicate via radio. We humans invented tech to make inter-human communications efficient in pursuit of our goals; you can’t, so your behaviors are self-undermining.
ANT: None of my allowed behaviors are undermining! My mind is perfect, my body is flawed! Your mind undermines itself from the inside!
HUMAN: The question says “behaviors”, which I’d interpret as outward actions, but let’s concede the interpretation as internal behaviors of the mind. I know it’s speculative, but again, I expect the real ant to have a less clean mind-state than you make it appear, in proportion to its behavioral complexity.
ANT: No comment, apart from underlining “speculative”! Since you admitted to “suppressing your goals” before, isn’t that “undermining” at its fullest?
HUMAN: You said that of yourself too.
ANT: But you seemed to imply you have a lot more of these goals-to-suppress!
HUMAN: Again: my values are more complex, and your simplicity is in part an artefact.
............
The cruxes I see in the ant-human comparison are:
we reflect on ourselves, while we do not perceive ants as reflecting on themselves;
our value is more complex, and our intelligence allows us to do more complicated things to get it.
I think the experts mostly read “behavioral simplicity” and “simply stated goals” into the question, but not the “adaptability in pursuing whatever it’s doing” proposed later for ML systems. I’d argue instead that something being a “goal” rather than a “behavior” is captured by there being many different paths leading to it, and that coherence is about preferring things in some order and modifying your behavior to that end, rather than following a fixed, simple plan.
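As a hedged illustration of that last claim (my own toy formalization, not anything from the post or the survey; the scoring rule and the example trajectories are invented), one could score “goal-directedness” by whether an entity reaches the same end state through varied paths, instead of replaying one rigid script:

```python
# Toy sketch, assuming made-up trajectories: "goal" vs. "fixed behavior".
def goal_directedness(trajectories: list[list[str]]) -> float:
    """High when the end states coincide but the routes differ (looks like a
    goal with many paths leading to it); low when the ends scatter or when
    there is just one stereotyped script."""
    if not trajectories:
        return 0.0
    ends = [t[-1] for t in trajectories]
    same_end = max(ends.count(e) for e in set(ends)) / len(ends)
    distinct_paths = len({tuple(t) for t in trajectories}) / len(trajectories)
    return same_end * distinct_paths

# An ant-like script: the identical route every time -> 0.25 despite a fixed end.
print(goal_directedness([["leave", "trail", "food", "queen"]] * 4))
# A retargetable agent: varied routes, same destination -> 1.0.
print(goal_directedness([["a", "b", "queen"], ["c", "queen"],
                         ["d", "e", "queen"], ["b", "a", "queen"]]))
```

This is of course only a caricature of the “many different paths” intuition, not a serious proposal for the survey’s question.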
I can’t see how to clearly disentangle complexity, coherence, and intelligence. Right now I’m confused enough that I would not even know what to think if someone from the future told me “yup, science confirms humans are definitely more/less coherent than ants”.
I don’t understand what “discount factor” to apply when judging the coherence of a more complex entity.
… of an entity with more complex values.
… of an entity with more available actions.
… of an entity that makes more complicated plans.
What would be the implications of this complexity-discounted coherence notion, anyway? Or do I instead want some “raw” coherence measure to understand what an entity does?