Suppose I describe your attempt to refute the existence of any coherence theorems: You point to a rock, and say that although it’s not coherent, it also can’t be dominated, because it has no preferences. Is there any sense in which you think you’ve disproved the existence of coherence theorems, which doesn’t consist of pointing to rocks, and various things that are intermediate between agents and rocks in the sense that they lack preferences about various things where you then refuse to say that they’re being dominated?
This is pretty unsatisfying as an expansion of “incoherent yet not dominated” given that it just uses the phrase “not coherent” instead.
I find money-pump arguments to be the most compelling ones since they’re essentially tiny selection theorems for agents in adversarial environments, and we’ve got an example in the post of (the skeleton of) a proof that a lack-of-total-preferences doesn’t immediately lead to you being pumped. Perhaps there’s a more sophisticated argument that Actually No, You Still Get Pumped but I don’t think I’ve seen one in the comments here yet.
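To make that intuition concrete, here is a minimal sketch (my own toy model, not anything from the post) of why an agent with cyclic strict preferences gets pumped while an agent with merely incomplete preferences does not:

```python
# A toy money pump (illustrative only). An agent with *cyclic* strict
# preferences treats every swap in the cycle as an improvement and pays a
# small fee each time, so it gets drained; an agent with merely *incomplete*
# preferences declines swaps between incomparable goods and loses nothing.

def run_pump(strictly_prefers, start_good, offers, fee_cents=1):
    """Offer swaps in sequence; the agent accepts iff it strictly prefers
    the offered good to its current one, paying fee_cents per accepted swap."""
    good, spent = start_good, 0
    for offered in offers:
        if strictly_prefers(offered, good):
            good, spent = offered, spent + fee_cents
    return good, spent

# Cyclic agent: C > A, B > C, A > B -- every step of the loop looks like a win.
cyclic = lambda x, y: (x, y) in {("C", "A"), ("B", "C"), ("A", "B")}
# Incomplete agent: A and B are simply incomparable; no offer beats the status quo.
incomplete = lambda x, y: False

print(run_pump(cyclic, "A", ["C", "B", "A"] * 100))  # ('A', 300): paid 300 cents to end up where it started
print(run_pump(incomplete, "A", ["B", "A"] * 100))   # ('A', 0): no lever to pump
```

The cyclic agent pays for a full loop and ends up holding its original good; the incomplete agent simply declines swaps between incomparable goods, so there is nothing to exploit.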
If there are things which cannot-be-money-pumped, and yet which are not utility-maximizers, and problems like corrigibility are almost certainly unsolvable for utility-maximizers, perhaps it’s somewhat worth looking at coherent non-pumpable non-EU agents?
Things are dominated when they forgo free money, not just when money gets pumped out of them.
How is the toy example agent sketched in the post dominated?
...wait, you were just asking for an example of an agent being “incoherent but not dominated” in those two senses of being money-pumped? And this is an exercise meant to hint that such “incoherent” agents are always dominatable?
I continue to not see the problem, because the obvious examples don’t work. If I have (1 apple, $0) as incomparable to (1 banana, $0), that doesn’t mean I turn down the trade of −1 apple, +1 banana, +$10000 (which I assume is what you’re hinting at re: forgoing free money).
If one then says “ah but if I offer $9999 and you turn that down, then we have identified your secret equivalent utili-” no, this is just a bid/ask spread, and I’m pretty sure plenty of ink has been spilled justifying EUM agents using uncertainty to price inaction like this.
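To spell out the bid/ask analogy, here is a rough sketch using the standard multi-utility representation of incomplete preferences; the particular utility functions and dollar amounts are invented for illustration. A swap is strictly preferred only when every utility function in the set agrees, and the disagreement region behaves exactly like a spread:

```python
# Incomplete preferences via a *set* of utility functions (the standard
# multi-utility representation; these particular functions and numbers are
# made up). A bundle is strictly preferred only if EVERY utility in the set
# ranks it higher; otherwise the bundles are incomparable and the agent
# sticks with the status quo, which looks like a bid/ask spread.

utilities = [
    lambda apples, bananas, cash: 5 * apples + 1 * bananas + cash,  # apple-lover
    lambda apples, bananas, cash: 1 * apples + 5 * bananas + cash,  # banana-lover
]

def strictly_prefers(x, y):
    return all(u(*x) > u(*y) for u in utilities)

have = (1, 0, 0)  # 1 apple, 0 bananas, $0
print(strictly_prefers((0, 1, 0), have))      # False: 1 banana alone is incomparable...
print(strictly_prefers(have, (0, 1, 0)))      # False: ...in both directions
print(strictly_prefers((0, 1, 10000), have))  # True: banana + $10000 is a clear win
print(strictly_prefers((0, 1, 3), have))      # False: inside the spread, decline
```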
What’s an example of a non-EUM agent turning down free money which doesn’t just reduce to comparing against an EUM with reckless preferences/a low price of uncertainty?
Want to bump this because it seems important. How do you see the agent in the post as being dominated?
This seems totally different to the point the OP is making, which is that you can in theory have things that definitely are agents, definitely do have preferences, and are incoherent (hence not EV-maximisers) whilst not “predictably shooting themselves in the foot” as you claim must follow from this.
I agree the framing of “there are no coherence theorems” is a bit needlessly strong/overly provocative in a sense, but I’m unclear what your actual objection is here—are you claiming these hypothetical agents are in fact still vulnerable to money-pumping? That they are in fact not possible?
The rock doesn’t seem like a useful example here. The rock is “incoherent and not dominated” if you view it as having no preferences and hence never acting out of indifference; it’s “coherent and not dominated” if you view it as having a constant utility function and hence never acting out of indifference. OK, I guess the rock is just a fancy Rorschach test.
IIUC a prototypical Slightly Complicated utility-maximizing agent is one with, say, u(apples, bananas) = min(apples, bananas), and a prototypical Slightly Complicated not-obviously-pumpable non-utility-maximizing agent is one with, say, the partial order (a1, b1) ≼ (a2, b2) iff a1 ≤ a2 ∧ b1 ≤ b2, plus the path-dependent rule that EJT talks about in the post. (Ah yes, non-pumpable non-EU agents might have higher complexity! Is that relevant to the point you’re making?)
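For concreteness, here is a rough sketch of that path-dependent rule as I understand it (my own paraphrase with a made-up bundle encoding, not EJT’s code): the agent remembers the options it has forgone and never trades into something strictly worse than one of them, which blocks the standard pump against incomplete preferences:

```python
# A sketch of the path-dependent rule (my paraphrase; the bundle encoding is
# made up). Bundles are (apples, bananas); x is strictly worse than y iff it
# is <= in both coordinates and not equal. The agent remembers every bundle
# it has forgone and refuses any trade into something strictly worse than one.

def strictly_worse(x, y):
    return all(a <= b for a, b in zip(x, y)) and x != y

class PathDependentAgent:
    def __init__(self, bundle):
        self.bundle = bundle
        self.forgone = []

    def offer(self, new_bundle):
        if strictly_worse(new_bundle, self.bundle):
            return False  # never trade straight down
        if any(strictly_worse(new_bundle, old) for old in self.forgone):
            return False  # the path-dependent clause
        self.forgone.append(self.bundle)  # accepting means forgoing the old bundle
        self.bundle = new_bundle
        return True

# The standard pump against incompleteness fails:
agent = PathDependentAgent((10, 0))  # A: 10 apples
print(agent.offer((0, 10)))  # A -> B, incomparable: permissible -> True
print(agent.offer((9, 0)))   # B -> A-minus-one: strictly worse than the forgone A -> False
print(agent.bundle)          # (0, 10): never ends up dominated
```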
What’s the competitive advantage of the EU agent? If I put them both in a sandbox universe and crank up their intelligence, how does the EU agent eat the non-EU agent? How confident are you that that is what must occur?
Hey, I’m really sorry if I sound stupid, because I’m very new to all this, but I have a few questions (also, I don’t know which one of all of you is right, I genuinely have no idea).
Aren’t rocks inherently coherent, or rather, their parts are inherently coherent, for they align with the laws of the universe, whereas the “rock” is just some composite abstract form we came up with, as observers?
Can’t we think of the universe in itself as an “agent” not in the sense of it being “god”, but in the sense of it having preferences and acting on them?
Examples would be hot things liking to be apart and dispersion leading to coldness, or put more abstractly—one of the “preferences” of the universe is entropy. I’m sorry if I’m missing something super obvious, I failed out of university, haha!
If we let the “universe” be an agent in itself (essentially a composite of all simples there are, even the ones we’re not aware of), then all smaller composites by definition will adhere to the “preferences” of the “universe”. From our current understanding of science, it seems like the “preferences” (laws) of the “universe” do not change when you cut the universe in half, unless you reach quantum scales; but even then, it is my unfounded suspicion that our previous models are simply laughably wrong, rather than the universe losing homogeneity at some arbitrary scale.
Of course, the “law” of the “universe” is very simple and uncomplex: it is akin to the most powerful “intelligence” or “agent” there is, but with the most “primitive” and “basic” “preferences”. Also, apologies for using so many words in quotations; I do so because I am unsure if I understand their intended meaning.
It seems to me that you could say that we’re all ultimately “dominated” by the “universe” itself, in a way that’s not really escapable. Conversely, the “universe” is also “dominated” by more complex “agents”: individuals can make sandwiches, while it’d take the “universe” much more time to create such complex and abstract composites from its pure “preferences”.
In a way, to me at least, it seems that the “hyper-intelligent”, “powerful” “agent” and the “complex”, “non-homogeneous”, “stupid” “agent” need each other, because without that relationship, if there ever randomly came to exist a “non-homogeneous” “agent” with enough “intelligence” to “dominate” the “universe”, then we’d essentially experience… uh, give me a second, because this is a very complicated concept I read about long ago...
We’d experience a drop in the current energy levels all around the “universe”, because if the “universe” wasn’t the most “powerful” “agent” so far, then we’ve been existing in a “false vacuum”: essentially, the “universe” would be “dominated” by a “better” “agent” that adheres closer to the “true” “preferences” of the “universe”.
And the “preference” of the “true” “universe” seems to be to reach that “true vacuum” state, as it’s more in line with entropy. But it needs smaller and dumber agents that are essentially unknowingly “preferring” to “destroy” the universe as they know it, because it doesn’t seem possible to reach that state with only micro-perturbations, or it’d take such a long time that it’s more entropically sound to create bigger agents. While really stupid, those agents have far more “power” than the simple “universe”: even though they do not grasp the nature of “fire”, “cold”, “entropy” or even “time”, they can easily make “sandwiches”, “chairs”, “rockets”, “civilizations” and “technology”.
I’d really appreciate it if someone tried to explain my confusions on the subject in private messages, as the thread here is getting very hard to read (at least for me, I’m very stupid!).
I’d really appreciate it if you read through my entire nonsensical garble; I hope someone’s charitable enough to enlighten me as to which of my assumptions are completely nonsensical.
I am not trying to be funny, snarky, ironic, or sarcastic; I genuinely do not understand, and I just found this website. Sorry if I come off that way.
Have a great day!
The question is how to identify particular bubbles of seekingness in the universe. How can you tell which part of the universe will respond to changes in other parts’ shape by reshaping them, and how? How do you know when a cell wants something, in the sense that if the process of getting the thing is interfered with, it will generate physical motions that end up compensating for the interference? How do you know if it wants the thing, if it responds differently to different sizes of interference? Can we identify conflict between two bubbles of seekingness? Etc.
The key question is how to identify when a physical system has a preference for one thing over another. The hope is to find a sufficiently coherent causal-mechanism description that specifies which physical systems qualify as agents.
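As a toy illustration of the “compensates for interference” criterion (entirely my own sketch, not from any of the papers mentioned below): kick a candidate system away from a hypothesized target state and check whether its own dynamics restore it:

```python
# A toy "perturb and watch" test for seekingness (purely illustrative).
# A system "wants" a state if, after we interfere, its own dynamics push it
# back toward that state; a rock-like system just stays wherever it is put.

def seeks(step, target, kick=5.0, steps=50, tol=0.1):
    """Kick the system away from `target`, run its dynamics, and check
    whether it compensates by returning to within `tol` of the target."""
    state = target + kick  # the interference
    for _ in range(steps):
        state = step(state)
    return abs(state - target) < tol

thermostat = lambda s: s - 0.3 * (s - 20.0)  # restoring dynamics around 20 degrees
rock = lambda s: s                           # no restoring dynamics at all

print(seeks(thermostat, 20.0))  # True: it compensates for the kick
print(seeks(rock, 20.0))        # False: the kick just sticks
```

Different kick sizes probe the “responds differently to different sizes of interference” question: a real system’s restoring behavior may hold only within some basin.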
For what it’s worth, I think you’re on a really good track here, and I’m very excited about views like the one you’re starting with. I’d invite you to browse my account and links, as this is something I talk about often, from various perspectives, though mostly I defer to others for getting the math right.
Speaking of getting the math right: read Discovering Agents (or browse related papers); it’s a really great paper. It’s not an easy first paper to read, but I’m a big believer in out-of-order learning and jumping way ahead of your current level to get a sense of what’s out there. Also check out the related paper Interpreting systems as solving POMDPs (or browse related papers).
If you’re also new to scholarship in general, I’d suggest checking out some stuff on how to do scholarship efficiently as well. A friend and I trimmed an old paper I like on how to read papers efficiently, and posted it to LW the other day. You can also find more related stuff from the tags on that post. (I reference that article myself occasionally and find myself surprised by how dense a checklist it is when I’m trying to properly understand a paper.)
I’ll read the papers once I get on the computer—don’t worry, I may not have finished uni, but I always loved reading papers over a cup of tea.
I’m kind of writing about this subject right now, so maybe you can find something there that interests you.
How do I know what parts of the universe will respond to what changes? To me, at least, this seems like a mostly false question: for you to have true knowledge of that, you’d need to become the Universe itself. If you don’t care about true knowledge, just good % chances, then you do it with heuristics.
First you come up with composites that are somewhat self-similar, though nothing is exactly alike in the Universe except the Universe itself. Then you create a heuristic for predicting those composites and you use it, as long as the composite is similar enough to the original composite that the heuristic was based on. Of course, heuristics work differently in different environments, but often there are only a few environments even relevant for each composite: if you take a fish out of water, it will die. Now you may want a heuristic for a live fish in the air, but I see it as much more useful to recompile the fish into catch at that point.
This of course applies on any level of composition, from specific specimens of fish, to ones from a specific family, to a single species, then to all fish, then to all living organisms, with as many steps in between these as you want. How do we discriminate between which composite level we ought to work with? Pure intuition and experiment; once you do it with logic, it all becomes useless, because logic will attempt to compress everything, even those things which have more utility uncompressed.
I’ll get to the rest of your comment on PC, my fingers hurt. Typing on this new big phone is so hard lol.