OK, let me rephrase my question. There is a phrase in Pascal’s Mugging:
If an outcome with infinite utility is presented, then it doesn’t matter how small its probability is: all actions which lead to that outcome will have to dominate the agent’s behavior.
I think that the Orthogonality thesis is right only if an agent is certain that an outcome with infinite utility does not exist. And I argue that an agent cannot be certain of that. Do you agree?
If an outcome with infinite utility is presented, then it doesn’t matter how small its probability is: all actions which lead to that outcome will have to dominate the agent’s behavior.
My perspective would probably be more similar to yours (maybe still with substantial differences) if I had the following assumptions:
1. All agents have a utility-function (or act indistinguishably from agents that do)
2. All agents where #1 is the case act in a pure/straightforward way to maximize that utility-function (not e.g. discounting infinities)
3. All agents where #1 is the case have utility-functions that relate to states of the universe
4. Cases involving infinite positive/negative expected utility would always/typically speak in favor of one behavior/action. (As opposed to there being different possibilities that imply infinite negative/positive expected utility, and, well, not quite “cancel each other out”, but make it so that traditional models of utility-maximization sort of break down).
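To illustrate what I mean by models "breaking down" in the last assumption, here is a minimal Python sketch. The two actions and all the numbers in them are made up for illustration; the only point is how a naive expected-utility sum behaves once both positive and negative infinities are on the table.

```python
import math

def expected_utility(outcomes):
    """Naive expected utility: sum of probability * utility over (p, u) pairs."""
    return sum(p * u for p, u in outcomes)

# Hypothetical action A: a tiny chance of infinite reward, otherwise a small cost.
action_a = [(1e-30, math.inf), (1 - 1e-30, -1.0)]

# Hypothetical action B: tiny chances of BOTH infinite reward and infinite loss.
action_b = [(1e-30, math.inf), (1e-30, -math.inf), (1 - 2e-30, 5.0)]

eu_a = expected_utility(action_a)  # the infinite term swamps everything else
eu_b = expected_utility(action_b)  # inf + (-inf) has no defined value (NaN)

print(eu_a, eu_b)
```

With only one infinity in play, the sum is infinite no matter how small the probability. With opposing infinities, floating-point arithmetic returns NaN, which cannot be ranked against anything. That is roughly the situation I have in mind when I say the traditional model stops giving guidance.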
I think that I myself am an example of an agent. I am relatively utilitarian compared to most humans. Far-fetched possibilities with infinite negative/positive utility don’t dominate my behavior. This is not because I fail to understand the logic behind Pascal’s Muggings (I find the logic simple and straightforward).
Generally I think you are overestimating the appropriateness/correctness/merit of using a “simple”/abstract model of agents/utility-maximizers, and presuming that any/most “agents” (as we more broadly conceive of that term) would work in accordance with that model.
I see that Google defines an agent as “a person or thing that takes an active role or produces a specified effect”. I think of it as a cluster-like concept, so there isn’t really any definition that fully encapsulates how I’d use that term (generally speaking I’m inclined towards not just using it differently than you, but also using it less than you do here).
Btw, for one possible way to think about utility-maximizers (another cluster-like concept IMO), you could see this post. And here and here are more posts that describe “agency” in a similar way.
In this sort of view, being “agent-like” is more of a gradual thing than a yes-no thing. This aligns with my own internal model of “agentness”, but it’s not as if there is any simple/crisp definition that fully encapsulates my conception of “agentness”.
I think that the Orthogonality thesis is right only if an agent is certain that an outcome with infinite utility does not exist. And I argue that an agent cannot be certain of that. Do you agree?
In regards to the first sentence (“I think that the Orthogonality thesis is right only if an agent is certain that an outcome with infinite utility does not exist”):
No, I don’t agree with that.
In regards to the second sentence (“And I argue that an agent cannot be certain of that”):
I’m not sure what internal ontologies different “agents” would have. Maybe, like us, they may have some/many uncertainties that don’t correspond to clear numeric values.
In some sense, I don’t see “infinite certainty” as being appropriate in regards to (more or less) any belief. I would not call myself “infinitely certain” that moving my thumb slightly upwards right now won’t doom me to an eternity in hell, or that doing so won’t save me from an eternity in hell. But I’m confident enough that I don’t think it’s worth it for me to spend time/energy worrying about those particular “possibilities”.
I’d argue that the only reason you do not comply with Pascal’s mugging is because you don’t have an unavoidable urge to be rational, which is not going to be the case with AGI.
Thanks for your input, it will take some time for me to process it.
I’d argue that the only reason you do not comply with Pascal’s mugging is because you don’t have an unavoidable urge to be rational, which is not going to be the case with AGI.
I’d agree that among superhuman AGIs that we are likely to make, most would probably be prone towards rationality/consistency/“optimization” in ways I’m not.
I think there are self-consistent/“optimizing” ways to think/act that wouldn’t make minds prone to Pascal’s muggings.
For example, I don’t think there is anything logically inconsistent about e.g. trying to act so as to maximize the median reward, as opposed to the expected value of rewards (I give “median reward” as a simple example—that particular example doesn’t seem likely to me to occur in practice).
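As a rough sketch of that "median reward" example (in Python, with a made-up mugging scenario and made-up numbers), here is how a median-maximizer and an expected-value-maximizer can disagree while both being perfectly consistent with their own rule:

```python
def expected_value(lottery):
    """Probability-weighted mean reward over (probability, reward) pairs."""
    return sum(p * r for p, r in lottery)

def median_reward(lottery):
    """Smallest reward r such that P(reward <= r) >= 0.5."""
    cumulative = 0.0
    for p, r in sorted(lottery, key=lambda pr: pr[1]):
        cumulative += p
        if cumulative >= 0.5:
            return r
    raise ValueError("probabilities should sum to 1")

# Hypothetical mugging: paying costs 10, with a 1e-12 chance of a huge payoff.
pay = [(1e-12, 1e20), (1 - 1e-12, -10.0)]
# Refusing leaves things as they are.
refuse = [(1.0, 0.0)]

ev_choice = max([pay, refuse], key=expected_value)     # picks `pay`
median_choice = max([pay, refuse], key=median_reward)  # picks `refuse`
```

The expected-value rule is dominated by the tiny-probability payoff, while the median rule ignores anything that happens less than half the time. I am not claiming the median rule is a good decision procedure, only that nothing about it is logically inconsistent.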
Thanks for your input, it will take some time for me to process it.
One more thought. I think it is wrong to consider Pascal’s mugging a vulnerability. Dealing with unknown probabilities has its utility:
Investments with high risk and high ROI
Experiments
Safety (eliminate threats before they happen)
The same traits that make us intelligent (ability to logically reason) make us power seekers. And this is going to be the same with AGI, just much more effective.
The same traits that make us intelligent (ability to logically reason) make us power seekers.
Well, I do think the two are connected/correlated. And arguments relating to instrumental convergence are a big part of why I take AI risk seriously. But I don’t think strong abilities in logical reasoning necessitate power-seeking “on their own”.
I think it is wrong to consider Pascal’s mugging a vulnerability.
For the record, I don’t think I used the word “vulnerability”, but maybe I phrased myself in a way that implied me thinking of things that way. And maybe I also partly think that way.
I’m not sure what I think regarding beliefs about small probabilities. One complication is that I also don’t have certainty in my own probability-guesstimates.
I’d agree that for smart humans it’s advisable to often/mostly think in terms of expected value, and to also take low-probability events seriously. But there are exceptions to this from my perspective.
In practice, I’m not much moved by the original Pascal’s Wager (and I’d find it hard to compare the probability of the Christian fantasy to other fantasies I can invent spontaneously in my head).
Sorry, but it seems to me that you are stuck on an analogy between AGI and humans without a reason. Human behavior often does not carry over to AGI: humans commit mass suicide, humans have phobias, humans take great risks for fun, etc. In other words, humans do not seek to be as rational as possible.
I agree that being skeptical towards Pascal’s Wager is reasonable, because there is much evidence that God is fictional. But this is not the case with “an outcome with infinite utility may exist”. There is just logic here, no hidden agenda; this is as fundamental as “I think therefore I am”. Nothing is more rational than complying with this. Don’t you think?